Theoretically speaking, a hashmap lookup (isset) is faster than array_search or in_array.

But let’s see how it works in PHP in practice.

Look at this code block:

    function detectGender($keyword)
    {
        $gender = 0;
        $genderKeywords = [
            'MY' => [
                1 => 'men,male',
                2 => 'women,female',
            ],
            'SG' => [
                1 => 'men,male',
                2 => 'women,female',
            ],
            'HK' => [
                1 => 'men,male',
                2 => 'women,female',
            ],
            'PH' => [
                1 => 'men,male',
                2 => 'women,female',
            ],
            'TH' => [
                1 => 'ชาย,ผู้ชาย',
                2 => 'หญิง,ผู้หญิง',
            ],
            'VN' => [
                1 => 'nam',
                2 => 'nu',
            ],
            'ID' => [
                1 => 'pria',
                2 => 'wanita',
            ],
        ];
        foreach ($genderKeywords['VN'] as $key => $keywords) {
            if (array_search($keyword, explode(',', $keywords)) !== false) {
                return $key;
            }
        }
        return $gender;
    }
    
    print_r(detectGender('nam'));

Given a keyword in some language, the function returns the gender code (1: men, 2: women), or 0 when nothing matches.

The code above gets the job done, yeah! But can we optimize it a bit, making it faster and easier to read?

Why not? Take a look at the lookup loop (here the country code comes from Option::getCc() instead of the hard-coded 'VN'):

        foreach ($genderKeywords[Option::getCc()] as $key => $keywords) {
            if (array_search($keyword, explode(',', $keywords)) !== false) {
                return $key;
            }
        }

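It takes O(n) time, where n is the number of keywords for the country: both explode and array_search walk the list linearly, so there is no log n factor.

What if we change the data structure a bit, like this: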
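
To make the cost of that loop concrete, here is a minimal sketch that counts how many keyword comparisons a linear scan performs. The helper name `countComparisons` is made up for this illustration and is not part of the original code; it spells out by hand what explode plus array_search do internally.

```php
<?php
// Hypothetical helper: counts how many keyword comparisons a linear scan
// performs before it finds (or fails to find) $keyword.
function countComparisons(string $keyword, array $groups): int
{
    $comparisons = 0;
    foreach ($groups as $keywords) {
        foreach (explode(',', $keywords) as $candidate) {
            $comparisons++;
            if ($candidate === $keyword) {
                return $comparisons;
            }
        }
    }
    return $comparisons;
}

$groups = [1 => 'men,male', 2 => 'women,female'];
echo countComparisons('men', $groups), "\n";     // 1: first keyword
echo countComparisons('female', $groups), "\n";  // 4: last keyword
echo countComparisons('unknown', $groups), "\n"; // 4: a miss checks every keyword
```

The number of comparisons grows with the number of keywords, which is what a linear scan means in practice.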

        $defaultMapping = [
                'men' => 1,
                'male' => 1,
                'women' => 2,
                'female' => 2
        ];

        $genderKeywords = [
            'MY' => $defaultMapping,
            'SG' => $defaultMapping,
            'HK' => $defaultMapping,
            'PH' => $defaultMapping,
            'TH' => [
                'ชาย' => 1,
                'ผู้ชาย' => 1,
                'หญิง' => 2,
                'ผู้หญิง' => 2,
            ],
            'VN' => [
                'nam' => 1,
                'nu' => 2,
            ],
            'ID' => [
                'pria' => 1,
                'wanita' => 2,
            ],
        ];
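
The flipped structure does not have to be typed by hand. As a rough sketch, it could be derived from the original comma-separated lists; the helper name `buildLookup` is an assumption made for this example, not something from the original code.

```php
<?php
// Hypothetical helper: turns [code => 'kw1,kw2'] into [keyword => code].
function buildLookup(array $groups): array
{
    $lookup = [];
    foreach ($groups as $code => $keywords) {
        foreach (explode(',', $keywords) as $keyword) {
            $lookup[$keyword] = $code;
        }
    }
    return $lookup;
}

$lookup = buildLookup([1 => 'men,male', 2 => 'women,female']);
var_dump(isset($lookup['female'])); // bool(true)
var_dump($lookup['female']);        // int(2)
var_dump($lookup['unknown'] ?? 0);  // int(0)
```

The conversion itself is still a linear pass, so it only pays off if the lookup table is built once and reused for many lookups.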

The $genderKeywords array now acts as a hashmap [1], so looking up a key in it takes O(1) on average. Finally, here is the whole function:

    function detectGender($keyword) {
        $defaultMapping = [
            'men' => 1,
            'male' => 1,
            'women' => 2,
            'female' => 2
        ];

        $genderKeywords = [
            'MY' => $defaultMapping,
            'SG' => $defaultMapping,
            'HK' => $defaultMapping,
            'PH' => $defaultMapping,
            'TH' => [
                'ชาย' => 1,
                'ผู้ชาย' => 1,
                'หญิง' => 2,
                'ผู้หญิง' => 2,
            ],
            'VN' => [
                'nam' => 1,
                'nu' => 2,
            ],
            'ID' => [
                'pria' => 1,
                'wanita' => 2,
            ],
        ];

        return $genderKeywords['VN'][$keyword] ?? 0;
    }

It’s O(1) time complexity and easier to read, isn’t it?
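
A nice side effect of the `??` operator is that it also covers a missing country code, not just a missing keyword. Here is a minimal sketch, assuming the country code is passed in as a parameter; the function name `detectGenderFor` and the trimmed-down table are made up for this example.

```php
<?php
// Hypothetical variant: the country code is passed in instead of hard-coded.
function detectGenderFor(string $countryCode, string $keyword): int
{
    $genderKeywords = [
        'VN' => ['nam' => 1, 'nu' => 2],
        'ID' => ['pria' => 1, 'wanita' => 2],
    ];

    // ?? behaves like isset(): a missing country or a missing keyword
    // both fall through to 0 without raising a warning.
    return $genderKeywords[$countryCode][$keyword] ?? 0;
}

var_dump(detectGenderFor('VN', 'nu'));      // int(2)
var_dump(detectGenderFor('ID', 'unknown')); // int(0)
var_dump(detectGenderFor('XX', 'nam'));     // int(0)
```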

Testing performance:

Optimized version: https://3v4l.org/Iavgv/perf#output

Original version: https://3v4l.org/okDoD/perf#output

Well, the version that is optimized in theory is not faster than the normal one in practice :D. So at this point we can’t see much improvement, but at least the new implementation (the hashmap way) is cleaner.

So, theory doesn’t always perform well in practice, does it?
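
One plausible explanation (a guess on my side, not something the 3v4l runs prove) is that each country has only one or two keywords, so the linear scan has almost nothing to scan. Here is a rough micro-benchmark sketch with a much bigger list, where the difference should start to show. The sizes are arbitrary, and hrtime() needs PHP 7.3 or later.

```php
<?php
// Illustrative sizes only; they are not taken from the 3v4l runs.
$size = 50000;
$repeats = 200;

$list = [];
for ($i = 0; $i < $size; $i++) {
    $list[] = 'kw' . $i;
}
$map = array_flip($list);      // keyword => index, used as a hashmap
$needle = 'kw' . ($size - 1);  // last element: worst case for the linear scan

// Linear scan: in_array walks the list until it reaches the needle.
$start = hrtime(true);
$foundLinear = false;
for ($r = 0; $r < $repeats; $r++) {
    $foundLinear = in_array($needle, $list, true);
}
$linearNs = hrtime(true) - $start;

// Hashmap lookup: isset jumps straight to the key.
$start = hrtime(true);
$foundHash = false;
for ($r = 0; $r < $repeats; $r++) {
    $foundHash = isset($map[$needle]);
}
$hashNs = hrtime(true) - $start;

printf("in_array: %d ns, isset: %d ns\n", $linearNs, $hashNs);
```

Exact numbers depend on the machine and PHP version, so treat the output as a rough comparison only.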

References:

- [1] https://nikic.github.io/2014/12/22/PHPs-new-hashtable-implementation.html