Thursday 25 February 2016

Can Hash map be used in a multithreaded environment ?

Another very popular interview question.

Well the answer to this is NO.

The basic reason for this is that since hashmap being non-synchronized would not guarantee us atomicity and fail in put if absent type scenarios.

But apart from this is another major problem. The problem of hashmap getting stuck in infinite loop.

To understand this we need to brush up a bit on how hash map works in java (click here).

Now since we all must have understood by now that hash map is backed by an array and hash map is ideally supposed to maintain a get and put of complexity o(1).

Even in the most ideal scenario this can happen only when the Entry array has only 1 Entry object per location or in other words the number of Entry objects must be equal to the array length(16 initially).

Now since we all must have experienced that hash map can(infact mostly does) store more than 16 values.

So does it mean that after adding 16 Entry objects it fails to maintain the complexity of o(1) ?

Well not really.

Hash Map internally uses the logic that when the number of Entry Objects exceeds a certain value it recreats the array(twice the size), copies the existing elements from previous map to new map(by a concept called re-hashing).

Now that certain valueis know as load factor.

Now let us take a scenario.

There are two threads T1 and T2 which are trying to put a key in the map when it has reached its load factor.

The current state of the hash map is


In the above image we have assumed that 1, 2 and 3 have the same hash keys so they are in the same array location. Also we have assumed that there are 13 other Entry objects (not shown in figure.

Now let us see what happens when these thread try to alter the Map.

When Thread T1 access the map it creates a new array and starts to copy all the component in the new array. and inserts the first Entry(1,Test1) and next point to Entry(2,Test2) whose next is Entry(3,Test3).

Now the current structure of the hash bucket (singly linked list) at the array location (given in the above figure) is

Entry(1,Test1) next is Entry(2,Test2) for which next is Entry(3,Test3).

Now let as assume at the same time another thread T2 tries to write to map. Since the put method is not thread safe T2 would be allowed access to it.

When T2 would be accessing put method, again upon finding that the map is full would make a new array and start copying to that.

To copy the above hashbucket(linked list) the Thread to avoid transversing the list again and again inserts it in a reverse order that is the new state of array is:
Now lets come back to Thread T1.

Now it tries to insert Entry(2,Test2)(next for which should be Entry(3,Test3) but is Entry(1,Test1), as changed by T2. Now it will keep on looping between Entry(2,Test2) and  Entry(1,Test1) infinitely.

No comments:

Post a Comment