#Question about Cache lines + HashMap

1 messages · Page 1 of 1 (latest)

next granite
#

When accessing an Array, arr[0], then addresses close to it (arr[1], arr[2], arr[3], ...) will be loaded into the cache line as well.

But how does it work with a HashMap, does the same thing happen? Or is it the same for all Collections?

If so, then a lot of cache misses would occur when accessing a HashMap not in an orderly way, right?

-edit-
I have read that a typical cache line is 64 Bytes long (but can be 32-128 Bytes too).
So if we have a struct > 64 Bytes, then the Cache Line won't be able to load the adjacent addresses as well. Is this the exact reason as to why we are advised to split our data into multiple Components and not to put everything together like in OOP?

wraith fossil
#

A hash map is probably the worst container for cache coherency, because it uses a hash of the key (which is pseudo-unique for each item), resulting in random memory access

#

How exactly that works is very dependent on the implementation as there are many ways to do it, usually you would have a flat array of the buckets, then the buckets could be flat arrays, linked lists, or something else...

#

Generally you would only want to use a hash map when you have a large number of items because when you have a small number of items it can be faster to just check every item in an array, however convenience usually wins and as long as you don't use them in inner loops you'll be fine most of the time.

next granite
#

I see, thanks for clearing things up!

next granite
#

Just asking out of curiosity, since ultimately it all depends on the use case

wraith fossil
#

I would guess 100s, but really you should just use the hash map and only change it if it's a hot spot in the profiler

next granite
#

Yeah you are right, since using the hash map would make programming way easier

severe quest
#

there are benchmarks public