
Comments (4)

Amanieu avatar Amanieu commented on July 2, 2024

I think this is just a case of some operations not being properly inlined. The solution is to throw a bunch of #[inline] into map.rs and set.rs.

from hashbrown.
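For context, a minimal sketch of why `#[inline]` matters here: without `#[inline]` (or LTO), Rust does not consider non-generic functions from another crate for inlining, so the thin wrapper methods in map.rs and set.rs would pay full call overhead on every operation. The `Counter` type below is hypothetical, not hashbrown's API:

```rust
// A thin wrapper type, similar in spirit to how map.rs and set.rs wrap the
// raw table. The type and method names here are illustrative only.
pub struct Counter {
    value: u64,
}

impl Counter {
    #[inline]
    pub fn new() -> Self {
        Counter { value: 0 }
    }

    // `#[inline]` makes this one-line body available to downstream crates
    // for inlining; without it, each call crosses a real function boundary.
    #[inline]
    pub fn bump(&mut self) {
        self.value += 1;
    }

    #[inline]
    pub fn get(&self) -> u64 {
        self.value
    }
}

fn main() {
    let mut c = Counter::new();
    for _ in 0..3 {
        c.bump();
    }
    assert_eq!(c.get(), 3);
}
```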

LukasKalbertodt avatar LukasKalbertodt commented on July 2, 2024

Thanks for the quick reply and fix!

I just tried my original test with your master branch and indeed, the entry() version is faster now 🎉 (2.06s vs 2.35s; the test environment is slightly different from my first post, so those numbers are not comparable to the earlier ones). In the benchmark I posted above, I get the following results now:

test brown_entry  ... bench:  10,416,253 ns/iter (+/- 6,420,959)
test brown_manual ... bench:  16,704,658 ns/iter (+/- 8,961,890)
test fnv_entry    ... bench:  43,876,666 ns/iter (+/- 27,247,781)
test fnv_manual   ... bench:  39,413,669 ns/iter (+/- 22,134,246)
test std_entry    ... bench:  68,217,384 ns/iter (+/- 32,603,154)
test std_manual   ... bench:  86,265,556 ns/iter (+/- 32,434,485)

However, I think there is still room for improvement, at least in the VacantEntry case. (But I only skimmed the code, so I might be very wrong.) Consider this line:

map.entry(key).or_insert_with(|| produce_value())

Let's assume key is not yet in the map.

First, entry is called: it calculates the hash, performs the lookup, and creates a corresponding Entry value. Then or_insert_with is executed (which basically just calls VacantEntry::insert), which in turn leads to RawTable::insert. And here is where I think redundant work is done: this last insert performs the lookup again. The hash is only calculated once, because it is stored in the VacantEntry. But couldn't the bucket index also be stored in VacantEntry? That way the insert operation wouldn't need to search for a slot to insert into. Right?
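For reference, a small runnable example of the pattern under discussion, using std's HashMap (same Entry API shape as hashbrown): entry hashes and looks up the key once, and or_insert_with only runs the closure when the slot is vacant.

```rust
use std::collections::HashMap;

// Count word occurrences via the entry API: one hash + lookup per word,
// instead of a separate `get` followed by an `insert`.
fn count_words<'a>(words: &[&'a str]) -> HashMap<&'a str, u32> {
    let mut counts = HashMap::new();
    for &w in words {
        // The closure passed to `or_insert_with` is only invoked when the
        // key is absent (the VacantEntry case).
        *counts.entry(w).or_insert_with(|| 0) += 1;
    }
    counts
}

fn main() {
    let counts = count_words(&["a", "b", "a"]);
    assert_eq!(counts["a"], 2);
    assert_eq!(counts["b"], 1);
}
```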

Or is it just not worth caching the index? I only have limited knowledge of the underlying data structure. :/

Anyway, thanks again!


Amanieu avatar Amanieu commented on July 2, 2024

This is not possible because of the way lookups work. Since we scan groups of 16 elements in parallel using SIMD, we don't get a bucket index when a lookup fails. We have to perform a second scan which looks for slots that are either EMPTY or DELETED.
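To illustrate the two scans, here is a simplified scalar model (the real implementation checks 16 control bytes at once with SIMD; the constants and helper names below are assumptions for illustration, not hashbrown's actual code):

```rust
// SwissTable-style control bytes (illustrative values): EMPTY and DELETED
// are sentinels; any other byte is the low 7 bits of the hash (h2).
const EMPTY: u8 = 0x80;
const DELETED: u8 = 0xFE;

// Pass 1 (lookup): find a slot whose control byte matches h2. If no byte
// matches, the lookup fails without learning any usable bucket index.
fn find_match(ctrl: &[u8], h2: u8) -> Option<usize> {
    ctrl.iter().position(|&c| c == h2)
}

// Pass 2 (insert after a failed lookup): a separate scan for an EMPTY or
// DELETED slot. This is the work that cannot be skipped by caching an
// index in VacantEntry, because pass 1 never produced one.
fn find_insert_slot(ctrl: &[u8]) -> Option<usize> {
    ctrl.iter().position(|&c| c == EMPTY || c == DELETED)
}

fn main() {
    let ctrl = [0x12, DELETED, 0x55, EMPTY];
    assert_eq!(find_match(&ctrl, 0x55), Some(2)); // hit
    assert_eq!(find_match(&ctrl, 0x33), None); // miss: no index learned
    assert_eq!(find_insert_slot(&ctrl), Some(1)); // second scan finds DELETED
}
```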


LukasKalbertodt avatar LukasKalbertodt commented on July 2, 2024

Ah ok, that makes sense. Thank you :)

