Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up from_bitmap_bytes, and use less unsafe #1

Merged

Conversation

Dr-Emann
Copy link

@Dr-Emann Dr-Emann commented Sep 7, 2024

  • Directly create an array/bitmap store based on the count of bits rather than always making a bitmap, and converting to the correct store after
  • Use u64 words to count bits and enumerate bits (this appears to make a pretty big difference)
  • Store in little endian, then just fixup any touched words
  • Only use unsafe to reinterpret a whole bitmap container as bytes, everything else is safe
  • Allow adding bytes up to the last possible offset

On my machine (an m1 mac), this speeds up the benchmarks by between 16%-120% (probably based on how sparse the data is)

See RoaringBitmap#288

* Directly create an array/bitmap store based on the count of bits
* Use u64 words to count bits and enumerate bits
* Store in little endian, then just fixup any touched words
* Only use unsafe to reinterpret a whole bitmap container as bytes, everything
  else is safe
* Allow adding bytes up to the last possible offset
@Dr-Emann Dr-Emann force-pushed the dremann/from-bitmap-bytes branch from 9afd3d6 to 16e39fc Compare September 7, 2024 15:06
we can setting an initial value in that case
@Dr-Emann Dr-Emann force-pushed the dremann/from-bitmap-bytes branch from 16e39fc to ed8f0cb Compare September 7, 2024 15:08
@lemolatoon
Copy link
Owner

lemolatoon commented Sep 12, 2024

@Dr-Emann Thank you for sending patch! I have just reviewed your code.

  • It is surprising that using u64::count_ones speeds up so much. Nice.
  • Defining from_lsb0_bytes for each store makes much sense.
  • Converting endians after writing bytes seems to be good, making code clearer.

I'll merge this.

@lemolatoon lemolatoon merged commit aace6b8 into lemolatoon:from-bitmap-bytes Sep 12, 2024
@Dr-Emann Dr-Emann deleted the dremann/from-bitmap-bytes branch September 12, 2024 07:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants