perf(NODE-5934): replace DataView uses with bit math #649
Conversation
Force-pushed from 05b9dec to 6e70471.
Running on: Node.js v20.11.0
OS: linux
CPUs: 8
Arch: x64
RAM: 33.105743872 GB

64-bit floats show improvements; tested with a single value and with 1000 values:

```
=================== testing with { _id: 120.384 } bytes 18
do_nothing x 852,087,149 ops/sec ±0.04% (396 runs sampled)
bson_deserialize_current_main x 3,509,120 ops/sec ±0.11% (395 runs sampled)
bson_deserialize_bits_pr x 6,246,320 ops/sec ±0.39% (391 runs sampled)
bson_serialize_current_main x 2,481,998 ops/sec ±0.39% (391 runs sampled)
bson_serialize_bits_pr x 2,691,455 ops/sec ±0.14% (393 runs sampled)

=================== testing with an object with 1000 keys all set to 120.384, bytes 12895
do_nothing x 852,260,397 ops/sec ±0.04% (396 runs sampled)
bson_deserialize_current_main x 15,682 ops/sec ±0.08% (395 runs sampled)
bson_deserialize_bits_pr x 15,014 ops/sec ±0.06% (395 runs sampled)
bson_serialize_current_main x 10,893 ops/sec ±0.22% (395 runs sampled)
bson_serialize_bits_pr x 16,187 ops/sec ±0.11% (395 runs sampled)
```

64-bit integers also improved:

```
=================== testing with { _id: -120n } bytes 18
do_nothing x 852,427,659 ops/sec ±0.04% (396 runs sampled)
bson_deserialize_current_main x 2,602,237 ops/sec ±0.09% (394 runs sampled)
bson_deserialize_bits_pr x 4,762,447 ops/sec ±0.17% (395 runs sampled)
bson_serialize_current_main x 2,382,100 ops/sec ±0.14% (393 runs sampled)
bson_serialize_bits_pr x 2,547,732 ops/sec ±0.12% (394 runs sampled)

=================== testing with an object with 1000 keys all set to -120n, bytes 12895
do_nothing x 851,099,356 ops/sec ±0.06% (396 runs sampled)
bson_deserialize_current_main x 3,764 ops/sec ±0.07% (395 runs sampled)
bson_deserialize_bits_pr x 9,259 ops/sec ±0.12% (394 runs sampled)
bson_serialize_current_main x 10,604 ops/sec ±0.10% (395 runs sampled)
bson_serialize_bits_pr x 16,260 ops/sec ±0.12% (395 runs sampled)
```
Force-pushed from 6e70471 to 4d69b12.
Looks freaking good! I have a crazy load of work right now, but I was thinking of another possible improvement: we use `ByteUtils.allocate` in multiple places where we could use the faster `allocateUnsafe` function, since we immediately set all the bytes. I don't know how much improvement it would make, but it's safe to use in cases like this:

```js
const bytes = ByteUtils.allocate(16);
// Copy the next 16 bytes into the bytes buffer
bytes.set(buffer.subarray(index, index + 16), 0);
```

Or:

```js
} else if (elementType === constants.BSON_DATA_OID) {
  const oid = ByteUtils.allocate(12);
  oid.set(buffer.subarray(index, index + 12));
  value = new ObjectId(oid);
  index = index + 12;
}
```
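For context, the difference this suggestion relies on can be sketched with Node's own `Buffer` API (an illustration, not the library's actual `ByteUtils` implementation): `allocUnsafe` skips zero-filling the new buffer, which is only safe when every byte is overwritten before the buffer is read, as it is here. The helper name is hypothetical.

```javascript
// Sketch: Buffer.allocUnsafe returns uninitialized memory, skipping the
// zero-fill that Buffer.alloc performs. That is safe only when the caller
// overwrites every byte immediately, as when copying an ObjectId's 12 bytes.
// Node.js only; there is no web-platform equivalent of allocUnsafe.
function copyObjectIdBytes(buffer, index) {
  const oid = Buffer.allocUnsafe(12); // uninitialized 12 bytes
  oid.set(buffer.subarray(index, index + 12)); // every byte overwritten here
  return oid;
}
```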
@billouboq Good idea, I'll file a ticket about it. I do wish there was a web equivalent, but we can incorporate it for Node at least.
Force-pushed from 8d99e9b to 46b5060.
LGTM
Description
What is changing?
We keep a Float64Array and a Uint8Array referencing the same ArrayBuffer. With eight lines of assignments we can copy the bytes from the input BSON into the Uint8Array, then use the Float64Array to interpret them as a little-endian double.
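The shared-view technique can be sketched as follows. This is an illustrative version, not the library's exact code, and it assumes a little-endian platform (the byte order BSON uses for doubles):

```javascript
// Both views alias the same 8 bytes: writing bytes through FLOAT_BYTES
// changes what FLOAT[0] reads, and vice versa. No DataView is needed.
const FLOAT = new Float64Array(1);
const FLOAT_BYTES = new Uint8Array(FLOAT.buffer, 0, 8);

// Deserialize direction: interpret 8 little-endian bytes as a double.
// Assumes the host is little-endian, matching BSON's on-the-wire order.
function bytesToDouble(buffer, offset) {
  FLOAT_BYTES[0] = buffer[offset];
  FLOAT_BYTES[1] = buffer[offset + 1];
  FLOAT_BYTES[2] = buffer[offset + 2];
  FLOAT_BYTES[3] = buffer[offset + 3];
  FLOAT_BYTES[4] = buffer[offset + 4];
  FLOAT_BYTES[5] = buffer[offset + 5];
  FLOAT_BYTES[6] = buffer[offset + 6];
  FLOAT_BYTES[7] = buffer[offset + 7];
  return FLOAT[0];
}
```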
For bigints we can split the value into two 32-bit halves and use JavaScript's truncation behavior to pull out the bytes as we shift down through each segment of the number.
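The bigint half of the change can be sketched like this; `int64ToBytes` is a hypothetical name for illustration, not the library's actual helper:

```javascript
// Sketch: serialize a 64-bit BigInt as 8 little-endian bytes by splitting
// it into two unsigned 32-bit halves, then peeling off one byte at a time
// with shifts and masks instead of DataView.setBigInt64.
function int64ToBytes(value) {
  const masked = BigInt.asUintN(64, value); // two's-complement wrap to u64
  const lo = Number(masked & 0xffffffffn);  // low 32 bits as a Number
  const hi = Number(masked >> 32n);         // high 32 bits as a Number
  return Uint8Array.of(
    lo & 0xff, (lo >>> 8) & 0xff, (lo >>> 16) & 0xff, (lo >>> 24) & 0xff,
    hi & 0xff, (hi >>> 8) & 0xff, (hi >>> 16) & 0xff, (hi >>> 24) & 0xff
  );
}
```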
Is there new documentation needed for these changes?
No
What is the motivation for this change?
Creating DataViews has an impact on performance, and we can easily avoid using them altogether. Generally, we see better performance the more of this logic we write directly in JavaScript.
Benchmarks
See below
Release Highlight
Improved the performance of serializing and deserializing doubles and bigints
We now use bit shifting and multiplication in place of DataView getX/setX calls to parse and serialize bigints, and a Float64Array to convert a double to bytes. This change has been shown to increase deserialization performance ~1.3x and serialization performance ~1.75x.
Double check the following
- Ran the `npm run check:lint` script
- PR title follows the format `type(NODE-xxxx)[!]: description`
  - Example: `feat(NODE-1234)!: rewriting everything in coffeescript`