
Optimize Player. #470

Open
bigeasy opened this issue Feb 8, 2015 · 3 comments

@bigeasy
Owner

bigeasy commented Feb 8, 2015

Player has three potential bottlenecks: checksum generation, deserialization, and splicing. Checksum generation and verification is something we do that others might not, so we are going to pay a price for those checksums in the shootouts. Checksums currently require slices. Maybe it would be faster to use a JavaScript checksum modified to work with buffers and ranges. If it comes down to all benchmarks all the time, then we're going to want to turn them off in Locket, or else wait until they save someone's bacon.
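A checksum that walks a buffer range directly could avoid the slices entirely. A minimal sketch, assuming a DJB2-style hash; the function name and signature are invented for illustration, not Locket's API:

```javascript
// Hypothetical sketch: a DJB2-style hash that walks a Buffer between
// `start` and `end` directly, so no intermediate slice is allocated.
function djb2 (buffer, start, end) {
    let hash = 5381
    for (let i = start; i < end; i++) {
        // hash * 33 + byte, kept in the unsigned 32-bit range
        hash = ((hash << 5) + hash + buffer[i]) >>> 0
    }
    return hash
}

// Checksum a region of a record without calling `buffer.slice`.
const record = Buffer.from('header\nbody\n')
const sum = djb2(record, 0, record.length)
```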

Deserialization might be improved by using a binary file format. Maybe it is enough to scan for the end of the header array explicitly. Maybe it is important to go binary. You could create a translator that would read a binary file format and emit your pretty text format. Binary means you could use a Packet compiler and have a best-foot-forward reader. It would be easier to checksum without having to read into buffers.
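Scanning for the end of the header explicitly might look something like this sketch, which finds the newline with `Buffer.indexOf` and decodes only the header line; `readHeader` and the layout are assumptions for illustration, not the actual file format:

```javascript
// Hypothetical sketch: locate the end of a JSON header line explicitly
// with `Buffer.indexOf` instead of decoding the whole buffer to a string.
function readHeader (buffer) {
    const eol = buffer.indexOf(0x0a) // '\n'
    if (eol === -1) return null
    return {
        // only the header line is decoded and parsed
        header: JSON.parse(buffer.toString('utf8', 0, eol)),
        // the body stays a raw Buffer
        body: buffer.slice(eol + 1)
    }
}
```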

You might want to consider making record serialization pluggable.
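Pluggable record serialization could be as small as an object with a `serialize`/`deserialize` pair, defaulting to JSON. A sketch under that assumption; the option name and shape are invented, not Locket's actual API:

```javascript
// Hypothetical default serializer: JSON in, JSON out.
const jsonSerializer = {
    serialize: (record) => Buffer.from(JSON.stringify(record)),
    deserialize: (buffer) => JSON.parse(buffer.toString('utf8'))
}

// A player that accepts a user-supplied serializer.
function createPlayer (options = {}) {
    const serializer = options.serializer || jsonSerializer
    return {
        write: (record) => serializer.serialize(record),
        read: (buffer) => serializer.deserialize(buffer)
    }
}
```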

I'm concerned that the Magazine lookup on each splice is costing too much. The Cartridge could be stored in the page, so adjusting heft could be done without the lookup.
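Caching the cartridge on the page would let heft adjustments skip the per-splice map lookup. A sketch of the idea only; the `hold`/`splice` shapes mirror the discussion, not actual Magazine internals:

```javascript
// Hypothetical magazine: a map from page id to cache entry ("cartridge").
const magazine = new Map()

function hold (id) {
    let cartridge = magazine.get(id)
    if (cartridge == null) {
        cartridge = { heft: 0, value: { id, items: [], cartridge: null } }
        // back-reference stored on the page itself
        cartridge.value.cartridge = cartridge
        magazine.set(id, cartridge)
    }
    return cartridge
}

function splice (page, items, heft) {
    page.items.push(...items)
    // adjust heft through the cached cartridge, no `magazine.get` per splice
    page.cartridge.heft += heft
}
```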

Reading the buffer from file seems to happen quickly enough and there is little that I can do to make Node.js load a file faster. All performance must be squeezed from the synchronous buffer reading loop.

@bigeasy bigeasy self-assigned this Feb 8, 2015
@bigeasy bigeasy added this to the Performance milestone Feb 8, 2015
@bigeasy
Owner Author

bigeasy commented Feb 8, 2015

A binary format means no scanning and then reading, just reading.
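A length-prefixed frame makes the "just reading" point concrete: the reader jumps from record to record by arithmetic instead of scanning for delimiters. A sketch with an assumed layout (32-bit big-endian length, then payload), not the actual on-disk format:

```javascript
// Hypothetical binary frame: [uint32 length][payload].
function writeFrame (payload) {
    const frame = Buffer.alloc(4 + payload.length)
    frame.writeUInt32BE(payload.length, 0)
    payload.copy(frame, 4)
    return frame
}

// Read every frame out of a buffer: no scanning, just length-guided reads.
function readFrames (buffer) {
    const records = []
    let offset = 0
    while (offset < buffer.length) {
        const length = buffer.readUInt32BE(offset)
        records.push(buffer.slice(offset + 4, offset + 4 + length))
        offset += 4 + length
    }
    return records
}
```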

@bigeasy
Owner Author

bigeasy commented Feb 9, 2015

Two things will make Locket slow compared to LevelDB; these are two things that Locket does that LevelDB does not. First, the checksums; I'm not sure that LevelDB does such a thing. Second, the UTF-8 file format, which is easy to read: should there be some corruption, it would be easy to open the leaves in a file. The text format is going to be even worse in Locket, since I must also parse the separation of key and value; going binary would mean a key length needs to be stored. This would all probably go much faster if it were all binary.

To stand up in a shootout, we're going to need to turn off checksums and use a binary parser. These can be switches: a switch to turn the checksums on, and a switch to use the UTF-8 format, so that, by default, we have no checksums and use a binary format.
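The proposed switches might select the checksum and framing strategy once, up front, so the hot path never branches per record. A sketch only; the option names and the `createReader` shape are invented for illustration:

```javascript
// Hypothetical configuration: checksums off and binary framing by default.
function createReader (options = {}) {
    const checksum = options.checksum
        // DJB2-style hash when checksums are switched on
        ? (buffer) => buffer.reduce((hash, byte) => ((hash << 5) + hash + byte) >>> 0, 5381)
        // no-op checksum when they are off
        : () => 0
    const frame = options.utf8 ? 'utf8' : 'binary'
    return { checksum, frame }
}
```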

bigeasy pushed a commit that referenced this issue Feb 9, 2015
insert: 106, gather: 75

See #470.
bigeasy pushed a commit that referenced this issue Feb 9, 2015
insert: 107, gather: 74

See #470.
Closes #465.
@bigeasy
Owner Author

bigeasy commented Feb 9, 2015

No improvement from moving slice into Page and caching the cartridge in the Page. That leaves checksums and a binary file format.

bigeasy pushed a commit that referenced this issue Feb 11, 2015
Also created different checksum formats. SHA1 from `crypto` is still
faster than non-cryptographic checksums written in JavaScript.

See #474.
Closes #471.
See #470.
bigeasy pushed a commit that referenced this issue Feb 12, 2015
JavaScript DJB is faster than `crypto` SHA1. The binary frame is faster
than the UTF8 frame, of course. There is still room for improvement in
read performance by calling `splice` directly on the items, or at least
removing the invocation of `reduce` in `Page.splice`.

insert: 61, gather: 36

See #470.
bigeasy pushed a commit that referenced this issue Feb 12, 2015
Use a `for` loop instead. It is faster, especially considering that the
list of items is often a list of one.

insert: 60, gather: 34

See #470.
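The change described in the commit above amounts to something like this sketch: summing heft with a plain `for` loop instead of `reduce`, which avoids a function call per item and wins when the array usually holds a single element. `heftOf` is an illustrative name, not the actual code:

```javascript
// Hypothetical replacement for `items.reduce((sum, item) => sum + item.heft, 0)`.
function heftOf (items) {
    let heft = 0
    for (let i = 0, I = items.length; i < I; i++) {
        heft += items[i].heft
    }
    return heft
}
```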
@bigeasy bigeasy added the ready label Feb 18, 2015