Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement counted b-trees. #10

Open
bigeasy opened this issue May 17, 2012 · 6 comments
Open

Implement counted b-trees. #10

bigeasy opened this issue May 17, 2012 · 6 comments
Assignees

Comments

@bigeasy
Copy link
Owner

bigeasy commented May 17, 2012

Counted b-trees means that we keep a count for every page, but the user will have to pay for the count, it will be calculated.

When we open a b-tree, we visit all of the leaf pages. All of them. However, we only read from the back of the page to find the the address array so that we can read the address of the next page and the count of records in the file.

Alternatively, we can append this information every time we write a record to the page, or we can append the file position of the last position array written.

You know, that's not a bad idea. As long as we've made the file brittle with file positions, we may as well capitalize on the benefits. Once you've gone and editing a file in Vim, you've pretty much thrown everything off anyway. There will be utilities to tidy a file that you recover in a text editor, essentially a compiler, but once you start poking around with the file, it has become source code. The checksums are off, the file positions are off, etc.

The compiler can play the leaf page as a log and rewrite, reporting failed checksums, but not entirely freaking out and dying, because it's purpose is to rebuild hand edited leaf files. Something that no one should ever do, but possibly a great recovery feature, that you can extract all your data as plain JSON.

@ghost ghost assigned bigeasy May 17, 2012
@bigeasy
Copy link
Owner Author

bigeasy commented May 18, 2012

File positions are useless because a position array is the count of records at a point in time. You need to move forward through the file in order to get the actual count.

We need to append the count with every record.

Also, the page will only change when the file is rewritten, so we can probably put that in the first record, so we can load our counts by reading the first and last record of every leaf page in the database.

@bigeasy
Copy link
Owner Author

bigeasy commented May 19, 2012

We're recording the address of the right sibling leaf page in the position array entry. We need to keep it there. When we load the leaf, we move backward until we find it. We don't want to also have to open the front of the file as well, that becomes to jittery.

We simply write a position array entry at the head of the file, simply to record the right sibling leaf page.

@bigeasy
Copy link
Owner Author

bigeasy commented May 20, 2012

Our counted b-tree implementation requires that we crack every leaf file when we reopen an index, to get the counts from the file. If we're going to do this, get all the counts, we can take this opportunity to check our balance, see if there are any potential splits or merges that might have been interrupted, or never executed because of a hard shutdown.

@bigeasy
Copy link
Owner Author

bigeasy commented May 20, 2012

Can we go faster on load if either the sibling is in the file name, or else the file has an extension if it is a leaf? Sibling in filename would make loading difficult, need to pattern match to find the file. Extension means we can load the counts by looking at the end of every file, without having to look at the front of the file to find the next leaf.

bigeasy pushed a commit that referenced this issue May 25, 2012
We not store a checksum of each line of JSON in branch page file and
leaf page files. Until we implement a CRC 32, we're using SHA1, which is
available through the `crypto` Node.js module.

The checksum is appended as a hexadecimal number after the JSON string
on each line of a branch page file or a leaf page file. Closes #9.

The test fixtures have been updated to read and write b‑tree file
directories containing branch page files and leaf page files with
checksummed lines of JSON.

Each leaf page entry now includes an entry number. This can be used to
verify the integrity of the file, to insure that no entries are missing.
The entry number should increase by one with each entry. A break in the
series indicates corruption. Closes #25.

We added this because we checksum each line of a leaf page file, but we
do not checksum the entire leaf page file. We know that each line is
correct, but without the entry count, we don't know if any lines or
missing, repeated, or out of order.

Each insert and delete entry now includes the record count of the leaf
page including the effect of the entry. This was added in anticipation
of counted our b‑tree implementation described in #10.
Closes #26.

Each position array entry now includes a leaf page file format version
number, because changes to file format are inevitable. Current version
number is `1`. Closes #28.

A position array entry is always the first entry of a leaf page file.
We'll be able to look at the head of the file to find the right sibling
leaf page. This is part of a future counted b‑tree strategy of
skimming leaf pages at startup to read leaf page record counts.
Closes #29.

Removed a long chunk of exposition on the editablity of leaf page files.
They're not really meant for hacking with `vim`. They can be managed
with `git`, viewed with `less`, watched with `tail -f`, and munged using
a Node.js program of the adopter's design. We store file positions
however, so making an edit in your favorite text editor is likely to
throw those file positions off.

Part of the value of JSON and plain text is the visibility. Part of the
value is that an application developer can restore corrupt files using
utilities written in response to a specific instance of corruption,
informed by the visibility. The data is still there. You can read
through a corrupted file, recovering what can be recovered, inserting it
into a new b‑tree.

Also, I noticed that I was not adding the dot to the file suffixes when
rewriting files in create. I added the dots.
@bigeasy
Copy link
Owner Author

bigeasy commented Aug 19, 2013

This is becoming an externalized implementation, through a recipe, just like null keys or duplicate keys. One can use a version number and MVCC to implement counted b-trees in an external tree. I'm not sure it belongs here, or in Memento, but I'll leave it here for now.

@bigeasy
Copy link
Owner Author

bigeasy commented Nov 26, 2015

Counting can be implemented by writing the count increase for each leaf page when the leaf page is locked. The count log for a leaf page would increment or decrement the count for each branch page visited during the descent. The count for each page would be kept in memory, in a hash table or in Magazine so it would be updated synchronously.

However, as I consider this I begin to realize that a counted b-tree internally to Strata doesn't matter so much. What good does it do if you're using the MVCC library to create a versioned tree?

As a user, you could use the lock you have on a leaf to update your housekeeping. You could then perform housekeeping by getting a mutator lock on the leaf, then updating the counts on all the pages. But, to really use this, it must be somewhat internal, so you can get a count based on a range of values, for example.

Then how do you version the count? This is a problem for a higher level implementation. The thought I'm trying to capture is that you can use the lock on the leaf to log changes, keep he counts in memory, then have a writer thread log all the branch counts. It manages the compexity in the same way that the balancer is only one to operate on branches.

bigeasy pushed a commit that referenced this issue Sep 21, 2020
bigeasy pushed a commit that referenced this issue Sep 23, 2020
 * Use new `Journalist.drain()` in shutdown. See #662.
 * Implement `Strata.drain()`, Cursor 100% coverage. Closes #662.
 * Remove ghosts. Closes #580. Closes #660.
 * Calculate and return heft on insert/remove. Closes #661.
 * Notes on counts and Merkle. See #10.

Closes #663.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant