Head tracking batch #417
base: main
Conversation
@dipkakwani Maybe I should have been more explicit about it. This is a sketch of the idea. I wanted to gather some input while running the benchmarks with the current setup.
{
    // the 0th page will have the properly number set to first free page
    _roots[0].Data.NextFreePage = DbAddress.Page(_historyDepth);
    // The start root must have the properly number set to first free page
Nit: requires grammar fix.
Can you suggest a nice way of stating this?
Not exactly sure, but do you mean to state:
"The start root's NextFreePage must point to the very first free page"?
@@ -115,13 +112,14 @@ private PagedDb(IPageManager manager, byte historyDepth)
_lastWriteTxBatch = _meter.CreateAtomicObservableGauge($"Last written {BatchIdName}", BatchIdName,
    "The last ");
Issue in the existing code (one line above): _lastWriteTxBatch description is incomplete.
_pageTable.Remove(at);
_pageTableReversed.Remove(actual);

ref var cached = ref GetPageTableCacheSlot(at);
Should this cache be checked before the _pageTable?
If it's present in the cache, we can be sure it exists in _pageTable and can directly remove it from both places. If it's not available, we have to read the _pageTable to find out.
I find it easier to reason about as it's structured now: in the current take, we treat _pageTable as the source of truth and the cache only as an optimization. The time spent in this part is small compared to, for example, the application time. Also, most likely every page that is looked up will be found, so I'm not sure it's worth doing it the other way around. Maybe we could optimize by always calling .Remove and fixing things up when something was removed incorrectly? Still, the measured overhead was vanishingly small.
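The removal order discussed here can be sketched language-agnostically (Python below; the class name, slot count, and modulo slot hashing are illustrative assumptions, not Paprika's actual C# types): the table is consulted first as the source of truth, and the cache slot is only invalidated afterwards.

```python
# Sketch: the page table is the source of truth; a fixed-size array cache sits
# in front purely as a lookup optimization and may hold stale or absent entries.

CACHE_SLOTS = 1024  # illustrative size


class CachedPageTable:
    def __init__(self):
        self.table = {}                    # address -> page (source of truth)
        self.cache = [None] * CACHE_SLOTS  # (address, page) slots

    def _slot(self, address):
        return address % CACHE_SLOTS

    def set(self, address, page):
        self.table[address] = page
        self.cache[self._slot(address)] = (address, page)

    def get(self, address):
        entry = self.cache[self._slot(address)]
        if entry is not None and entry[0] == address:
            return entry[1]                # cache hit
        return self.table.get(address)     # fall back to the source of truth

    def remove(self, address):
        removed = self.table.pop(address, None)  # table first: authoritative
        entry = self.cache[self._slot(address)]
        if entry is not None and entry[0] == address:
            self.cache[self._slot(address)] = None  # then drop the cache slot
        return removed
```

Because the table is authoritative, a stale or evicted cache slot can never cause a wrong answer; checking the cache first would only change where the lookup cost is paid, which matches the "overhead was vanishingly small" observation above.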
private readonly ReaderWriterLockSlim _readerLock = new();

// Proposed batches that are finalized
private readonly HashSet<Keccak> _beingFinalized = new();
With this new approach, we are storing a lot of data in memory, and it is not fetched from the buffer pool/existing set of DB pages. A couple of general questions:
- How big can this grow? What do the benchmarks show?
- What worst-case scenario can arise in case of a crash, since we would lose everything in memory that had not been checkpointed?
- We copy/store only pages that are written. This means roughly ~2000 pages per block, which gives ~8MB per block. Multiplied by the estimated number of non-finalized blocks (~64), this gives us ~500MB of memory. From what I recall when running a node with it, it was much lower though.
- We lose what was not finalized. The worst scenario is that once the node is restarted, it gets the last remembered hash and processes what is missing. This is no different from the current behavior.
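The back-of-the-envelope memory estimate above works out if we assume 4 KB pages (an assumption; the comment states only the page counts and the totals):

```python
# Worst-case in-memory footprint estimate, using the figures from the comment
# above and an assumed page size of 4 KB.
PAGE_SIZE_BYTES = 4 * 1024   # assumed page size
PAGES_PER_BLOCK = 2000       # "~2000 pages per block"
NON_FINALIZED_BLOCKS = 64    # "~64" estimated non-finalized blocks

per_block_mb = PAGES_PER_BLOCK * PAGE_SIZE_BYTES / (1024 * 1024)
total_mb = per_block_mb * NON_FINALIZED_BLOCKS

print(per_block_mb)  # ~8 MB per block (7.8125 exactly)
print(total_mb)      # ~500 MB across ~64 non-finalized blocks
```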
}
}

private class MultiHeadChain : IMultiHeadChain
It would be good to add a descriptive comment here to summarize this class/approach. In general, some of the classes/interfaces in this file are missing descriptions.
This PR introduces the notion of an IMultiHeadChain. The multi-head approach enables creating multiple heads of the blockchain that properly track the recent state. If no reorganization happens, the same IHead is used over and over for a prolonged period of time, while ensuring that all reads are served at the speed of reading the raw underlying database. This is done by applying COW not only at the underlying database level but also in the head, with the help of a PageTable. With this we have:

ancestors' notion
The reusability is addressed by reusing the head-tracking branch through multiple blocks. Every time a block is committed, its state is applied to the internal state of HeadTrackingBatch in a specific manner, allowing tracking of blocks. HeadTrackingBatch has no notion of ancestors, as it keeps all the modified pages in a simple lookup that provides a squashed view of the world. No BitFilter is required, as data are always queried as if they came from the database. The only difference is that some pages are represented as modified in-memory copies. These copies are eventually applied to disk, but for the sake of lookups they are treated as if they had already been applied.

IHead
The idea is implemented by introducing one more level of abstraction over a batch. Before this PR, only a single PageDb.Batch could exist. This PR introduces a new type of batch, called IMultiHeadChain. From the chain you can get an IHead that accumulates changes in memory, meaning that at a given batch number, multiple propositions can exist before they are committed or discarded. To make it work, the head introduces two dictionaries that allow resolving the Page <-> DbAddress mapping in both directions. They are used to track pages that were already overwritten. It's possible that a given page will be overwritten multiple times, so the mappings keep only the last version. The dictionary-based mapping turned out to be costly (it showed up in the profiling). To address this, a dummy array-based lookup is provided in front of it.

ProposedBatch
A simple construct of a ProposedBatch is used to track:

- (DbAddress, Page) tuples
- the RootPage
These proposed batches are created every time HeadTrackingBatch is committed, by selecting the pages that were updated during the given batch. Once finality is reached, proposed batches with block numbers lower than the finalized one are applied to the database. The application should be fast and almost instantaneous, as it copies payloads to the proper pages, updates the RootPage, and decreases the counter on the ProposedBatch (counter-based management, similar to the blockchain).

Readers
Readers manage read access. They are created whenever a block is committed or finalized and are simply retrieved by the client code. No readers should be created on hot paths!
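The counter-based lifetime and finalization of proposed batches described above can be sketched as follows (Python for illustration; the names, the reference-counting helpers, and the dict-backed database are assumptions, not the actual C# implementation):

```python
class ProposedBatch:
    """Tracks (address, page) tuples and a root; reclaimed when its counter hits zero."""

    def __init__(self, block_number, pages, root_page):
        self.block_number = block_number
        self.pages = pages        # list of (db_address, page_copy) tuples
        self.root_page = root_page
        self.refs = 1             # counter-based lifetime, similar to the blockchain

    def acquire(self):
        self.refs += 1

    def release(self):
        self.refs -= 1
        return self.refs == 0     # True -> safe to reclaim the page copies


def finalize(batches, finalized_block, database):
    """Apply batches with block numbers lower than the finalized one to the database."""
    still_proposed = []
    for batch in batches:
        if batch.block_number < finalized_block:
            for address, page in batch.pages:  # copy payloads to the proper pages
                database[address] = page
            batch.release()                    # decrease the counter
        else:
            still_proposed.append(batch)
    return still_proposed
```

The application is just a sequence of page copies plus a root update, which is why the description above expects it to be almost instantaneous.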
Benchmarks
Some areas of interest:

- BeforeCommit, which represents Merkleization
- CommitImpl, which applies the changes to the in-memory representation; potentially can be offloaded
- Commit, which registers the proposal
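Tying the sections together, the head's "squashed view" (in-memory copies of modified pages consulted transparently in front of the database, with no BitFilter needed) can be sketched like this (Python for illustration; the class and the dict-backed store are assumptions, not Paprika's types):

```python
class Head:
    """A head batch: reads fall through to the database unless the page was modified."""

    def __init__(self, database):
        self.database = database  # underlying page store (address -> page payload)
        self.modified = {}        # squashed view: only the last copy per address

    def write(self, address, page):
        # COW at the head level: overwriting keeps only the latest version
        self.modified[address] = page

    def read(self, address):
        # Queried as if it came from the database; in-memory copies shadow disk
        if address in self.modified:
            return self.modified[address]
        return self.database.get(address)
```

A read never needs to know which ancestor block modified a page: the single lookup either hits the squashed in-memory copy or falls through to the raw database, matching the "no notion of ancestors" point above.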