
feat: add Filter to TrieWalker #1598

Merged: 5 commits into ethereum:master on Dec 9, 2024

Conversation

@KolbyML (Member) commented Dec 5, 2024

What was wrong?

Walking a whole state trie is very time-consuming.

How was it fixed?

By making it possible to filter the trie so that only a slice of it is walked. This PR will be used in my follow-up PR, which adds a random-walk state snapshot mode.

That mode will be useful for the random-slice state snapshot bridge mode I will be adding.

@KolbyML KolbyML requested review from morph-dev and removed request for morph-dev December 5, 2024 05:06
@KolbyML KolbyML self-assigned this Dec 5, 2024
@KolbyML KolbyML marked this pull request as draft December 5, 2024 05:06
@KolbyML KolbyML requested a review from morph-dev December 5, 2024 05:13
@KolbyML KolbyML force-pushed the add-random-snapshot-gossip branch from 84dba35 to 2568b79 on December 5, 2024 05:15
@KolbyML KolbyML marked this pull request as ready for review December 5, 2024 05:29
@morph-dev (Collaborator) left a comment:

A few minor comments. Nothing deal-breaking, but I would say they are nice improvements.

};
}

let slice_size = U256::MAX / U256::from(slice_count);
morph-dev (Collaborator):

I don't think this logic is accurate. Let's see what happens if slice_count is a power of 2 (the most likely value to be used), e.g. 16.

Because U256::MAX = 0xffff..ff, slice_size = 0x0fff..ff, leading to the following:

  • first slice would be [ 0x0000..00 - 0x0fff..fe]
  • second slice would be [0x0fff..ff - 0x1fff..fd]
  • third slice would be [0x1fff..fe - 0x2fff..fc]
  • ...
  • last slice would be [0xefff..f1 - 0xffff..ff]

This would make the last slice bigger than the rest, but that's not the main issue. The main issue is that slices break in unexpected places, and the hexary trie is not split into 16 equal parts as expected.


To fix it, I think we should do: let slice_size = U256::MAX / U256::from(slice_count) + 1.

This works perfectly when slice_count is a power of two.

If slice_count is not a power of two, we know that some slices will be bigger than the rest, and the trie will be broken up in a weird way anyway.

In your implementation the last one would be bigger than the rest, and in mine it would be smaller. In both cases the difference shouldn't be more than slice_count, so I see both of them as equally good solutions.
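
For illustration, a minimal Rust sketch of the boundary math with the suggested + 1 fix (names here are assumed for the example, not the PR's exact code):

use alloy::primitives::U256;

/// Inclusive [start, end] bounds of slice `slice_index` out of
/// `slice_count` equal slices of the 256-bit key space.
fn slice_bounds(slice_index: u16, slice_count: u16) -> (U256, U256) {
    assert!(slice_index < slice_count);
    // With the `+ 1` fix, slice_size for slice_count = 16 is 0x1000..00,
    // so slices break exactly on nibble boundaries.
    let slice_size = U256::MAX / U256::from(slice_count) + U256::from(1);
    let start = slice_size * U256::from(slice_index);
    // For a non-power-of-two slice_count the last slice is slightly
    // smaller; saturating_add clamps its end to U256::MAX.
    (start, start.saturating_add(slice_size - U256::from(1)))
}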

}

impl Filter {
pub fn new_random_filter(slice_count: u16) -> Self {
morph-dev (Collaborator):

I would let the caller control which slice is being created. Meaning, I would change this function to:

pub fn new(slice_index: u16, slice_count: u16) -> Self {
    assert!(slice_index < slice_count, "...");
    ...
}

This way, we can also add more tests and make sure that slice splitting works correctly for non-trivial cases (and prevent the issue I explained in the other comment).

At least, we should have a function like this. And if you want to keep new_random_filter, then we can do something like:

pub fn random(slice_count: u16) -> Self {
    Self::new(thread_rng().gen_range(0..slice_count), slice_count)
}
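
For reference, a usage sketch of the suggested API (names assumed from the snippets above; random relies on rand's thread_rng and gen_range):

let fourth_slice = Filter::new(3, 16);  // deterministic, easy to test
let random_slice = Filter::random(16);  // slice_index drawn from 0..16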

assert!(filter.is_included(&[0x1, 0x5, 0x5]));
assert!(!filter.is_included(&[0x1, 0x5, 0x4]));
assert!(filter.is_included(&[0x2]));
assert!(filter.is_included(&[0x3]));
morph-dev (Collaborator):

nit: I would add one more example:

assert!(!filter.is_included(&[0x3, 0x0, 0x1]));

}

fn partial_prefix(prefix: U256, shift_amount: usize) -> B256 {
prefix
morph-dev (Collaborator):

Any reason not to do one of the following:

  1. prefix | (U256::MAX << shift_amount) - I think this is the cleanest.
     I would go one step further and say that it is not clear what shift_amount means. Instead, I would use the opposite value:

     let prefix_length = 4 * path.len();
     let partial_prefix_mask = !(U256::MAX >> prefix_length);
     let partial_start_prefix = self.start_prefix & partial_prefix_mask;
     let partial_end_prefix = self.end_prefix & partial_prefix_mask;

  2. (prefix >> shift_amount) << shift_amount - same idea as yours, but simpler.

In either case, this is simple enough that I would inline it below.

KolbyML (Member, Author) replied:

I chose 2, as it is simpler to understand.
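
As a small illustration of option 2 (assumed names, not the PR's exact code), it clears the low bits so only the nibbles walked so far take part in the comparison:

use alloy::primitives::U256;

fn truncate_to_prefix(value: U256, nibble_count: usize) -> U256 {
    // 4 bits per nibble; nibble_count is assumed to be in 1..=64 here.
    let shift_amount = 256 - 4 * nibble_count;
    (value >> shift_amount) << shift_amount
}

// e.g. truncate_to_prefix(0xabcd..00, 2) == 0xab00..00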


#[derive(Debug, Clone)]
pub struct Filter {
start_prefix: U256,
morph-dev (Collaborator):

nit: why do we call this _prefix? Wouldn't start and end be sufficient and more correct?

KolbyML (Member, Author) replied:

sure

B256::from_slice(&compress_nibbles(key_path))
}

pub fn partial_nibble_path_to_right_padded_b256(partial_nibble_path: &[u8]) -> B256 {
morph-dev (Collaborator):

Not a big fan of converting them to Vec and then to B256.

How about:

pub fn nibbles_to_right_padded_b256(nibbles: &[u8]) -> B256 {
    let mut result = B256::ZERO;
    for (i, nibble) in nibbles.iter().enumerate() {
        if i % 2 == 0 {
            result[i / 2] |= nibble << 4;
        } else {
            result[i / 2] |= nibble;
        }
    }
    result
}

We also don't need compress_nibbles and we can modify full_nibble_path_to_address_hash to:

pub fn full_nibble_path_to_address_hash(key_path: &[u8]) -> B256 {
    if key_path.len() != 64 {
        panic!(
            "Key path should always be 64 bytes long: {}",
            key_path.len()
        );
    }

    nibbles_to_right_padded_b256(key_path)
}
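
A quick sanity check of the expected behavior (a hypothetical test, assuming the function above):

let padded = nibbles_to_right_padded_b256(&[0xa, 0xb, 0xc]);
assert_eq!(padded[0], 0xab); // first two nibbles packed into one byte
assert_eq!(padded[1], 0xc0); // odd trailing nibble left-aligned in its byte
assert_eq!(padded[2], 0x00); // rest of the word is zero padding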

@morph-dev (Collaborator) commented:
Actually, there is one thing, and we can decide if it is a problem or not.

Let's say we have two Filters (slice_count is 512):

  • FilterA { start: 0xab10..00, end: 0xab7f..ff }
  • FilterB { start: 0xab80..00, end: 0xabff..ff }

And let's say that the trie node at path [0xa, 0xb] is a Leaf node, with prefix: [0xc, 0xd, ...] (making the entire path for the data stored inside the leaf node: 0xabcd...).

Intuitively, one would expect that this trie node (since it's a leaf) would be visited only by the TrieWalker that has FilterB, and not be visited by the TrieWalker with FilterA. But it would actually be visited by both of them.

I personally don't think this is an issue. But I wanted to raise awareness and get your opinion as well.

@KolbyML (Member, Author) commented Dec 9, 2024

> Actually, there is one thing, and we can decide if it is a problem or not.
>
> Let's say we have two Filters (slice_count is 512):
>
>   • FilterA { start: 0xab10..00, end: 0xab7f..ff }
>   • FilterB { start: 0xab80..00, end: 0xabff..ff }
>
> And let's say that the trie node at path [0xa, 0xb] is a Leaf node, with prefix: [0xc, 0xd, ...] (making the entire path for the data stored inside the leaf node: 0xabcd...).
>
> Intuitively, one would expect that this trie node (since it's a leaf) would be visited only by the TrieWalker that has FilterB, and not be visited by the TrieWalker with FilterA. But it would actually be visited by both of them.
>
> I personally don't think this is an issue. But I wanted to raise awareness and get your opinion as well.

I made Filter more inclusive because I thought it was simpler to reason about that way, especially while walking diverging paths in a branch node. I don't think including a few extra nodes is an issue here, as you said.
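
A hedged sketch of why both filters visit the node (assumed names, not the PR's exact code): truncating both bounds to the node's prefix length maps FilterA and FilterB onto the same 0xab00..00 value.

use alloy::primitives::U256;

fn is_included(start: U256, end: U256, node_prefix: U256, nibble_count: usize) -> bool {
    let shift = 256 - 4 * nibble_count;
    let partial_start = (start >> shift) << shift; // 0xab00..00 for both filters
    let partial_end = (end >> shift) << shift;     // 0xab00..00 for both filters
    partial_start <= node_prefix && node_prefix <= partial_end
}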

@KolbyML (Member, Author) commented Dec 9, 2024

@morph-dev ready for another review

@morph-dev (Collaborator) left a comment:

:shipit:

/// Create a new filter that includes the whole trie
/// Slice index must be less than slice count or it will panic
pub fn new(slice_index: u16, slice_count: u16) -> Self {
assert!(
morph-dev (Collaborator):

This condition would fail if slice_count is zero (which otherwise seems to be a supported case).
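
One hedged way to keep slice_count == 0 usable (field and parameter names assumed from the diff context) would be to treat it as "no filtering":

pub fn new(slice_index: u16, slice_count: u16) -> Self {
    if slice_count == 0 {
        // A single slice spanning the whole key space.
        return Self { start: U256::ZERO, end: U256::MAX };
    }
    assert!(
        slice_index < slice_count,
        "slice_index must be less than slice_count"
    );
    let slice_size = U256::MAX / U256::from(slice_count) + U256::from(1);
    let start = slice_size * U256::from(slice_index);
    Self { start, end: start.saturating_add(slice_size - U256::from(1)) }
}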

assert_eq!(filter.end_prefix, U256::MAX);
let filter = Filter::random(1);
assert_eq!(filter.start, U256::ZERO);
assert_eq!(filter.end, U256::MAX);
}

#[test]
fn test_is_included() {
morph-dev (Collaborator):

nit: should be renamed to contains

}
}

#[cfg(test)]
mod tests {
use alloy::hex::FromHex;

use super::*;

#[test]
fn test_new_random_filter() {
morph-dev (Collaborator):

nit: names of tests usually don't start with test_ (it's redundant, since they are already inside the tests module)

@Skippyy9 commented Dec 9, 2024

Awesome

@KolbyML KolbyML merged commit 580a72b into ethereum:master Dec 9, 2024
9 checks passed