-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add Filter to TrieWalker #1598
Conversation
84dba35
to
2568b79
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few minor comments. Nothing deal breaking but I would say they are nice improvements.
}; | ||
} | ||
|
||
let slice_size = U256::MAX / U256::from(slice_count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this logic is accurate. Let's see that happens if slice_count
is power of 2 (most likely value to be used), e.g: 16.
Because U256::MAX = 0xffff..ff
, then slice_size = 0x0fff..ff
, leading to following:
- first slice would be
[ 0x0000..00 - 0x0fff..fe]
- second slice would be
[0x0ffff..ff - 0x1fff..fd]
- third slice would be
[0x1fff..fe - 0x2fff..fc]
- ...
- last slice would be
[0xefff..f1 - 0xffff..ff]
This would make last slice bigger, than the rest, but that's not the main issue. The main issue is that slices break in unexpected places, and hexary trie is not split in 16 equal parts as expected.
To fix it, I think we should do: let slice_size = U256::MAX / U256::from(slice_count) + 1
.
This works perfectly when slice_count
is power of two.
If slice_count
is not power of two, we know that some slices will be bigger than the rest, and trie will be broken in weird way anyway.
In your implementation, the last one would be bigger than the rest and in mine it would be smaller than the rest. And it both cases, the difference shouldn't be more than slice_count
. So I see both of them as equally good solutions.
} | ||
|
||
impl Filter { | ||
pub fn new_random_filter(slice_count: u16) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would let the caller control which slice is being created. Meaning, I would change this function to:
pub fn new(slice_index: u16, slice_count: u16) -> Self {
assert!(slice_index < slice_count, "...");
...
}
This way, we can also add more tests and make sure that slice splitting works correctly for non-trivial cases (and prevent the issue I explained in the other comment).
At least, we should have a function like this. And if you want to keep new_random_filter
, then we can do something like:
pub fn random(slice_count) -> Self {
Self::new(thread_rng().gen_range(0..slice_count), slice_count)
}
assert!(filter.is_included(&[0x1, 0x5, 0x5])); | ||
assert!(!filter.is_included(&[0x1, 0x5, 0x4])); | ||
assert!(filter.is_included(&[0x2])); | ||
assert!(filter.is_included(&[0x3])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would add one more example:
assert!(!filter.is_included(&[0x3, 0x0, 0x1]));
} | ||
|
||
fn partial_prefix(prefix: U256, shift_amount: usize) -> B256 { | ||
prefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to do one of the following:
prefix | (U256::MAX << shift_amount)
- I think this is the cleanest- I would go one step further and say that
shift_amount
is not clear what it means. Instead, I would use opposite value: -
let prefix_length = 4 * path.len(); let partial_prefix_mask = !(U256::MAX >> prefix_length); let partial_start_prefix = self.start_prefix & partial_prefix_mask; let partial_end_prefix = self.end_prefix & partial_prefix_mask;
- I would go one step further and say that
(prefix >> shift_amount) << shift_amount
- same idea as yours, but simpler
And in either of those cases, this is simple enough and I would inline it below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I choose 2 as it is a simpler to understand
|
||
#[derive(Debug, Clone)] | ||
pub struct Filter { | ||
start_prefix: U256, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: why do we call this _prefix
? Wouldn't start
and end
be sufficient and more correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
trin-execution/src/utils.rs
Outdated
B256::from_slice(&compress_nibbles(key_path)) | ||
} | ||
|
||
pub fn partial_nibble_path_to_right_padded_b256(partial_nibble_path: &[u8]) -> B256 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big fan of converting them to Vec
and then to B256
.
How about:
pub fn nibbles_to_right_padded_b256(nibbles: &[u8]) -> B256 {
let mut result = B256::ZERO;
for (i, nibble) in nibbles.iter().enumerate() {
if i % 2 == 0 {
result[i / 2] |= nibble << 4;
} else {
result[i / 2] |= nibble;
};
}
result
}
We also don't need compress_nibbles
and we can modify full_nibble_path_to_address_hash
to:
pub fn full_nibble_path_to_address_hash(key_path: &[u8]) -> B256 {
if key_path.len() != 64 {
panic!(
"Key path should always be 64 bytes long: {}",
key_path.len()
);
}
nibbles_to_right_padded_b256(key_path)
}
Actually, there is one thing, and we can decide if it is a problem or not. Let's say we have two Filters (
And let's say that trie node at path Intuitively, one would expect that this trie node (since it's leaf) would be visited only by TrieWalker that has FilterB, and not be visited by TrieWalker with FilterA. But it would actually be visited by both of them. I personally don't think this is an issue. But I wanted to raise awareness and get your opinion as well. |
I made Filter more inclusive, because I thought it was simpler to reason about that way. Especially well walking diverging paths in the branch node. I don't think including a few extra nodes is a issue here as you said. |
@morph-dev ready for another review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Create a new filter that includes the whole trie | ||
/// Slice index must be less than slice count or it will panic | ||
pub fn new(slice_index: u16, slice_count: u16) -> Self { | ||
assert!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition would fail if slice_count
is zero (which seems to be supported case otherwise).
assert_eq!(filter.end_prefix, U256::MAX); | ||
let filter = Filter::random(1); | ||
assert_eq!(filter.start, U256::ZERO); | ||
assert_eq!(filter.end, U256::MAX); | ||
} | ||
|
||
#[test] | ||
fn test_is_included() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should be renamed to contains
} | ||
} | ||
|
||
#[cfg(test)] | ||
mod tests { | ||
use alloy::hex::FromHex; | ||
|
||
use super::*; | ||
|
||
#[test] | ||
fn test_new_random_filter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: names of tests usually don't start with test_
(it's redundant since they are already inside tests
module)
Awesome |
What was wrong?
Walking a whole state trie is very time consuming
How was it fixed?
By making it so you can filter the trie to only walk a slice of it, this PR will be used in my followup PR, which adds a random walking state snapshot mode.
This mode will be useful for the random slice state snapshot bridge mode I will be adding