Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Targeted Fuzzing #482

Open
plotchy opened this issue May 20, 2024 · 2 comments
Open

Targeted Fuzzing #482

plotchy opened this issue May 20, 2024 · 2 comments

Comments

@plotchy
Copy link
Contributor

plotchy commented May 20, 2024

Since the EVM is stateful, and all interesting things onchain occur during a stateful operation, what if we targeted those during the fuzz run?

There is some research around targeting specific operations. In web2 this is around santizers placed by LLVM. A paper like FishFuzz demonstrates how effective this can be.

Since the EVM has very specific operations that can change state, I think tracking those and scheduling testcases to go after those operations (rather than coverage) would be useful.

  1. Create control flow graph of runtime bytecode
  2. Find all state changing operations
// we track all of the pc's that can be targets (anything that can change evm state)
pub const TARGET_OPS: [u8; 7] = [SSTORE, CALL, DELEGATECALL, CALLCODE, SELFDESTRUCT, CREATE, CREATE2];
  1. Create a distance score between all nodes of the CFG. ie: (CFG_block_1, CFG_block_2, score)
  2. When scheduling testcases, prioritize seeds that are closest to unreached targets (Exploring mode)
  3. After a time threshold, switch phase to prioritize seeds that have reached targets (Exploiting mode)

Concerns

Ityfuzz can be used as a test-suite, where it attempts to find lines of code with assert or AssertionFailed("Bug"). By switching scheduling to prioritize state changing events, neither of these assert targets would be prioritized by the scheduler and they would take longer to find

@shouc
Copy link
Contributor

shouc commented May 25, 2024

Sounds interesting and promising! There seems to be no work on distance-based scheduling for stateful ops.

It would be relatively easy to implement in ItyFuzz as you can directly implement a new power schedule like: https://github.com/fuzzland/ityfuzz/blob/master/src/evm/scheduler.rs#L378-L411

@plotchy
Copy link
Contributor Author

plotchy commented May 27, 2024

@shouc

At the moment I've been testing through modifying the PowerScheduler's next() fn to only select from testcases that I've chosen as closest.

I chose to do it this way as you can "prune" the corpus even as it grows. Since corpus is added every time coverage gets better, its gets to be pretty large (like BIGFI_exp.txt hits 300+ in a few mins [32m INFO�[0m [Testcase #0] run time: 0h-1m-30s, clients: 1, corpus: 373, objectives: 0, executions: 269991, exec/sec: 3.015k).

Since next() always iterates from index 0 to last and loops back to 0, it wastes a lot of time on early indexes that are no longer best.

Do you think keeping next() as is and modifying compute() to run the close testcases more is better? I can see there being advantages to running old inputs every so often but would be worried about very large corpuses wasting time

fn next(&mut self, state: &mut Self::State) -> Result<CorpusId, Error> {
if state.corpus().count() == 0 {
Err(Error::empty("No entries in corpus".to_owned()))
} else {
let id = state
.corpus()
.current()
.map(|id| state.corpus().next(id))
.flatten()
.unwrap_or_else(|| state.corpus().first().unwrap());
self.set_current_scheduled(state, Some(id))?;
Ok(id)
}
}
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants