FutureComputing4AI/Reverse-Engineering-Function-Search

This repository contains the code used to build the models and run the experiments presented in "Is Function Similarity Over-Engineered? Building a Benchmark". The paper can be found here.

Our paper evaluates five models: jTrans, a GNN from Li et al., a naive multi-headed-attention Transformer encoder, Ghidra's BSim plugin, and REFuSe, a new model introduced in our paper. These models are assessed on five datasets: Assemblage, MOTIF, CommonLibraries, Marcelli Dataset-1, and BinaryCorp.

data
Contains the recipe for building our Assemblage dataset and the code to run our BSim experiments. A guide to reproducing datasets from recipes can be found here, and more details about BSim can be found in the corresponding folder.

data-processing
Preprocess data for the experiments.

model-training
Train models on Assemblage data.

model-evaluation
Evaluate models on the datasets listed above.
