GitHub - FutureComputing4AI/Reverse-Engineering-Function-Search

This repository contains the code used to build the models and run the experiments presented in "Is Function Similarity Over-Engineered? Building a Benchmark". The paper can be found here.

Our paper evaluates five models: jTrans, a GNN from Li et al., a naive multiheaded-attention Transformer encoder, Ghidra's BSim plugin, and REFuSe, a new model introduced in our paper. These models are assessed against five datasets: Assemblage, MOTIF, CommonLibraries, Marcelli Dataset-1, and BinaryCorp.

data
The recipe for building our Assemblage dataset and the code to run our BSim experiments. A guide to reproducing datasets from recipes can be found here, and more details about BSim can be found in the corresponding folder.

data-processing
Preprocess data for experiments.

model-training
Train models on Assemblage data.

model-evaluation
Evaluate models on datasets.

