This repository contains tooling to help automate benchmarking, currently focused on the regalloc model. The benchmarking tooling works with the LLVM test suite and the Chromium performance tests.
Make sure you have local checkouts of the repositories that are used:
```bash
cd ~/
git clone https://github.com/llvm/llvm-project
git clone https://github.com/google/ml-compiler-opt
```
And for benchmarking using the llvm-test-suite:
```bash
git clone https://github.com/llvm/llvm-test-suite
```
To acquire the Chromium source code, please see the Chromium documentation and follow it up by running hooks. You don't need to set up any builds, as the `benchmark_chromium.py` script does that automatically.
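For reference, here is a minimal sketch of what that usually looks like, assuming depot_tools is already installed and on your PATH (the directory layout is only an example; follow the Chromium documentation for the authoritative steps):

```bash
# Create a directory for the Chromium checkout and fetch the source
# without running hooks yet.
mkdir ~/chromium && cd ~/chromium
fetch --nohooks chromium
# Run the hooks to finish setting up the checkout.
cd src
gclient runhooks
```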
Make sure that you have a local copy of libtensorflow:
```bash
mkdir ~/tensorflow
wget --quiet https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-cpu-linux-x86_64-1.15.0.tar.gz
tar xfz libtensorflow-cpu-linux-x86_64-1.15.0.tar.gz -C ~/tensorflow
```
And make sure you have installed all of the necessary Python packages:

```bash
pipenv sync --categories "packages ci" --system
```
You can use the `benchmark_llvm_test_suite.py` script to automatically configure everything to run a benchmark using the latest released regalloc model:
```bash
cd ~/ml-compiler-opt
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/benchmark_llvm_test_suite.py \
  --advisor=release \
  --compile_llvm \
  --compile_testsuite \
  --llvm_build_path=~/llvm-build \
  --llvm_source_path=~/llvm-project/llvm \
  --llvm_test_suite_path=~/llvm-test-suite \
  --llvm_test_suite_build_path=~/llvm-test-suite/build \
  --nollvm_use_incremental \
  --model_path="download" \
  --output_path=./results.json \
  --perf_counter=INSTRUCTIONS \
  --perf_counter=MEM_UOPS_RETIRED:ALL_LOADS \
  --perf_counter=MEM_UOPS_RETIRED:ALL_STORES
```
This will write the test results to `./results.json`, which can then be used later on for downstream processing and data analysis.
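If you want a quick look at what was written, you can pretty-print the file (the exact JSON schema is not documented here, so inspect it before relying on specific fields):

```bash
python3 -m json.tool results.json | less
```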
An explanation of the flags:
* `--advisor` - Specifies the register allocation eviction advisor that is used by LLVM when compiling the test suite. It can be set to either `release` or `default`, depending upon whether you want to test the model specified by the `--model_path` flag or the default register allocation eviction behavior to grab a baseline measurement (see the baseline sketch after this list).
* `--compile_llvm` - A boolean flag (can also be set to `--nocompile_llvm`) that specifies whether or not to compile LLVM.
* `--compile_testsuite` - Specifies whether or not to compile the test suite.
* `--llvm_build_path` - The path to place the LLVM build in. This directory will be deleted and remade if the `--nollvm_use_incremental` flag is set.
* `--llvm_source_path` - The path to the LLVM source. This cannot be the root path of the LLVM monorepo; it specifically needs to be the path to the `llvm` subdirectory within that repository.
* `--llvm_test_suite_path` - The path to the llvm-test-suite.
* `--llvm_test_suite_build_path` - The path to place the llvm-test-suite build in. Behaves the same way as the LLVM build path.
* `--llvm_use_incremental` - Whether or not to do an incremental build of LLVM. If you already have all the correct compilation flags set up for running MLGO with LLVM, you can set this flag and you should get an extremely fast LLVM build, as the only thing changing is the release-mode regalloc model.
* `--model_path` - The path to the regalloc model. If this is set to "download", it will automatically grab the latest model from the ml-compiler-opt GitHub repository. If it is set to "" or "autogenerate", it will use the autogenerated model.
* `--output_path` - The path to the output file (in JSON format).
* `--perf_counter` - Can be specified multiple times and takes performance counters in the libpfm format. Only up to three performance counters can be specified due to underlying limitations in Google Benchmark.
* `--tests_to_run` - Specifies the LLVM microbenchmarks to run, relative to the microbenchmarks library in the LLVM test suite build directory. The default values for this flag should be pretty safe and produce good results.
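For example, to grab a baseline measurement against the default eviction heuristic, you could reuse the invocation above with `--advisor=default`, an incremental LLVM build, and a separate output file. This is only a sketch, and assumes you have already completed the full (non-incremental) build once; the output file name is arbitrary:

```bash
cd ~/ml-compiler-opt
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/benchmark_llvm_test_suite.py \
  --advisor=default \
  --compile_llvm \
  --compile_testsuite \
  --llvm_build_path=~/llvm-build \
  --llvm_source_path=~/llvm-project/llvm \
  --llvm_test_suite_path=~/llvm-test-suite \
  --llvm_test_suite_build_path=~/llvm-test-suite/build \
  --llvm_use_incremental \
  --model_path="download" \
  --output_path=./baseline.json \
  --perf_counter=INSTRUCTIONS
```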
You can also get detailed information on each flag by passing just the `--help` flag to the script. This also shows the default values; many of the flags in the example above are simply set to their defaults.
You can use the `benchmark_chromium.py` script to run Chromium benchmarks based on test description JSON files.
Example:
```bash
cd ~/ml-compiler-opt
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/benchmark_chromium.py \
  --advisor=release \
  --chromium_build_path=./out/Release \
  --chromium_src_path=~/chromium/src \
  --compile_llvm \
  --compile_tests \
  --depot_tools_path=~/depot_tools \
  --llvm_build_path=~/llvm-build \
  --llvm_source_path=~/llvm-project/llvm \
  --nollvm_use_incremental \
  --model_path="download" \
  --num_threads=32 \
  --output_file=./output-chromium-testing.json \
  --perf_counters=mem_uops_retired.all_loads \
  --perf_counters=mem_uops_retired.all_stores
```
Several of the flags here are the same as or very similar to the flags for the LLVM test suite, so only the flags unique to the Chromium script are highlighted here.
* `--chromium_build_path` - The path to place the Chromium build in. This path is relative to the Chromium source path.
* `--chromium_src_path` - The path to the root of the Chromium repository (i.e. the `src/` directory where you ran `fetch --nohooks`).
* `--depot_tools_path` - The path to your depot_tools checkout.
* `--num_threads` - Enables parallelism for running the tests. Use this with caution, as it can add a lot of noise to your benchmarks depending upon what specifically you are doing.
* `--perf_counters` - Similar to the LLVM test suite perf counters, but instead of being in the libpfm format, they are perf counters as listed in `perf list` (see the check after this list).
* `--test_description` - Can be declared multiple times if you have custom test descriptions that you want to run, but the default works well, covers a broad portion of the codebase, and has been specifically designed to minimize run-to-run variability.
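To check that a counter name exists on your machine before passing it to `--perf_counters`, you can grep the output of `perf list` (this assumes the Linux `perf` tool is installed):

```bash
# Look up the memory uops retired counters used in the example above.
perf list | grep -i mem_uops_retired
```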
To generate custom test descriptions for gtest executables (i.e. the test executables that are used by Chromium), you can use the `list_gtests.py` script. This script doesn't need to be used for running the Chromium performance tests unless you are interested in adjusting the test descriptions currently available in `/compiler_opt/benchmark/chromium_test_descriptions` or in using tests from a different project that also uses gtest.
Example:
```bash
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/list_gtests.py \
  --gtest_executable=/path/to/executable \
  --output_file=test.json \
  --output_type=json
```
Flags:
* `--gtest_executable` - The path to the gtest executable from which to extract a list of tests.
* `--output_file` - The path to the file to output all of the extracted test names to.
* `--output_type` - Either JSON or default. JSON packages everything nicely into a JSON format, and default just dumps the test names separated by line breaks (see the example after this list).
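For instance, to dump the test list of a Chromium gtest binary in the plain default format and count how many tests it exposes (the executable path and output file name here are just examples):

```bash
cd ~/ml-compiler-opt
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/list_gtests.py \
  --gtest_executable=~/chromium/src/out/Release/browser_tests \
  --output_file=./browser_tests_list.txt \
  --output_type=default
# Each test name is on its own line, so counting lines counts tests.
wc -l ./browser_tests_list.txt
```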
There is also a utility, `filter_tests.py`, that allows for filtering the individual tests available in a test executable, making sure that they exist (sometimes tests that show up when listing all the gtests don't run when passed through `--gtest_filter`) and that they don't fail (some tests require setups including GUIs/GPUs).
Example:
```bash
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/filter_tests.py \
  --input_tests=./compiler_opt/benchmark/chromium_test_descriptions/browser_tests.json \
  --output_tests=./browser_tests_filtered.json \
  --num_threads=32 \
  --executable_path=/chromium/src/out/Release/browser_tests
```
Flags:
* `--input_tests` - The path to the test description generated by `list_gtests.py` that should be filtered.
* `--output_tests` - The path where the new filtered test suite description should be placed (see the sketch after this list).
* `--num_threads` - The number of threads to use when running tests to check whether they exist and whether or not they pass.
* `--executable_path` - The path to the gtest executable that the test suite description corresponds to.
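Once filtered, the resulting description can be fed back into `benchmark_chromium.py` through the `--test_description` flag. This is only a sketch reusing the paths from the earlier examples; adjust the remaining flags to your setup:

```bash
cd ~/ml-compiler-opt
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/benchmark_chromium.py \
  --advisor=release \
  --chromium_build_path=./out/Release \
  --chromium_src_path=~/chromium/src \
  --compile_llvm \
  --compile_tests \
  --depot_tools_path=~/depot_tools \
  --llvm_build_path=~/llvm-build \
  --llvm_source_path=~/llvm-project/llvm \
  --nollvm_use_incremental \
  --model_path="download" \
  --num_threads=32 \
  --output_file=./output-chromium-filtered.json \
  --test_description=./browser_tests_filtered.json
```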
TODO(boomanaiden154): investigate why some of the tests listed by the executable later can't be found when using `--gtest_filter`.
To compare benchmark runs, you can use the `benchmark_report_converter.py` script. Let's say you have two benchmark runs (they need to be done with the same set of tests), `baseline.json` and `experimental.json`, from the LLVM test suite benchmarking script with the performance counter `INSTRUCTIONS` enabled. You can get a summary comparison with the following command:
```bash
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/benchmark_report_converter.py \
  --base=baseline.json \
  --exp=experimental.json \
  --counters=INSTRUCTIONS \
  --out=reports.csv
```
This will create `reports.csv` with a line for each test that contains information about the differences in performance counters for that specific test.
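For a quick manual look at the comparison, you can render the CSV as an aligned table (the exact columns depend on which counters you requested):

```bash
column -s, -t < reports.csv | less -S
```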