This file describes how we generate the PyTorch Benchmark Score Version 0 (V0). The goal is to help users and developers understand and reproduce the score.
A complete benchmarking environment consists of three parts: the hardware environment, the environment variables and the standard config YAML.
We use an Amazon EC2 g4dn.metal instance as a self-hosted runner to run the V0 benchmark configuration. Before running the benchmark, we apply a few tunings to the instance to minimize performance variance.
We disable hyperthreading on all CPUs using the following script:
```bash
for cpunum in $(cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | cut -s -d, -f2- | tr ',' '\n' | sort -un)
do
  echo 0 > /sys/devices/system/cpu/cpu$cpunum/online
done
```
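One way to check that the sibling threads are offline is to look at the thread count per core reported by `lscpu` (this verification step is a suggestion, not part of the benchmark setup itself):

```bash
# After offlining SMT siblings, each physical core should report a single hardware thread.
lscpu | grep -i "thread(s) per core"
```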
We isolate the CPUs that run the benchmark by setting the following kernel parameters:
```
isolcpus=24-47,72-95 nohz_full=24-47,72-95
```
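As a sketch of how such parameters are typically applied (the exact boot configuration of the runner is not covered here), on a GRUB-based Linux system they can be appended to the kernel command line and the boot configuration regenerated:

```bash
# Sketch for a GRUB-based system; adjust for the boot loader actually in use.
# Append the isolation parameters to GRUB_CMDLINE_LINUX in /etc/default/grub, e.g.:
#   GRUB_CMDLINE_LINUX="isolcpus=24-47,72-95 nohz_full=24-47,72-95"
# Then regenerate the boot configuration (update-grub on Debian/Ubuntu-style systems) and reboot:
sudo update-grub
sudo reboot
```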
All environment variables that could affect the performance score are defined in .github/scripts/config-v0.env.
For more details, please refer to the env file.
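For example, the variables can be exported into the current shell before launching the benchmark (a generic shell pattern, not a script shipped with the repository):

```bash
# Export every variable defined in the config file into the current shell.
set -a
source .github/scripts/config-v0.env
set +a
```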
The standard config YAML file is stored in the repository. It is generated from repeated runs of the same benchmark setting on pytorch v1.7.0.dev20200626, torchtext 0.8.0.dev20200626, and torchvision 0.6.1.dev20200626. The performance is manually verified to be stable across those runs. We pick one execution at random from the repeated runs as the standard execution, and the standard config YAML is a summary of it.
First, the YAML defines the models that are tested in the standard execution. Below is the complete list of the models we test in V0:
- pytorch_mobilenet_v3 (succeeded by mobilenet_v3_large in v1)
- yolov3
- Background_Matting
- attention_is_all_you_need_pytorch
- BERT_pytorch
- fastNLP
- dlrm
- LearningToPaint
- moco
- demucs
- pytorch_struct
Second, the YAML defines the performance score of the standard execution to be 1000. All other V0 scores are relative to it. For example, if another benchmark execution's score is 900, its performance is 10% lower than that of the standard execution.
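As a small illustration of this relative interpretation (the aggregation that turns per-model measurements into a single score is defined by the benchmark suite itself and is not reproduced here):

```bash
# Illustration only: a V0 score is read relative to the 1000-point standard execution.
standard_score=1000
observed_score=900
# Prints 0.9000, i.e. 10% lower performance than the standard execution.
echo "scale=4; $observed_score / $standard_score" | bc
```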