# PyTorch Benchmark Score V1

This file describes how we generate the PyTorch Benchmark Score Version 1. The goal is to help users and developers understand the score and be able to reproduce it.

V1 uses the same hardware environment as V0, but it covers far more models and test configurations.

## Requirements

The V1 benchmark suite uses an experimental JIT feature, optimize_for_inference, introduced on May 22, 2021. Therefore, it can't run on earlier versions of PyTorch.
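Since the suite depends on this API, a run can verify it is available before benchmarking. The snippet below is an illustrative sketch (the `TinyModel` module is a made-up stand-in, not part of the suite) showing the capability check and the `torch.jit.optimize_for_inference` call:

```python
import torch

# Hypothetical stand-in model; the real suite uses the 47 models listed below.
class TinyModel(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x)

# The V1 suite requires torch.jit.optimize_for_inference (added May 22, 2021);
# fail fast on PyTorch builds that predate it.
if not hasattr(torch.jit, "optimize_for_inference"):
    raise RuntimeError("This PyTorch build is too old for the V1 benchmark suite")

# optimize_for_inference freezes the scripted module (eval mode required)
# and applies inference-only graph optimizations before timing it.
scripted = torch.jit.script(TinyModel().eval())
optimized = torch.jit.optimize_for_inference(scripted)
out = optimized(torch.ones(3))
```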

## Coverage

The V1 suite covers 47 models from popular machine learning domains. The complete list of models is as follows:

| Model name | Category |
| --- | --- |
| BERT_pytorch | NLP |
| Background_Matting | COMPUTER VISION |
| LearningToPaint | REINFORCEMENT LEARNING |
| alexnet | COMPUTER VISION |
| attention_is_all_you_need_pytorch | NLP |
| demucs | OTHER |
| densenet121 | COMPUTER VISION |
| dlrm | RECOMMENDATION |
| drq | REINFORCEMENT LEARNING |
| fastNLP | NLP |
| hf_Albert | NLP |
| hf_Bert | NLP |
| hf_BigBird | NLP |
| hf_DistilBert | NLP |
| hf_GPT2 | NLP |
| hf_Longformer | NLP |
| hf_T5 | NLP |
| maml | OTHER |
| maml_omniglot | OTHER |
| mnasnet1_0 | COMPUTER VISION |
| mobilenet_v2 | COMPUTER VISION |
| mobilenet_v3_large | COMPUTER VISION |
| moco | OTHER |
| nvidia_deeprecommender | RECOMMENDATION |
| opacus_cifar10 | OTHER |
| pyhpc_equation_of_state | OTHER |
| pyhpc_isoneutral_mixing | OTHER |
| pytorch_CycleGAN_and_pix2pix | COMPUTER VISION |
| pytorch_stargan | COMPUTER VISION |
| pytorch_struct | OTHER |
| resnet18 | COMPUTER VISION |
| resnet50 | COMPUTER VISION |
| resnet50_quantized_qat | COMPUTER VISION |
| resnext50_32x4d | COMPUTER VISION |
| shufflenet_v2_x1_0 | COMPUTER VISION |
| soft_actor_critic | REINFORCEMENT LEARNING |
| speech_transformer | SPEECH |
| squeezenet1_1 | COMPUTER VISION |
| timm_efficientnet | COMPUTER VISION |
| timm_nfnet | COMPUTER VISION |
| timm_regnet | COMPUTER VISION |
| timm_resnest | COMPUTER VISION |
| timm_vision_transformer | COMPUTER VISION |
| timm_vovnet | COMPUTER VISION |
| tts_angular | SPEECH |
| vgg16 | COMPUTER VISION |
| yolov3 | COMPUTER VISION |

## Reference Config YAML

The reference config YAML file is stored here. It is generated by repeated runs of the same benchmark setting on pytorch v1.10.0.dev20210612, torchtext 0.10.0.dev20210612, and torchvision 0.11.0.dev20210612. We chose the earliest PyTorch nightly version that has a stable implementation of the optimize_for_inference feature. We then picked one of the repeated V1 benchmark runs at random as the reference execution and summarized its performance metrics in the reference config YAML.

We have also manually verified that the maximum variance of any single test in the V1 suite is below 7%. The V1 nightly CI job raises a signal if any test's performance metric changes by more than the 7% threshold, or if the overall score changes by more than the 1% threshold.
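The CI rule above can be sketched as a simple comparison against the reference metrics. This is an illustrative sketch, not the actual CI code: the dictionary layout (`"tests"` and `"score"` keys) is an assumption made for the example.

```python
def collect_signals(reference, current, test_threshold=0.07, score_threshold=0.01):
    """Return the names of tests whose metric drifts past the per-test
    threshold (7%), plus a marker if the overall score moves past the
    score threshold (1%). Sketch of the V1 nightly CI rule; the data
    layout here is assumed, not the real schema."""
    signals = []
    for name, ref_val in reference["tests"].items():
        cur_val = current["tests"][name]
        if abs(cur_val - ref_val) / ref_val > test_threshold:
            signals.append(name)
    ref_score = reference["score"]
    if abs(current["score"] - ref_score) / ref_score > score_threshold:
        signals.append("overall_score")
    return signals
```

For example, a test whose latency moves from 1.00 to 1.10 (a 10% change) trips the 7% per-test threshold, while a score drift from 1000 to 1005 (0.5%) stays under the 1% score threshold.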

We define the V1 score of the reference execution to be 1000. All other V1 scores are relative to the performance of the reference execution. For example, if another V1 benchmark execution scores 900, its performance is 10% slower than the reference execution.
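One common way to realize such a relative score is a geometric mean of per-test speedups against the reference, scaled so the reference execution itself scores 1000. The sketch below illustrates that idea; it is an assumption for exposition, not the documented V1 formula, whose exact weighting lives in the reference config YAML.

```python
import math

def relative_score(reference_latencies, measured_latencies, base=1000.0):
    """Sketch of a relative benchmark score: the geometric mean of
    per-test speedups (reference latency / measured latency), scaled
    so that the reference execution scores exactly `base`.
    Illustrative only; the real V1 weighting may differ."""
    ratios = [ref / measured_latencies[name]
              for name, ref in reference_latencies.items()]
    geomean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    return base * geomean
```

Under this sketch, an execution whose tests all run at half the reference speed (double the latency) scores 500, matching the "relative to the reference" reading above.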