
Benchmarks across Deep Learning Frameworks in Julia and Python


avik-pal/DeepLearningBenchmarks


Popular Computer Vision Model Benchmarks

Input Dimensions

  1. Batch Size = 8, Image = 3 x 224 x 224 (default when nothing else is specified, and for all CPU runs)
  2. Batch Size = 4, Image = 3 x 224 x 224, used for:
    • Resnet 101
    • Resnet 152
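
The README does not show the timing code itself. As a minimal sketch of how a forward pass over a batch of this shape might be timed, the helper below uses only the Python standard library. The names `time_call`, `sync`, and `fake_forward` are hypothetical and not taken from the repository; the workload is a stand-in for a real model call. On a GPU, kernels are typically launched asynchronously, so a blocking call (for example `torch.cuda.synchronize` in PyTorch) would be passed as `sync` before the clock is read.

```python
import statistics
import time


def time_call(fn, *, warmup=3, repeats=10, sync=None):
    """Return the median wall-clock time of fn() in seconds.

    sync is an optional zero-argument callable that blocks until queued
    device work has finished (hypothetical hook; leave as None on the CPU).
    """
    def run_once():
        if sync is not None:
            sync()  # make sure earlier device work is not counted
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for asynchronous work before stopping the clock
        return time.perf_counter() - start

    for _ in range(warmup):  # warm-up runs are discarded
        run_once()
    return statistics.median(run_once() for _ in range(repeats))


# Default input described above: batch of 8 images, each 3 x 224 x 224.
BATCH, CHANNELS, HEIGHT, WIDTH = 8, 3, 224, 224
n_values = BATCH * CHANNELS * HEIGHT * WIDTH


def fake_forward():
    # Placeholder for model(input); touches one integer per input value.
    return sum(range(n_values))


elapsed = time_call(fake_forward, warmup=1, repeats=3)
print(f"{n_values} input values, median {elapsed:.4f} s")
```

Taking the median over several runs after a warm-up is one common choice; the README does not say which statistic its numbers report.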

GPU USED --- Titan 1080Ti 12 GB

| Model | Framework | Forward Pass | Backward Pass | Total Time | Inference |
|---|---|---|---|---|---|
| VGG16 | Pytorch 0.4.1 | 0.0245 s | 0.0606 s | 0.0852 s | 0.0234 s |
| | Flux 0.6.8+ | 0.0287 s | 0.0760 s | 0.1047 s | 0.0288 s |
| VGG16 BN | Pytorch 0.4.1 | 0.0271 s | 0.0672 s | 0.0943 s | 0.0273 s |
| | Flux 0.6.8+ | 0.0333 s | 0.0818 s | 0.1151 s | 0.0327 s |
| VGG19 | Pytorch 0.4.1 | 0.0281 s | 0.0741 s | 0.1021 s | 0.0280 s |
| | Flux 0.6.8+ | 0.0355 s | 0.0923 s | 0.1278 s | 0.0356 s |
| VGG19 BN | Pytorch 0.4.1 | 0.0321 s | 0.0812 s | 0.1134 s | 0.0325 s |
| | Flux 0.6.8+ | 0.0377 s | 0.0965 s | 0.1342 s | 0.0371 s |
| Resnet18 | Pytorch 0.4.1 | 0.0064 s | 0.0125 s | 0.0190 s | 0.0050 s |
| | Flux 0.6.8+ | 0.0079 s | 0.0218 s | 0.0297 s | 0.0079 s |
| Resnet34 | Pytorch 0.4.1 | 0.0092 s | 0.0216 s | 0.0307 s | 0.0092 s |
| | Flux 0.6.8+ | 0.0137 s | 0.0313 s | 0.0450 s | 0.0151 s |
| Resnet50 | Pytorch 0.4.1 | 0.0155 s | 0.0351 s | 0.0506 s | 0.0152 s |
| | Flux 0.6.8+ | 0.0205 s | 0.1795 s | 0.2000 s | - |
| Resnet101 | Pytorch 0.4.1 | 0.0297 s | 0.0379 s | 0.0676 s | 0.0298 s |
| | Flux 0.6.8+ | 0.0215 s | 0.0616 s | 0.0831 s | 0.0208 s |
| Resnet152 | Pytorch 0.4.1 | 0.0431 s | 0.0534 s | 0.0965 s | 0.0429 s |
| | Flux 0.6.8+ | 0.0308 s | 0.0807 s | 0.1115 s | 0.0298 s |
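
To make the GPU numbers easier to compare, the following sketch computes the ratio of Flux total time to PyTorch total time for each model. The values are copied from the GPU table above; the variable names are illustrative only, and the ratio is just one possible summary, not a metric the README itself reports.

```python
# Total time (forward + backward) in seconds, copied from the GPU table:
# model -> (PyTorch 0.4.1, Flux 0.6.8+)
gpu_total = {
    "VGG16": (0.0852, 0.1047),
    "VGG16 BN": (0.0943, 0.1151),
    "VGG19": (0.1021, 0.1278),
    "VGG19 BN": (0.1134, 0.1342),
    "Resnet18": (0.0190, 0.0297),
    "Resnet34": (0.0307, 0.0450),
    "Resnet50": (0.0506, 0.2000),
    "Resnet101": (0.0676, 0.0831),
    "Resnet152": (0.0965, 0.1115),
}

# Ratio above 1 means Flux took longer than PyTorch for that model.
ratios = {model: flux_s / pytorch_s
          for model, (pytorch_s, flux_s) in gpu_total.items()}

for model, ratio in ratios.items():
    print(f"{model:10s} Flux / PyTorch total time = {ratio:.2f}x")
```

Resnet101 and Resnet152 were run with batch size 4, so their absolute times are not directly comparable with the other rows, although the per-model ratio still is.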

CPU USED --- Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz

| Model | Framework | Forward Pass | Backward Pass | Total Time | Inference |
|---|---|---|---|---|---|
| VGG16 | Pytorch 0.4.1 | 6.6024 s | 9.4336 s | 16.036 s | 6.4216 s |
| | Flux 0.6.8+ | 10.458 s | 10.245 s | 20.703 s | 10.111 s |
| VGG16 BN | Pytorch 0.4.1 | 7.0793 s | 9.0536 s | 16.132 s | 6.7909 s |
| | Flux 0.6.8+ | 29.633 s | 18.649 s | 49.282 s | 24.047 s |
| VGG19 | Pytorch 0.4.1 | 8.3075 s | 10.899 s | 19.207 s | 8.0593 s |
| | Flux 0.6.8+ | 12.226 s | 12.457 s | 24.683 s | 12.029 s |
| VGG19 BN | Pytorch 0.4.1 | 8.7794 s | 12.739 s | 21.519 s | 8.4044 s |
| | Flux 0.6.8+ | 28.518 s | 21.464 s | 49.982 s | 22.649 s |

Individual Layer Benchmarks

Layer Descriptions

  1. Conv3x3/1 = Conv2d, 3x3 Kernel, 1x1 Padding, 1x1 Stride
  2. Conv5x5/1 = Conv2d, 5x5 Kernel, 2x2 Padding, 1x1 Stride
  3. Conv3x3/2 = Conv2d, 3x3 Kernel, 1x1 Padding, 2x2 Stride
  4. Conv5x5/2 = Conv2d, 5x5 Kernel, 2x2 Padding, 2x2 Stride
  5. Dense = 1024 => 512
  6. BatchNorm = BatchNorm2d
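
The layer descriptions above determine each layer's output size. As a small worked example, the sketch below applies the standard convolution output-size formula (dilation 1) to the four convolution configurations. The input size of 224 is an assumption borrowed from the model benchmarks, since the README does not state the input used for the layer benchmarks, and the Dense parameter count assumes a bias term is present.

```python
def conv_output_size(size, kernel, padding, stride):
    """Spatial output size of a convolution with dilation 1."""
    return (size + 2 * padding - kernel) // stride + 1


# Configurations taken from the layer descriptions above.
conv_layers = {
    "Conv3x3/1": dict(kernel=3, padding=1, stride=1),
    "Conv5x5/1": dict(kernel=5, padding=2, stride=1),
    "Conv3x3/2": dict(kernel=3, padding=1, stride=2),
    "Conv5x5/2": dict(kernel=5, padding=2, stride=2),
}

INPUT_SIZE = 224  # assumption: not stated for the layer benchmarks

output_sizes = {name: conv_output_size(INPUT_SIZE, **cfg)
                for name, cfg in conv_layers.items()}
for name, out in output_sizes.items():
    print(f"{name}: {INPUT_SIZE} x {INPUT_SIZE} -> {out} x {out}")

# Dense 1024 => 512: weight matrix plus bias vector (bias assumed present).
dense_params = 1024 * 512 + 512
print(f"Dense parameters: {dense_params}")
```

With these settings the stride-1 layers preserve the spatial size and the stride-2 layers halve it, which is consistent with the stride-2 layers having lower forward times in the table that follows.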

GPU USED --- Titan 1080Ti 12 GB

| Layer | Framework | Forward Pass | Backward Pass | Total Time |
|---|---|---|---|---|
| Conv3x3/1 | Pytorch 0.4.1 | 0.2312 ms | 0.5359 ms | 0.7736 ms |
| | Flux 0.6.8+ | 0.1984 ms | 0.7640 ms | 0.9624 ms |
| Conv5x5/1 | Pytorch 0.4.1 | 0.2667 ms | 0.5345 ms | 0.8299 ms |
| | Flux 0.6.8+ | 0.2065 ms | 0.8075 ms | 1.014 ms |
| Conv3x3/2 | Pytorch 0.4.1 | 0.1170 ms | 0.2203 ms | 0.3376 ms |
| | Flux 0.6.8+ | 0.0927 ms | 0.5988 ms | 0.6915 ms |
| Conv5x5/2 | Pytorch 0.4.1 | 0.1233 ms | 0.2162 ms | 0.3407 ms |
| | Flux 0.6.8+ | 0.0941 ms | 0.6515 ms | 0.7456 ms |
| Dense | Pytorch 0.4.1 | 0.0887 ms | 0.1523 ms | 0.2411 ms |
| | Flux 0.6.8+ | 0.0432 ms | 0.2044 ms | 0.2476 ms |
| BatchNorm | Pytorch 0.4.1 | 0.1096 ms | 0.1999 ms | 0.3095 ms |
| | Flux 0.6.8+ | 0.2211 ms | 0.2849 ms | 0.5060 ms |
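
When reading the layer table, it can help to check how the Total Time column relates to the other two. The sketch below, using values copied from the table above, computes the gap between the reported total and the sum of the forward and backward times. The names are illustrative and not from the repository.

```python
# (forward, backward, total) in milliseconds, copied from the layer table.
layer_times = {
    ("Conv3x3/1", "PyTorch"): (0.2312, 0.5359, 0.7736),
    ("Conv3x3/1", "Flux"): (0.1984, 0.7640, 0.9624),
    ("Conv5x5/1", "PyTorch"): (0.2667, 0.5345, 0.8299),
    ("Conv5x5/1", "Flux"): (0.2065, 0.8075, 1.014),
    ("Conv3x3/2", "PyTorch"): (0.1170, 0.2203, 0.3376),
    ("Conv3x3/2", "Flux"): (0.0927, 0.5988, 0.6915),
    ("Conv5x5/2", "PyTorch"): (0.1233, 0.2162, 0.3407),
    ("Conv5x5/2", "Flux"): (0.0941, 0.6515, 0.7456),
    ("Dense", "PyTorch"): (0.0887, 0.1523, 0.2411),
    ("Dense", "Flux"): (0.0432, 0.2044, 0.2476),
    ("BatchNorm", "PyTorch"): (0.1096, 0.1999, 0.3095),
    ("BatchNorm", "Flux"): (0.2211, 0.2849, 0.5060),
}

# Gap between the reported total and forward + backward, to 4 decimals.
gaps = {key: round(total - (fwd + bwd), 4)
        for key, (fwd, bwd, total) in layer_times.items()}

for (layer, framework), gap in gaps.items():
    print(f"{layer:10s} {framework:8s} total - (forward + backward) = {gap:+.4f} ms")
```

In the Flux rows the total matches the sum to four decimals, while most PyTorch totals are slightly larger than the sum. One possible reading, which is an inference and not something the README states, is that the PyTorch totals were timed as a separate combined run.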

NOTE

To reproduce the benchmarks, check out Flux 0.6.8+ at avik-pal/cudnn_batchnorm and CuArrays master. BatchNorm on the GPU is broken on Flux 0.6.8+ master, so the benchmarks cannot be run against it.
