Skip to content

CUDA implementation of the Blocked Floyd Warshall All pairs shortest path graph algorithm

License

Notifications You must be signed in to change notification settings

MTB90/cuda-floyd_warshall

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Cuda Floyd Warshall implementation

CUDA implementation of the Blocked Floyd-Warshall All pairs shortest path graph algorithm based on article: "A Multi-Stage CUDA Kernel for Floyd-Warshall" (Ben Lund, Justin W. Smith)

Tested:

Hardware (MSI GP72 7RE):
  1. Processor: Intel(R) Core(TM) i7-7700HQ
  2. GPU: GTX 1050 Ti 2GB RAM
  3. RAM: 8GB RAM
Environment:
  1. System: Ubuntu 17.10
  2. NVCC: 9.1
  3. CC: 6.1

Performance results:

|V||E|fw.cpp fw-cuda.cu Speedup blocked-fw-cuda.cu Speedup
1000 150000 0.724s 0.176s 4.215x 0.117s 6.188x
2000 400000 5.895s 0.642s 9.182x 0.213s 27.67x
5000 500000 84.91s 8.406s 10.10x 2.301s 36.90x
7500 1125000 279.0s 28.31s 9.855x 7.550s 36.95x
10000 2000000 665.9s 63.81s 10.43x 17.28s 38.53x
12500 3125000 1269s 125.5s 10.11x 34.03s 37.27x

Compile source:

  1. Install make/nvcc
  2. Update makefile with path to nvcc compiler
  3. Run command: make all

Run tests:

  1. Install Python 3.6
  2. Go to project directory
  3. Run command: python3 -m unittest discover -s test -v

Author: Mateusz Bojanowski

About

CUDA implementation of the Blocked Floyd Warshall All pairs shortest path graph algorithm

Resources

License

Stars

Watchers

Forks

Packages

No packages published