Hello, I am reading your OSDI paper, "MGG: Accelerating Graph Neural Networks with Fine-grained Intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms".
I am using the git project you provided, but I cannot reproduce the performance shown in the paper, for example the comparison with DGL on 8×A100 for GCN (Fig. 7a). These are the speed-ups I measured:
| dataset | speed up |
| --- | --- |
| Reddit_beg_pos | 0.598862 |
| enwiki-2013_beg_pos | 0.980894 |
| t-2004_beg_pos | 2.319232 |
| paper100M_beg_pos | 3.729139 |
| ogbn-products_beg_pos | 2.551465 |
| ogbn-proteins_beg_pos | 0.655375 |
| com-Orkut_beg_pos | 5.647636 |
I tested on 8× A100 SXM4 (80 GB) GPUs; the point-to-point NVLink bandwidth is 600 GB/s.
Which configurations in your repository should I adjust to achieve the performance shown in the paper?
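To summarize the numbers above, here is a minimal Python sketch that lists the datasets where the measured speed-up falls below 1 and computes the geometric mean across all seven. It assumes the "speed up" column means MGG relative to DGL (which the thread does not state explicitly); the names `speedups` and `geometric_mean` are only illustrative.

```python
import math

# Speed-up values copied from the table in this issue.
# Assumption: each value is the MGG speed-up relative to DGL.
speedups = {
    "Reddit_beg_pos": 0.598862,
    "enwiki-2013_beg_pos": 0.980894,
    "t-2004_beg_pos": 2.319232,
    "paper100M_beg_pos": 3.729139,
    "ogbn-products_beg_pos": 2.551465,
    "ogbn-proteins_beg_pos": 0.655375,
    "com-Orkut_beg_pos": 5.647636,
}


def geometric_mean(values):
    """Geometric mean, the usual way to average ratios such as speed-ups."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Datasets where the measured run did not reach parity (speed-up < 1).
below_parity = [name for name, s in speedups.items() if s < 1.0]
overall = geometric_mean(speedups.values())

print("below parity:", below_parity)
print(f"geometric mean speed-up: {overall:.3f}")
```

Three of the seven datasets come out below parity in this run, which is the gap being asked about.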
As mentioned in the evaluation section of our paper ("Platforms & Tools" paragraph), our main evaluation platform is 8×A100 GPUs (40 GB), and we used an AWS P4dn.24xlarge instance for evaluation.
On 8×A100 (80 GB), the GPU global memory bandwidth is higher (2,039 GB/s, versus 1,555 GB/s on the A100 40 GB), so we believe additional parameter tuning is needed on the A100-80GB to achieve better performance. Other factors, such as the type and number of CPU cores on a DGX-A100-80GB versus a DGX-A100-40GB, would also affect the performance of DGL, since DGL relies on zero-copy access, with CPU involvement, to fetch remote data on the host.
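For a rough sense of scale, the two bandwidth figures quoted above can be compared directly. This is only back-of-the-envelope arithmetic on those numbers, not a performance model; the variable names are illustrative.

```python
# Global memory bandwidth figures quoted in this reply, in GB/s.
BW_A100_80GB = 2039.0
BW_A100_40GB = 1555.0

# Ratio of the 80 GB part's memory bandwidth to the 40 GB part's.
ratio = BW_A100_80GB / BW_A100_40GB
print(f"A100-80GB has {ratio:.2f}x the memory bandwidth of A100-40GB "
      f"({(ratio - 1) * 100:.0f}% higher)")
```

Local memory accesses get faster by roughly this factor while the NVLink bandwidth stays the same, so the balance between local and remote work shifts, which is one reason the tuned parameters may not carry over.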