
The reproduced performance is not exactly the same as in the paper #2

Open
wei-mei opened this issue Aug 30, 2023 · 1 comment

wei-mei commented Aug 30, 2023

Hello, I am reading your OSDI paper, MGG: Accelerating Graph Neural Networks with Fine-grained Intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms.
I am running the repository you provided, but I cannot reproduce the performance reported in the paper, e.g., the comparison against DGL on 8×A100 for GCN (Fig. 7a):

| dataset | speedup |
|---|---|
| Reddit_beg_pos | 0.598862 |
| enwiki-2013_beg_pos | 0.980894 |
| t-2004_beg_pos | 2.319232 |
| paper100M_beg_pos | 3.729139 |
| ogbn-products_beg_pos | 2.551465 |
| ogbn-proteins_beg_pos | 0.655375 |
| com-Orkut_beg_pos | 5.647636 |
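For a one-number summary of the table above, the geometric mean of the reported speedups can be computed as follows (a small sketch; dataset labels copied verbatim from the table):

```python
import math

# Reported MGG-vs-DGL speedups from the table above (8x A100 80GB run)
speedups = {
    "Reddit_beg_pos": 0.598862,
    "enwiki-2013_beg_pos": 0.980894,
    "t-2004_beg_pos": 2.319232,
    "paper100M_beg_pos": 3.729139,
    "ogbn-products_beg_pos": 2.551465,
    "ogbn-proteins_beg_pos": 0.655375,
    "com-Orkut_beg_pos": 5.647636,
}

# Geometric mean is the usual way to summarize a set of speedup ratios
geomean = math.exp(sum(math.log(s) for s in speedups.values()) / len(speedups))
print(f"geometric-mean speedup: {geomean:.2f}x")
```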

Tested on 8× SXM4 A100 (80 GB); point-to-point NVLink bandwidth = 600 GB/s.

How should I adjust the configurations in your repository to achieve the performance shown in the paper?

@YukeWang96 (Owner)

Thanks for your interest.

  • As mentioned in the paper's evaluation ("Platforms & Tools" paragraph), our main evaluation platform is 8×A100 GPUs (40 GB), using an AWS P4dn.24xlarge instance.
  • On 8×A100 (80 GB), the GPU global-memory bandwidth differs (2,039 GB/s vs. 1,555 GB/s on the 40 GB A100), so additional parameter tuning will likely be needed for the A100-80GB to reach its best performance. Other factors, such as the type and number of CPU cores on a DGX-A100-80GB versus a DGX-A100-40GB, can also affect DGL's performance, since DGL relies on zero-copy access with CPU involvement to fetch remote data resident on the host.
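As a rough sanity check of the bandwidth argument above, the HBM bandwidth ratio between the two A100 variants gives an approximate upper bound on the per-kernel gap for memory-bound GNN kernels (a sketch using only the figures quoted in this comment):

```python
# Peak HBM bandwidths quoted in the comment above
bw_a100_80gb = 2039.0  # GB/s (A100 80GB)
bw_a100_40gb = 1555.0  # GB/s (A100 40GB)

# For a purely memory-bound kernel, this ratio roughly bounds how much
# faster the 80GB part could run the same kernel without any retuning;
# end-to-end speedup differences beyond this point at tuning/CPU factors.
ratio = bw_a100_80gb / bw_a100_40gb
print(f"HBM bandwidth ratio: {ratio:.2f}x")
```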
