
Update prepare_build_environment_windows.sh #1830

Closed
wants to merge 1 commit into from

Conversation


BBC-Esq commented Dec 9, 2024

The script contains a mistaken path pointing to CUDA 12.2, which causes compatibility issues:

1) torch only has the following prebuilt wheels for CUDA support:

| PyTorch wheel | PyTorch versions supported |
| --- | --- |
| cu124 | 2.5.1, 2.5.0, 2.4.1, 2.4.0 |
| cu121 | 2.5.1, 2.5.0, 2.4.1, 2.4.0, 2.3.1, 2.3.0, 2.2.2...2.1.0 |
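The wheel availability above can be encoded as a small lookup for sanity checking (a sketch; `CUDA_WHEELS` and `wheel_tags_for` are hypothetical names, and the elided 2.2.2...2.1.0 range is represented only by its endpoints):

```python
# Hypothetical encoding of the prebuilt-wheel table above.
# The cu121 row lists "2.2.2...2.1.0" in the original; only the
# endpoints of that elided range are included here.
CUDA_WHEELS = {
    "cu124": {"2.5.1", "2.5.0", "2.4.1", "2.4.0"},
    "cu121": {"2.5.1", "2.5.0", "2.4.1", "2.4.0", "2.3.1", "2.3.0",
              "2.2.2", "2.1.0"},
}

def wheel_tags_for(torch_version: str) -> list[str]:
    """CUDA wheel tags that ship a prebuilt wheel for this torch version."""
    return sorted(tag for tag, versions in CUDA_WHEELS.items()
                  if torch_version in versions)
```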

2) Based on the torch version, the following will be bundled for Linux:

  • triton, mkl, and sympy are dependencies on all platforms:

| Torch | cuda-nvrtc-cu12 | cuda-runtime-cu12 | cublas-cu12 | cudnn-cu12 | triton | mkl | sympy |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2.5.1 | 12.4.127 | 12.4.127 | 12.4.5.8 | 9.1.0.70 | 3.1.0 | - | 1.13.1 |
| 2.5.0 | 12.4.127 | 12.4.127 | 12.4.5.8 | 9.1.0.70 | 3.1.0 | - | 1.13.1 |
| 2.4.1 | 12.1.105 | 12.1.105 | 12.1.3.1 | 9.1.0.70 | 3.0.0 | - | - |
| 2.4.0 | 12.1.105 | 12.1.105 | 12.1.3.1 | 9.1.0.70 | 3.0.0 | - | - |
| 2.3.1 | 12.1.105 | 12.1.105 | 12.1.3.1 | 8.9.2.26 | 2.3.1 | <=2021.4.0 | - |
| 2.3.0 | 12.1.105 | 12.1.105 | 12.1.3.1 | 8.9.2.26 | 2.3.0 | <=2021.4.0 | - |
| 2.2.2 | 12.1.105 | 12.1.105 | 12.1.3.1 | 8.9.2.26 | 2.2.0 | - | - |

  • 12.1.105 and 12.1.3.1 both come from CUDA release 12.1.1
  • 12.4.127 and 12.4.5.8 both come from CUDA release 12.4.1
  • In other words, torch is only fully compatible with the exact CUDA point releases it bundles; it is not fully compatible with CUDA 12.1.0 or 12.4.0, for example, or any other release.
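The point-release constraint above can be made explicit as a lookup (a sketch; `BUNDLED_CUDA_RELEASE` and `matches_bundled_release` are hypothetical names built from the table's version numbers):

```python
# Hypothetical mapping (from the bundling table above) of each torch
# version to the exact CUDA point release its bundled cu12 libraries
# come from: 12.4.127/12.4.5.8 -> 12.4.1, 12.1.105/12.1.3.1 -> 12.1.1.
BUNDLED_CUDA_RELEASE = {
    "2.5.1": "12.4.1",
    "2.5.0": "12.4.1",
    "2.4.1": "12.1.1",
    "2.4.0": "12.1.1",
    "2.3.1": "12.1.1",
    "2.3.0": "12.1.1",
    "2.2.2": "12.1.1",
}

def matches_bundled_release(torch_version: str, cuda_release: str) -> bool:
    """True only for the exact CUDA point release this torch version bundles."""
    return BUNDLED_CUDA_RELEASE.get(torch_version) == cuda_release
```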

CTranslate2 currently uses CUDA 12.2, for which torch doesn't build wheels at all!

3) CUDA and cuDNN compatibility is flexible and primarily affects whether static linking is available:

| cuDNN version | Static linking | No static linking |
| --- | --- | --- |
| 8.9.2 | 12.1, 11.8 | 12.0, ≤11.7 |
| 8.9.3 | 12.1, 11.8 | 12.0, ≤11.7 |
| 8.9.4 | 12.2, 11.8 | 12.1, 12.0, ≤11.7 |
| 8.9.5 | 12.2, 11.8 | 12.1, 12.0, ≤11.7 |
| 8.9.6 | 12.2, 11.8 | 12.1, 12.0, ≤11.7 |
| 8.9.7 | 12.2, 11.8 | 12.1, 12.0, ≤11.7 |
| 9.0.0 | 12.3, 11.8 | 12.2, 12.1, 12.0, ≤11.7 |
| 9.1.0 | 12.4-12.0, 11.8 | ≤11.7 |
| 9.1.1 | 12.5-12.0, 11.8 | ≤11.7 |
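The static-linking column above can be expressed as a simple membership test (a sketch; the `STATIC_LINKING` table and helper are illustrative names, with CUDA versions as `(major, minor)` tuples):

```python
# Sketch of the cuDNN static-linking table above. CUDA versions are
# (major, minor) tuples; the ranges 12.4-12.0 and 12.5-12.0 are expanded.
STATIC_LINKING = {
    "8.9.2": {(12, 1), (11, 8)},
    "8.9.3": {(12, 1), (11, 8)},
    "8.9.4": {(12, 2), (11, 8)},
    "8.9.5": {(12, 2), (11, 8)},
    "8.9.6": {(12, 2), (11, 8)},
    "8.9.7": {(12, 2), (11, 8)},
    "9.0.0": {(12, 3), (11, 8)},
    "9.1.0": {(12, 0), (12, 1), (12, 2), (12, 3), (12, 4), (11, 8)},
    "9.1.1": {(12, 0), (12, 1), (12, 2), (12, 3), (12, 4), (12, 5), (11, 8)},
}

def static_linking_available(cudnn: str, cuda: tuple[int, int]) -> bool:
    """Whether this cuDNN release supports static linking against this CUDA."""
    return cuda in STATIC_LINKING.get(cudnn, set())
```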

4) HOWEVER, despite this compatibility, TORCH presents its own complications:

| PyTorch version | Python | Stable | Experimental |
| --- | --- | --- | --- |
| 2.5 | >=3.9, <=3.12 (3.13 exp.) | CUDA 11.8, CUDA 12.1, CUDA 12.4, CUDNN 9.1.0.70 | None |
| 2.4 | >=3.8, <=3.12 | CUDA 11.8, CUDA 12.1, CUDNN 9.1.0.70 | CUDA 12.4, CUDNN 9.1.0.70 |
| 2.3 | >=3.8, <=3.11 (3.12 exp.) | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 |
| 2.2 | >=3.8, <=3.11 (3.12 exp.) | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 |

This is outlined here: https://github.com/pytorch/pytorch/blob/main/RELEASE.md#release-compatibility-matrix
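The stable Python ranges from that matrix can be checked mechanically (a sketch; `PYTHON_SUPPORT` and `python_supported` are hypothetical names, and experimental support such as Python 3.13 on torch 2.5 is deliberately excluded):

```python
# The stable Python ranges from the release matrix above, as
# inclusive (low, high) bounds per torch release series.
PYTHON_SUPPORT = {
    "2.5": ((3, 9), (3, 12)),
    "2.4": ((3, 8), (3, 12)),
    "2.3": ((3, 8), (3, 11)),
    "2.2": ((3, 8), (3, 11)),
}

def python_supported(torch_series: str, py: tuple[int, int]) -> bool:
    """Whether a Python version falls in the stable range for a torch series."""
    low, high = PYTHON_SUPPORT[torch_series]
    return low <= py <= high
```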

5) The problem...

CTranslate2 currently installs CUDA libraries originating from release 12.2, which results in the following being installed:
nvidia-cuda-runtime-cu12==12.2.140
nvidia-cublas-cu12==12.2.5.6
nvidia-cuda-nvcc-cu12==12.2.140
nvidia-cuda-nvrtc-cu12==12.2.140

None of these match the libraries that torch bundles with its wheels. Torch is, apparently, very specific about which CUDA libraries it supports: the bundled libraries differ even between CUDA releases 12.4.0 (which torch doesn't support) and 12.4.1 (which torch bundles with all of its current wheels).

6) Solution...

Make CTranslate2 use the libraries from CUDA release 12.4.1 (not even 12.4.0) to maximize compatibility.
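Concretely, the fix amounts to pinning the nvidia-* pip packages to the versions that ship in CUDA release 12.4.1 (a sketch; the exact pins are taken from the tables earlier and should be verified against the CUDA 12.4.1 release notes, and `pip_requirements` is a hypothetical helper):

```python
# Sketch of the proposed pins: nvidia-* packages at the versions that
# ship in CUDA release 12.4.1, matching what torch 2.5.x bundles.
CUDA_12_4_1_PINS = {
    "nvidia-cuda-runtime-cu12": "12.4.127",
    "nvidia-cuda-nvrtc-cu12": "12.4.127",
    "nvidia-cublas-cu12": "12.4.5.8",
}

def pip_requirements() -> list[str]:
    """Render the pins as pip requirement specifiers."""
    return [f"{name}=={ver}" for name, ver in sorted(CUDA_12_4_1_PINS.items())]
```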

7) Additional reasons...

Other libraries can be highly dependent on compatibility with torch and/or CUDA versions. For example:


Xformers Compatibility

| Xformers version | Torch version |
| --- | --- |
| v0.0.28.post3 | 2.5.1 |
| v0.0.28.post2 | 2.5.0 |
| v0.0.28.post1 | 2.4.1 |
| v0.0.27.post2 | 2.4.0 |
| v0.0.27.post1 | 2.4.0 |
| v0.0.27 | 2.3.0 |
| v0.0.26.post1 | 2.3.0 |
| v0.0.25.post1 | 2.2.2 |

  • Windows prebuilt wheels are on PyPI through torch 2.4.0 (i.e., pip-installable)
  • After torch 2.4.0, Windows prebuilt wheels are only available from PyTorch

LINUX Flash Attention 2 Compatibility

| FA2 version | Torch versions | Supported CUDA versions |
| --- | --- | --- |
| v2.7.1.post4 | 2.2.2, 2.3.1, 2.4.0, 2.5.1, 2.6.0.dev20241001 | 11.8.0, 12.3.2 |
| v2.7.1.post3 | 2.2.2, 2.3.1, 2.4.0, 2.5.1, 2.6.0.dev20241001 | 11.8.0, 12.3.2 |
| v2.7.1.post2 | 2.2.2, 2.3.1, 2.4.0, 2.5.1, 2.6.0.dev20241001 | 11.8.0, 12.3.2 |
| v2.7.1.post1 | 2.2.2, 2.3.1, 2.4.0, 2.5.1, 2.6.0.dev20241010 | 11.8.0, 12.4.1 |
| v2.7.1 | 2.2.2, 2.3.1, 2.4.0, 2.5.1, 2.6.0.dev20241010 | 11.8.0, 12.4.1 |
| v2.7.0.post2 | 2.2.2, 2.3.1, 2.4.0, 2.5.1 | 11.8.0, 12.4.1 |
| v2.7.0.post1 | 2.2.2, 2.3.1, 2.4.0, 2.5.1 | 11.8.0, 12.4.1 |
| v2.7.0 | 2.2.2, 2.3.1, 2.4.0, 2.5.1 | 11.8.0, 12.3.2 |
| v2.6.3* | 2.2.2, 2.3.1, 2.4.0 | 11.8.0, 12.3.2 |
| v2.6.2 | 2.2.2, 2.3.1, 2.4.0.dev20240527 | 11.8.0, 12.3.2 |
| v2.6.1 | 2.2.2, 2.3.1, 2.4.0.dev20240514 | 11.8.0, 12.3.2 |
| v2.6.0.post1 | 2.2.2, 2.3.1, 2.4.0.dev20240514 | 11.8.0, 12.2.2 |
| v2.6.0 | 2.2.2, 2.3.1, 2.4.0.dev20240512 | 11.8.0, 12.2.2 |
| v2.5.9.post1 | 2.2.2, 2.3.0, 2.4.0.dev20240407 | 11.8.0, 12.2.2 |

  • v2.5.8 is the first release to support torch 2.2.2
  • No prebuilt wheels simultaneously support torch 2.2.2 and CUDA releases prior to 12.2.2

Regarding triton, which torch now requires: triton==3.0.0 only supports Python up to 3.11, while triton==3.1.0 also supports Python 3.12.

Conclusion

Using the CUDA libraries from release 12.4.1 maximizes compatibility with torch and with other popular libraries such as flash attention 2, xformers, and triton (now a requirement of torch), among many others.


BBC-Esq commented Dec 10, 2024

Can you rerun this please? Apparently it errored because of a connection error while downloading a particular Helsinki model from Hugging Face.

BBC-Esq closed this by deleting the head repository Dec 24, 2024