Use PrecompileTools to warmup CUDA.jl #2325

vchuravy · 2024-04-15T17:24:59Z

No description provided.

maleadt · 2024-04-15T17:53:02Z

So IIUC it isn't worth using the actual PTX ISA or device capability here because the inference caches are shared between CUDA subtargets, and this will prime them.

I considered whether we need a mechanism to ensure this doesn't actively use the CUDA toolkit, which would prevent use on a system without a GPU, but I think CI should already cover that:

CUDA.jl/.buildkite/pipeline.yml

Lines 198 to 226 in 5da4d1d

    - group: ":eyes: Special"
      depends_on: "cuda"
      steps:
        - label: "GPU-less environment"
          plugins:
            - JuliaCI/julia#v1:
                version: "1.10"
            - JuliaCI/julia-coverage#v1:
                dirs:
                  - src
                  - lib
                  - examples
            - JuliaCI/julia-test#v1:
                run_tests: false
          command: |
            julia --project -e '
              using CUDA
              @assert !CUDA.functional()
              @assert !isdefined(CUDA, :libcudart)
              CUDA.set_runtime_version!(v"11.6")'
            julia --project -e '
              using CUDA
              @assert !CUDA.functional()
              @assert isdefined(CUDA, :libcudart)'
          agents:
            queue: "juliagpu"
            intel: "*"
          if: build.message !~ /\[skip tests\]/ && build.message !~ /\[skip special\]/ && !build.pull_request.draft
          timeout_in_minutes: 5

We should check that this actually works (e.g., by using a precompile workload that does initialize CUDA and confirming that the job fails).
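For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a PrecompileTools workload that exercises a compilation path without touching a GPU. It is not the contents of src/precompile.jl in this PR: the module name `WarmupSketch`, the helper `infer_kernel`, and `dummy_kernel` are hypothetical stand-ins, and the helper only runs Julia type inference via `Base.code_typed` rather than the real GPU compiler pipeline. It assumes the PrecompileTools package is available.

```julia
module WarmupSketch

using PrecompileTools: @setup_workload, @compile_workload

# Hypothetical stand-in for a compiler entry point. It only runs Julia type
# inference on the given signature and never calls into a GPU driver or toolkit.
infer_kernel(f, argtypes) = only(Base.code_typed(f, argtypes))

# Trivial "kernel" used purely to exercise the compilation path.
dummy_kernel(x) = nothing

@setup_workload begin
    # Setup code runs at precompile time; only calls made inside
    # `@compile_workload` are forced into the package's precompile cache.
    argtypes = (Int,)
    @compile_workload begin
        # Code reached from this block is compiled and cached when the
        # package is precompiled.
        infer_kernel(dummy_kernel, argtypes)
    end
end

end # module
```

As far as I understand PrecompileTools, the workload body only runs while a package is being precompiled, so loading this module in a plain session defines the functions and skips the workload. A workload like this must avoid any call that initializes CUDA, which is what the "GPU-less environment" CI job above is meant to catch.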

vchuravy · 2024-04-17T20:57:17Z

> So IIUC it isn't worth using the actual PTX ISA or device capability here because the inference caches are shared between CUDA subtargets, and this will prime them.

Correct!

Using JuliaGPU/GPUCompiler.jl#557 (comment), this improved TTFK (time to first kernel) from 12s to 4s.

codecov · 2024-04-19T15:29:20Z

Codecov Report

Attention: Patch coverage is 12.50000% with 7 lines in your changes missing coverage. Please review.

Project coverage is 59.96%. Comparing base (14de009) to head (c7f880c).

❗ Current head c7f880c differs from pull request most recent head 03530f0

Please upload reports for the commit 03530f0 to get more accurate results.

Files	Patch %	Lines
src/precompile.jl	12.50%	7 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           master    #2325       +/-   ##
===========================================
- Coverage   73.37%   59.96%   -13.42%     
===========================================
  Files         157      156        -1     
  Lines       15197    14989      -208     
===========================================
- Hits        11151     8988     -2163     
- Misses       4046     6001     +1955


src/precompile.jl

maleadt · 2024-09-18T09:32:43Z

Fails on 1.11:

2024-09-18 10:44:13 CEST	ERROR: The following 1 direct dependency failed to precompile:
2024-09-18 10:44:13 CEST	
2024-09-18 10:44:13 CEST	CUDA --code-coverage=@/var/lib/buildkite-agent/builds/gpuci-7/julialang/cuda-dot-jl --color=yes --check-bounds=yes --warn-overwrite=yes --depwarn=yes --inline=yes --startup-file=no --track-allocation=none
2024-09-18 10:44:13 CEST	
2024-09-18 10:44:13 CEST	Failed to precompile CUDA [052768ef-5323-5732-b1bb-66c8b64840ba] to "/root/.cache/julia-buildkite-plugin/depots/3cc01fab-3357-4a7a-9294-cde2d3115a97/compiled/v1.11/CUDA/jl_aa67nH".
2024-09-18 10:44:13 CEST	LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.membar.sys

vchuravy requested a review from maleadt April 15, 2024 17:25

maleadt marked this pull request as draft April 16, 2024 10:46

maleadt added the enhancement New feature or request label Apr 16, 2024

vchuravy mentioned this pull request Apr 17, 2024

Add disk cache infrastructure for Julia 1.11 JuliaGPU/GPUCompiler.jl#557

Merged

3 tasks

vchuravy force-pushed the vc/precompile_tools branch from 80ec869 to c7f880c Compare April 19, 2024 14:49

vchuravy marked this pull request as ready for review April 19, 2024 14:49

vchuravy marked this pull request as draft April 19, 2024 14:50

vchuravy marked this pull request as ready for review April 19, 2024 15:58

vchuravy commented on src/precompile.jl Jun 24, 2024 (outdated, resolved)

vchuravy force-pushed the vc/precompile_tools branch from 51520a1 to 03530f0 Compare June 24, 2024 13:56

vchuravy and others added 5 commits September 18, 2024 10:34

Use PrecompileTools to warmup CUDA.jl

84f132f

fixup! Use PrecompileTools to warmup CUDA.jl

85fc9ea

fixup! Use PrecompileTools to warmup CUDA.jl

389a1f1

try precompile tools on all versions

134c9a7

Add a note for precompile_workload

bfe2eb9

maleadt force-pushed the vc/precompile_tools branch from 03530f0 to bfe2eb9 Compare September 18, 2024 08:34

maleadt added the needs changes Changes are needed. label Sep 18, 2024

maleadt marked this pull request as draft September 18, 2024 09:32

maleadt force-pushed the master branch 8 times, most recently from 2274085 to 7ec9719 Compare December 19, 2024 17:51

maleadt force-pushed the master branch 7 times, most recently from 5d585c4 to c850163 Compare December 20, 2024 08:18
