Feature/CPU Detection for Apple M1 #40876
Comments
How hard would this be to fix? Would I be able to do it? I imagine it means adding some hardcoded options here: Lines 1184 to 1308 in a08a3ff
It might be possible to do this programmatically using sysctlbyname (developer.apple.com/documentation/kernel/1387446-sysctlbyname), but that would require refactoring the code, since I think it currently only expects Linux.
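To make the sysctlbyname idea concrete, here is a minimal sketch in Julia rather than in the C++ detection code itself. It assumes the standard macOS signature `int sysctlbyname(const char*, void*, size_t*, void*, size_t)` and the `machdep.cpu.brand_string` key; `sysctl_string` is a hypothetical helper name, not anything that exists in Julia. It only illustrates the query, not how the detection code would need to be restructured.

```julia
# Hypothetical helper (not part of Julia): read a string-valued sysctl on macOS.
# Returns `nothing` on other platforms or if the query fails.
function sysctl_string(name::AbstractString)
    @static if Sys.isapple()
        len = Ref{Csize_t}(0)
        # First call with a NULL buffer asks the kernel for the required size.
        rc = ccall(:sysctlbyname, Cint,
                   (Cstring, Ptr{Cvoid}, Ref{Csize_t}, Ptr{Cvoid}, Csize_t),
                   name, C_NULL, len, C_NULL, 0)
        rc == 0 || return nothing
        buf = Vector{UInt8}(undef, len[])
        # Second call fills the buffer and updates `len` with the bytes written.
        rc = ccall(:sysctlbyname, Cint,
                   (Cstring, Ptr{Cvoid}, Ref{Csize_t}, Ptr{Cvoid}, Csize_t),
                   name, buf, len, C_NULL, 0)
        rc == 0 || return nothing
        # Drop trailing NUL terminators, if present.
        n = Int(len[])
        while n > 0 && buf[n] == 0x00
            n -= 1
        end
        return String(buf[1:n])
    else
        return nothing
    end
end

brand = sysctl_string("machdep.cpu.brand_string")
println(brand === nothing ? "sysctlbyname not available on this platform" : brand)
```

On Apple hardware this should print the CPU brand string. Mapping that string (or individual feature keys) onto LLVM CPU names would still be up to the detection code.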
@chriselrod I presume this was fixed by #41924?
julia> versioninfo()
Julia Version 1.9.0-DEV.332
Commit 559244b383* (2022-04-06 16:01 UTC)
Platform Info:
OS: macOS (arm64-apple-darwin21.4.0)
CPU: 8 × Apple M1
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
Threads: 1 on 4 virtual cores
julia> @code_native Threads.atomic_add!(Threads.Atomic{Int}(1), 2)
.section __TEXT,__text,regular,pure_instructions
.build_version macos, 12, 0
.globl "_julia_atomic_add!_11581" ; -- Begin function julia_atomic_add!_11581
.p2align 2
"_julia_atomic_add!_11581": ; @"julia_atomic_add!_11581"
; ┌ @ atomics.jl:405 within `atomic_add!`
.cfi_startproc
; %bb.0: ; %top
ldaddal x1, x0, [x0]
ret
.cfi_endproc
; └
; -- End function
.subsections_via_symbols
Originally posted here.
The Apple M1 supports ARMv8.4-A, but Julia/LLVM treats it like an A7/Cyclone CPU:
Cyclone is ARMv8-A, although the page on the A14 claims ARMv8.5-A for the Firestorm/Icestorm cores.
As such, atomics are implemented using a load-link/store-conditional loop:
However, if I start Julia with -C'armv8.4-a':
Starting Julia without -C flags:
With -C'armv8.4-a':
:I made non-x86 architectures (including the M1) ramp up thread use more slowly, because earlier performance tests suggested the M1 had higher threading overhead. Maybe that was partly because of atomics, and partly because of the lack of a shared L3 cache, and of course maybe for other reasons I don't know.
Of course, more than just atomics separates ARMv8.4-A/ARMv8.5-A from baseline ARMv8-A.
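For anyone wanting to reproduce the comparison, a minimal sketch along these lines should work. The function name `count_atomically` and the task and iteration counts are made up for illustration; the script can be run with and without -C'armv8.4-a' to compare the generated code.

```julia
using InteractiveUtils  # provides code_native outside the REPL

# Illustrative helper: increment a shared atomic counter from several tasks.
function count_atomically(ntasks::Int, iters::Int)
    counter = Threads.Atomic{Int}(0)
    @sync for _ in 1:ntasks
        Threads.@spawn for _ in 1:iters
            Threads.atomic_add!(counter, 1)
        end
    end
    return counter[]
end

# Every increment must be observed regardless of how the atomic is lowered.
total = count_atomically(4, 10_000)
@assert total == 4 * 10_000

# Print the native code for atomic_add! on the current target.
code_native(stdout, Threads.atomic_add!, (Threads.Atomic{Int}, Int))
```

On an AArch64 target with the ARMv8.1 atomics enabled, the listing would be expected to contain a single `ldaddal`, as in the output quoted in this thread. On a baseline ARMv8-A target it would instead show an exclusive load/store retry loop (typically `ldaxr`/`stlxr`).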