Arctan avx512 #759
base: main
Signed-off-by: Magnus Lundmark <[email protected]>
f268bb4 to 319387d
I just noticed there's a NaN test as well. I need to update this PR to cover it for AVX512 too!
Thanks for your contribution. The NaN checks should really be integrated. Other than that, only minor comments.
Signed-off-by: Magnus Lundmark <[email protected]>
Here are some special values and a sanity check:
Do we care about the sign of atan2(-0, 0)? What about propagating NaN?
Added AVX512 kernels and some minor cleanup.
Using AVX512F yields a 40% speedup over the AVX2_FMA implementation on my 7950X3D. Compared to the generic atan2 implementation, this is a 65x speedup.