Reduce unrolling in Panama dotProduct float variant #14071
base: main
Q: Were you able to confirm what happens on ARM?
Yeah, I do have some ARM results. I'll post them shortly.
You have Rocket Lake, right? With only 2 FMA units? So the 2x unroll may work fine for you because of that. I haven't looked at the assembly, but if the JVM is unrolling 4x, we shouldn't need to unroll at all to keep your CPU busy. Last time I checked, it didn't do this. I can run the script in the repo against various AWS instances so that we are sure.
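For readers following along, the unrolling trade-off discussed here can be sketched in plain scalar Java. This is only an illustration, not the Lucene Panama code: the class and method names (`DotProductUnrollSketch`, `dot2x`, `dot4x`) are hypothetical, and it uses scalar `Math.fma` instead of the Vector API so that it runs without incubator modules. The idea is that each accumulator forms its own dependency chain, so more accumulators can keep more FMA units busy, at the cost of extra registers and a longer tail.

```java
public final class DotProductUnrollSketch {
  private DotProductUnrollSketch() {}

  /** Dot product with 2 independent accumulators (2x manual unroll). */
  public static float dot2x(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("length mismatch");
    }
    float acc0 = 0f, acc1 = 0f;
    int i = 0;
    int bound = a.length & ~1; // largest multiple of 2 <= length
    for (; i < bound; i += 2) {
      // Two independent FMA chains per iteration.
      acc0 = Math.fma(a[i], b[i], acc0);
      acc1 = Math.fma(a[i + 1], b[i + 1], acc1);
    }
    float sum = acc0 + acc1;
    for (; i < a.length; i++) { // scalar tail
      sum = Math.fma(a[i], b[i], sum);
    }
    return sum;
  }

  /** Dot product with 4 independent accumulators (4x manual unroll). */
  public static float dot4x(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("length mismatch");
    }
    float acc0 = 0f, acc1 = 0f, acc2 = 0f, acc3 = 0f;
    int i = 0;
    int bound = a.length & ~3; // largest multiple of 4 <= length
    for (; i < bound; i += 4) {
      // Four independent FMA chains per iteration.
      acc0 = Math.fma(a[i], b[i], acc0);
      acc1 = Math.fma(a[i + 1], b[i + 1], acc1);
      acc2 = Math.fma(a[i + 2], b[i + 2], acc2);
      acc3 = Math.fma(a[i + 3], b[i + 3], acc3);
    }
    float sum = (acc0 + acc1) + (acc2 + acc3);
    for (; i < a.length; i++) { // scalar tail
      sum = Math.fma(a[i], b[i], sum);
    }
    return sum;
  }
}
```

Which variant is faster depends on the CPU and on what the JIT does with the loop, so it needs to be measured per instance type rather than assumed.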
Did a quick run: looks OK, but it hurts Haswell, Zen 2, and Zen 3. I'll do a pass on the instance types, since they are a bit outdated, and try to bring them up to speed (e.g. no Graviton4 represented).
Benchmark results (main vs. patch) posted for: cascadelake, graviton2, graviton3, haswell, icelake, sapphirerapids, zen2, zen3, zen4.
It also hurts on Graviton4. I'll open a PR adding that one to the instance type list.
Benchmark results (main vs. patch) posted for: graviton4.
See PR #14072 with benchie fixes, if you want to reproduce or play with it. We at least need to fix the hardcoded Lucene 10 version so that benchie works again :)
Reduce unrolling in Panama dotProduct float variants.