-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add device info query to report support for native asserts. #2269
base: main
Are you sure you want to change the base?
Add device info query to report support for native asserts. #2269
Conversation
14acfa8
to
b38264f
Compare
b38264f
to
d5f06df
Compare
LLVM PR intel/llvm#15929 |
source/adapters/hip/device.cpp
Outdated
@@ -933,6 +928,14 @@ UR_APIEXPORT ur_result_t UR_APICALL urDeviceGetInfo(ur_device_handle_t hDevice, | |||
} | |||
case UR_DEVICE_INFO_COMMAND_BUFFER_EVENT_SUPPORT_EXP: | |||
return ReturnValue(false); | |||
case UR_DEVICE_INFO_USE_NATIVE_ASSERT: | |||
// TODO: Remove comment when HIP supports native asserts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HIP supports native asserts of devices that are most relevant for the users of dpc++ AFAIK: the assert tests definitely pass on CDNA2 cards, and AMD devs implied they should be supported on all supported devices. There was a failure on a RDNA2 card but that has little relevance to dpc++ users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they still have UNSUPPORTED: hip
in the e2e test along with a tracker issue, I could change this to a note about it not necessarily being 100% supported and link the issue, or I'm happy to just delete the comment if you don't think it's useful to track that here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they still have
UNSUPPORTED: hip
in the e2e test along with a tracker issue, I could change this to a note about it not necessarily being 100% supported and link the issue, or I'm happy to just delete the comment if you don't think it's useful to track that here
Sure, whatever you think is best. The only issue with the current comment is that it is all out of date/ factually incorrect.
As far as hip tests marked unsupported in intel/llvm, for a lot of cases this is because the intel/llvm CI does not use an officially supported rocm card. This case is a bit different because the failure still does occur on some officially supported cards in addition to the unsupported ones but I think your wish to add a comment to state something along the lines of "this is mostly supported" is fine.
It is best to try not to take too much account of such failures, unfortunately we have to rely on other testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to omit the comment, looking at intel/llvm#7634 it does seem like the remaining fail there is a weird issue with how the hardware is set up, and based on Georgi's comment it sounds like confidence is pretty high that this should be upgraded to an unqualified "supported"
@aarongreig According to my latest testing a CDNA2 AMD GPU had reliable native asserts support with latest versions of ROCm. However, I am not sure how well things are with consumer grade ones, but I suspect all supported cards should in theory have support for the feature these days. Just a question/thought on how this ties to DPC++, but not a requirement: |
d5f06df
to
df32785
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for cuda/hip
That probably does need to be updated, although I don't really want to include that in intel/llvm#15929 since it's supposed to just be updating how the internal reporting mechanism works |
Yeah, I agree. Guess the comment was more of a note to refer to in a future follow-up to that specific part of intel/llvm. Regardless, all cuda/hip related changes look good! Cheers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for L0
This allows cuda and hip to stop reporting the relevant opencl extension string, see issue oneapi-src#1374
df32785
to
6a711b3
Compare
ping @oneapi-src/unified-runtime-native-cpu-write @oneapi-src/unified-runtime-opencl-write |
This allows cuda and hip to stop reporting the opencl extension string, see issue #1374