-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
skip ba_global_free when BA is already destroyed #939
skip ba_global_free when BA is already destroyed #939
Conversation
6aad171
to
db92ea0
Compare
Do we have reproducer for such scenario? Should we add it as a test? |
src/base_alloc/base_alloc_global.h
Outdated
@@ -17,6 +19,7 @@ extern "C" { | |||
void *umf_ba_global_alloc(size_t size); | |||
void umf_ba_global_free(void *ptr); | |||
void umf_ba_destroy_global(void); | |||
bool umf_ba_global_is_destroyed(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where it is used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right now nowhere but it would be needed when Disjoint Pool would use base alloc (in C version) in umfDisjointPoolParamsDestroy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am just wondering if it is the right way to fix it. See my PR #940, will the same approach for disjoint_pool (calling libumfInit
in the right place) resolve your issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this helps - thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So should we close this PR?
db92ea0
to
70136eb
Compare
70136eb
to
2aad2ba
Compare
this bug also occurs here: https://github.com/oneapi-src/unified-memory-framework/actions/runs/12041220655/job/33590212027 In the CUDA provider test (after params refactoring) we are calling umfCUDAMemoryProviderParamsDestroy in the destructor of global objects: cuParams_device_memory, cuParams_shared_memory and cuParams_host_memory. umfCUDAMemoryProviderParamsDestroy calls BA free which dereferences already unmapped pointer |
Oh, I see. I saw the similar issue with L0 provider params when working on it and fixed it. But the corresponding CUDA PR was already merged and I forgot to apply the same fix for CUDA params. Here is the PR #940 that should fix this failure. The issue is that |
Skip ba_global_free when BA is already destroyed.
This could happen when the globally allocated object with a destructor that calls any UMF free/destroy function is called after UMF tear down and base alloc destroy. Then, any dereferencing of previously allocated (and mmaped) by base alloc pointer would cause a crash (as the pointer is already unmapped).