Cleanup atomics and fix deadlock in DP bucket_can_pool() #1127
Cleanup atomics and fix deadlock in DP bucket_can_pool() #1127bratpiorka wants to merge 0 commit intooneapi-src:mainfrom
Conversation
c570a42 to
7bd9fd3
Compare
|
@bratpiorka we have the following issue in the backlog: #886. |
6861168 to
b294cc6
Compare
5a3a327 to
1fb932d
Compare
this should be fixed in the newest version of this PR |
src/utils/utils_concurrency.h
Outdated
| #include <atomic> | ||
| #define _Atomic(X) std::atomic<X> | ||
|
|
||
| using std::memory_order_acq_rel; |
There was a problem hiding this comment.
this code was indirectly included in c++ tests..
moved to test/base.hpp
There was a problem hiding this comment.
@bratpiorka but C++ code is still there and in the test/base.hpp only Copyright is changed
There was a problem hiding this comment.
Yes, I tried moving it to base.hpp, but that broke the building of examples. I also attempted to remove the __cplusplus code entirely, but it's not that simple. So, in fact, I'm not adding new C++ code here; I'm extending the existing code. Further refactoring should be done in a separate PR.
There was a problem hiding this comment.
Yes, I tried moving it to base.hpp, but that broke the building of examples.
Why? Our examples should not use any code either from umf internals, nor tests
I also attempted to remove the __cplusplus code entirely, but it's not that simple. So, in fact, I'm not adding new C++ code here; I'm extending the existing code. Further refactoring should be done in a separate PR.
You are adding extra using, and stil we do not know why they are needed.
|
This PR is required by #1143 |
|
This PR is required by #998 as well. |
| for (nib = 0; nib <= NIB; nib++) { | ||
| if (n->child[nib]) { | ||
| struct critnib_node *m; | ||
| utils_atomic_load_acquire_ptr((void **)&n->child[nib], (void **)&m); | ||
| if (m) { | ||
| break; | ||
| } | ||
| } |
There was a problem hiding this comment.
What is a rationale for this change. Was old code wrong / not safe?
There was a problem hiding this comment.
There was a problem hiding this comment.
@ldorau sanitizer might report false positives. Are we sure it is a real issue?
There was a problem hiding this comment.
It is still needed now: https://github.com/oneapi-src/unified-memory-framework/actions/runs/13650462662/job/38158581307
There was a problem hiding this comment.
Can anyone explain rationale why this change is needed? I need explanation why from logical point of view, of this code. That it suppress/fixes an sanitizer bug do not explain anything. From my perspective this change is like doing random changes until sanitizer is fine.
src/utils/utils_concurrency.h
Outdated
| ASSERT_IS_ALIGNED((uintptr_t)desired, 8); | ||
| ASSERT_IS_ALIGNED((uintptr_t)expected, 8); |
There was a problem hiding this comment.
Are you sure that this ptrs has to by 8b aligned?
There was a problem hiding this comment.
https://learn.microsoft.com/en-us/windows/win32/api/winnt/nf-winnt-interlockedcompareexchange64
The variables for this function must be aligned on a 64-bit boundary; otherwise, this function will behave unpredictably on multiprocessor x86 systems and any non-x86 systems
There was a problem hiding this comment.
We are dereferencing this pointers, so it do not matter what alignment, it has here, as anyway it their value will be copied to the register/stack before call.
There was a problem hiding this comment.
Are you 100% sure that the compiler will copy these arguments and not use them directly? By the way, I really don't understand why we're discussing asserts here. In my opinion, there can never be too many asserts in the code.
There was a problem hiding this comment.
We are dereferencing this pointers, so it do not matter what alignment, it has here, as anyway it their value will be copied to the register/stack before call.
As Rafal mentioned it is a requirement of the corresponding Windows API.
Furthermore, we should consider underlying HW implementation. AFAIK x86 lock instruction does not require the memory location to be aligned but there is a performance penalty if the memory location is split across multiple cache lines: in that case, instead of locking the cache line for exclusive access ht memory bus should be locked. So it is better to have aligned memory.
There was a problem hiding this comment.
Guys, this function, takes second and third argument as a Value. So this do not matter PTR is aligned or not it do not matter at all, as function will get a copy values instead of the address.
LONG64 InterlockedCompareExchange64(
[in, out] LONG64 volatile *Destination,
[in] LONG64 ExChange,
[in] LONG64 Comperand
);```
There was a problem hiding this comment.
ahh, sorry, I did not saw that your comment was not about destination. Of coarse, only destination should be aligned.
436aac8 to
7cba692
Compare
7cba692 to
ca4a8f1
Compare
src/utils/utils_concurrency.h
Outdated
| ASSERT_IS_ALIGNED((uintptr_t)ptr, 8); | ||
| ASSERT_IS_ALIGNED((uintptr_t)out, 8); | ||
| __atomic_load(ptr, out, memory_order_acquire); | ||
| //utils_annotate_acquire(ptr); |
|
@lplewa please re-review |
|
continue in #1151 |
Cleanup atomics and replace while(true) loop in Disjoint Pool bucket_can_pool() with a pair of atomic add/sub.
fix for #1125 and #1115
This PR is required by #1143