-
Notifications
You must be signed in to change notification settings - Fork 487
UCT/MM: Fix the FIFO room calculation for tail > head #10823
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/azp run |
Azure Pipelines successfully started running 5 pipeline(s). |
/azp run perf |
Azure Pipelines successfully started running 1 pipeline(s). |
seems ignoring the ARM bit and casting the difference to integer and then doing the comparison is performing better. New macro is given below #define FIX2_UCT_MM_EP_IS_ABLE_TO_SEND(_head, _tail, _fifo_size) The attached test.c.txt compares the performance of the macro as well as the correctness please check. I will upload a new diff. [ |
Want me to give this a try on my setup? |
|
This seems to fix the issue for me 👍 It is still unclear to me how the tail can end up greater than the head without any wraparound occurring, but it does work. Thanks! cc @hjelmn |
@jakemoroni Thank you so much for the verification. I have explained in this reply #10761 (comment) about, how the sender's view of receiver_tail can overtake the sender's view of receiver_head (tail from receiver's view becomes bigger than head only in case of wraparound. head and tail are variables in the receiver and shared to every other rank(sender) via shared memory). Please have a look at the provided code comments. |
Can you squish this down the minimal number of commits? I can take a look and review once that is complete. I ask because this seems like it fixes another commit on this PR: "TEST/MM: Fix compile error" |
Signed-off-by: Arun Chandran <[email protected]>
754e311
to
c4b647f
Compare
I have squashed the commits, please take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Scanning through the code I see UCT_MM_IFACE_FIFO_HEAD_EVENT_ARMED is only ever used with the head so the original math was definitely wrong. We also tested the fix on one of our systems and it fixes the issue.
@yosefe Please merge. |
* increments once every 1 ns. On this timescale, the signed63 midpoint (2^62) | ||
* is ~4.61e18 ticks (~146 years). Over 5 years, head would advance by | ||
* ~1.5768e17 ticks (~3.4% of that midpoint), which is far from any wraparound | ||
* edge case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can specify in comment that this code cannot work in case of wrap around (as it originally was). Or maybe that only case where head < tail is when head is stale and in this case we intend to return 0?
To summarize my understanding for later referral. With two writers A and B, A is slow, B is very fast:
So |
check_case(c); | ||
} | ||
|
||
} No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: add newline at end of file
What?
Fix for #10761