-
Notifications
You must be signed in to change notification settings - Fork 118
parallel snstop #420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
parallel snstop #420
Conversation
* parallel snstop * formatting fixes * added comments * added test - does it make sense? * fixed MPI check * actually fixing MPI check * iSort fix * maybe this time the test will be skipped? * maybe like this? * what about this * cleanup * add timeout option to testflo * rerun tests * updated test with send/receive --------- Co-authored-by: Marco Mangano <[email protected]>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #420 +/- ##
==========================================
+ Coverage 86.22% 86.41% +0.19%
==========================================
Files 24 24
Lines 3418 3438 +20
==========================================
+ Hits 2947 2971 +24
+ Misses 471 467 -4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ewu63 bumping this up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like it is finally working. Is 120s a reasonable timeout timeout for tests?
My main concern with this PR is that I think the behaviour of whether to call snSTOP on only the root proc or all procs should be a user configurable option, and I would prefer the default being the prior behaviour (i.e. only on the root proc). |
I see your point @ewu63 . Would that be as simple as adding an additional bool option (say The rest of the edits in the |
Yeah, something like that. The rest of the changes look OK. |
I thought about this PR some more and I have some more reservations
|
I see. About the second point though, that would only involve the testing script. The rest of the changes are just ineffective if the code is not run in parallel. |
Yes, I misread that part of the code I think that's all OK. I can try to address some of this soon but I would prefer holding off on this PR for a bit longer. Trying to be diligent to not break people's stuff downstream. |
Agreed, there is no rush. I can add that flag we mentioned soon, so that the default behavior is unchanged. Down to brainstorm additional tests. |
Purpose
The purpose of this PR is to parallelize the SNSTOP user callback function.
See #417 for commit history.
A
--timeout
option has been added to thetestflo
arguments to avoid tests to hang.Expected time until merged
1 month
Type of change
Testing
Tests were added to ensure other processors are calling the snstop function.
Checklist
flake8
andblack
to make sure the Python code adheres to PEP-8 and is consistently formattedfprettify
or C/C++ code withclang-format
as applicable