-
Notifications
You must be signed in to change notification settings - Fork 912
COLL/UCC: add persistent collective calls for UCC #13374
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
COLL/UCC: add persistent collective calls for UCC #13374
Conversation
Signed-off-by: hasegawa.kento <[email protected]>
Signed-off-by: hasegawa.kento <[email protected]>
Signed-off-by: hasegawa.kento <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have few nitpick comments for the entire commit:
- 'iniz' has little meaning. How about init_common instead ?
- 'a' as a siffix has no meaning while ' alias' does !
- in the original code when the call to the UCC collective _init was made it was multiline with clear separation between send and receive argument. Please maintain that structure, and add the persistent bool flag in front of the ucc_module.
@@ -592,6 +648,19 @@ OBJ_CLASS_INSTANCE(mca_coll_ucc_req_t, ompi_request_t, | |||
|
|||
int mca_coll_ucc_req_free(struct ompi_request_t **ompi_req) | |||
{ | |||
if (MPI_REQUEST_NULL != ompi_req[0]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm mostly certain that this function cannot be called with MPI_REQUEST_NULL, as MPI_REQUEST_NULL is a static object and cannot be added on any free_list. Did you see it happening ? If not please remove this protection.
UCC_ERROR("ucc_collective_post failed: %s", ucc_status_string(rc_ucc)); | ||
coll_req->super.req_complete = REQUEST_COMPLETED; | ||
coll_req->super.req_state = OMPI_REQUEST_INACTIVE; | ||
if (OMPI_SUCCESS == rc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please ensure that coll_req->super.req_status.MPI_ERROR
reflect the rc
error.
What
In addition to the existing blocking and non-blocking collective calls,
this PR adds persistent collective call in COLL/UCC.
For example,
How
This adds <coll>_init function for each collective operation,
using UCC_COLL_ARGS_FLAG_PERSISTENT flag [1].
Reference
[1] "Unified Collective Communications (UCC) Specification Version 1.0" (2022/02/18)
https://openucx.github.io/ucc/api/v1.0/pdf/ucc.pdf
- UCC_COLL_ARGS_FLAG_PERSISTENT in Section 8.8.4.2 "ucc_coll_args_flags_t"