-
Notifications
You must be signed in to change notification settings - Fork 937
nbc: rework the way to add arrays to clean up #13529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
nbc: rework the way to add arrays to clean up #13529
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request reworks the memory management for auxiliary arrays in non-blocking and persistent collective MPI operations. The changes address a critical bug where temporary arrays created during Fortran-to-C conversions were not being properly cleaned up, except in limited cases detected by the C interface. The PR introduces new API functions to encapsulate array cleanup registration and hides the internal release_arrays structure, replacing direct array manipulation with function calls across all Fortran MPI bindings.
Key Changes
- Introduced
ompi_coll_base_append_array_to_release()andompi_coll_base_run_release_arrays_cb()functions to manage array cleanup - Replaced direct manipulation of
nb_request->data.release_arrayswith the new API functions across mpif-h and use-mpi-f08 bindings - Fixed uninitialized variable (
tmp_rdispls) and removed unusedidxvariables in several files
Reviewed changes
Copilot reviewed 39 out of 39 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| ompi/mca/coll/base/coll_base_util.h | Adds new inline function ompi_coll_base_append_array_to_release() and declares ompi_coll_base_run_release_arrays_cb() |
| ompi/mca/coll/base/coll_base_util.c | Implements ompi_coll_base_run_release_arrays_cb() to set callbacks for array cleanup, updates comment to reflect broader usage |
| ompi/mpi/fortran/use-mpi-f08/base/bigcount.h | Updates macro to use new ompi_coll_base_append_array_to_release() API instead of direct array access |
| ompi/mpi/fortran/use-mpi-f08/*.c.in | Multiple files updated to use new API functions for registering arrays, adds calls to ompi_coll_base_run_release_arrays_cb(), fixes variable initialization |
| ompi/mpi/fortran/mpif-h/*.c | Multiple files updated to replace direct nb_request->data.release_arrays manipulation with new API calls, removes unused idx variables |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
jsquyres
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I admit to trying the copilot review just for the heck of it. I'm surprised that the (mostly good) quality of the copilot review! I'm also pleased that it correctly labeled all the "nitpick" issues (i.e., it clearly calls out when it's just being nitpicky).
Turns out that in the course of working on PR open-mpi#13280 , it was discovered that auxiliary arrays associated with a non-blocking/persistent collective requests were not in fact being cleaned up upon either completion of the non-blocking request or freeing of the persistent request except for instances where the 'c' interface detected certain cases for the arguments (in particular user-defined data types). So all the additions to the fortran code to cleanup temporary arrays needed, for example, when default integer type does not map to 'c' int, was not actually doing anything in the general case. This PR also hides the way the auxiliary array info is associated with the request rather than having the current array of pointers in the nbc request exposed in many different places. This PR also fixes up some problems found with handling of some of the arrays within the f90/f08 code as well as suppressing some compiler warnings. Signed-off-by: Howard Pritchard <howardp@lanl.gov>
9095f1f to
1ec756c
Compare
|
This LGTM, but I'm not deep into the knowledge of how the NBC stuff works. @bosilca can you review as well? |
|
I don't understand for what reason it stopped working because this entire mechanical was added in for exactly this purpose, helping non-blocking/persistent to release their temporary buggers. I need some time to dig in. |
Turns out that in the course of working on PR #13280 , it was discovered that auxiliary arrays associated with a non-blocking/persistent collective requests were not in fact being cleaned up upon either completion of the non-blocking request or freeing of the persistent request except for instances where the 'c' interface detected certain cases for the arguments (in particular user-defined data types).
So all the additions to the fortran code to cleanup temporary arrays needed, for example, when default integer type does not map to 'c' int, was not actually doing anything in the general case.
This PR also hides the way the auxiliary array info is associated with the request rather than having
the current array of pointers in the nbc request exposed in many different places.
This PR also fixes up some problems found with handling of some of the arrays within the f90/f08 code as well as suppressing some compiler warnings.