Skip to content

Conversation

@hppritcha
Copy link
Member

Turns out that in the course of working on PR #13280 , it was discovered that auxiliary arrays associated with a non-blocking/persistent collective requests were not in fact being cleaned up upon either completion of the non-blocking request or freeing of the persistent request except for instances where the 'c' interface detected certain cases for the arguments (in particular user-defined data types).

So all the additions to the fortran code to cleanup temporary arrays needed, for example, when default integer type does not map to 'c' int, was not actually doing anything in the general case.

This PR also hides the way the auxiliary array info is associated with the request rather than having
the current array of pointers in the nbc request exposed in many different places.

This PR also fixes up some problems found with handling of some of the arrays within the f90/f08 code as well as suppressing some compiler warnings.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request reworks the memory management for auxiliary arrays in non-blocking and persistent collective MPI operations. The changes address a critical bug where temporary arrays created during Fortran-to-C conversions were not being properly cleaned up, except in limited cases detected by the C interface. The PR introduces new API functions to encapsulate array cleanup registration and hides the internal release_arrays structure, replacing direct array manipulation with function calls across all Fortran MPI bindings.

Key Changes

  • Introduced ompi_coll_base_append_array_to_release() and ompi_coll_base_run_release_arrays_cb() functions to manage array cleanup
  • Replaced direct manipulation of nb_request->data.release_arrays with the new API functions across mpif-h and use-mpi-f08 bindings
  • Fixed uninitialized variable (tmp_rdispls) and removed unused idx variables in several files

Reviewed changes

Copilot reviewed 39 out of 39 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
ompi/mca/coll/base/coll_base_util.h Adds new inline function ompi_coll_base_append_array_to_release() and declares ompi_coll_base_run_release_arrays_cb()
ompi/mca/coll/base/coll_base_util.c Implements ompi_coll_base_run_release_arrays_cb() to set callbacks for array cleanup, updates comment to reflect broader usage
ompi/mpi/fortran/use-mpi-f08/base/bigcount.h Updates macro to use new ompi_coll_base_append_array_to_release() API instead of direct array access
ompi/mpi/fortran/use-mpi-f08/*.c.in Multiple files updated to use new API functions for registering arrays, adds calls to ompi_coll_base_run_release_arrays_cb(), fixes variable initialization
ompi/mpi/fortran/mpif-h/*.c Multiple files updated to replace direct nb_request->data.release_arrays manipulation with new API calls, removes unused idx variables

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Member

@jsquyres jsquyres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit to trying the copilot review just for the heck of it. I'm surprised that the (mostly good) quality of the copilot review! I'm also pleased that it correctly labeled all the "nitpick" issues (i.e., it clearly calls out when it's just being nitpicky).

@hppritcha hppritcha requested a review from jsquyres November 26, 2025 22:36
Turns out that in the course of working on PR open-mpi#13280 ,
it was discovered that auxiliary arrays associated with
a non-blocking/persistent collective requests were not
in fact being cleaned up upon either completion of the
non-blocking request or freeing of the persistent request
except for instances where the 'c' interface detected certain
cases for the arguments (in particular user-defined data types).

So all the additions to the fortran code to cleanup
temporary arrays needed, for example, when default
integer type does not map to 'c' int, was not actually
doing anything in the general case.

This PR also hides the way the auxiliary array info
is associated with the request rather than having
the current array of pointers in the nbc request exposed in many
different places.

This PR also fixes up some problems found with handling
of some of the arrays within the f90/f08 code as well as
suppressing some compiler warnings.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
@hppritcha hppritcha force-pushed the rework_non_blocking_array_args_cleanup branch from 9095f1f to 1ec756c Compare November 26, 2025 22:38
@jsquyres
Copy link
Member

This LGTM, but I'm not deep into the knowledge of how the NBC stuff works.

@bosilca can you review as well?

@bosilca
Copy link
Member

bosilca commented Nov 27, 2025

I don't understand for what reason it stopped working because this entire mechanical was added in for exactly this purpose, helping non-blocking/persistent to release their temporary buggers. I need some time to dig in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants