Skip to content

Conversation

@BradWhitlock
Copy link
Member

@BradWhitlock BradWhitlock commented Sep 6, 2025

This PR resolves #1058 by migrating direct MPI3 calls to suitable Umpire shared memory replacements. These changes are all in quest and are part of the signed_distance functionality.

  • Fixed race conditions in quest_signed_distance_interface test that happened with running with multiple MPI ranks
  • Removed a constraint in the CMake build logic that limited that test to 1 rank, now it should work on 4+ and multiple nodes
  • Verified that these shared memory changes work on multiple compute nodes. The file is read on 1 rank, shared memory is allocated in 1 rank per node and broadcast to those buffers. Finally, all ranks use the shared memory to create a mint mesh.
  • Updated release notes.
  • Submitted a PR to Umpire to fix a memory leak. (merged)
  • Changed Axom package so it builds Umpire with MPI3 shared memory support when +mpi is given (tpls and host-configs are not updated)

…ionality so it uses Umpire shared memory allocators.
…nce_interface when run on multiple ranks. Remove build logic to limit the test to 1 rank.
@BradWhitlock BradWhitlock marked this pull request as draft September 6, 2025 01:09
@BradWhitlock BradWhitlock marked this pull request as ready for review September 9, 2025 00:08
Copy link
Member

@kennyweiss kennyweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @BradWhitlock

I had some questions/suggestions about config variables that I'd like to work through before merging, and also possibly about promoting the shared resource allocator logic to core instead of quest, but otherwise this looks great!

Thanks for fixing the umpire memory leak and the issue w/ scaling to more than one rank!

depends_on("[email protected]", when="@0.6.0")
depends_on("umpire@5:5.0.1", when="@:0.5.0")
depends_on("umpire+openmp", when="+openmp")
depends_on("umpire+mpi3_shmem", when="+mpi")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is mpi3_shmem always available when mpi is available?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its something like this:

depends_on("umpire+mpi3_shmem", when="+mpi ^mpi@3:")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines -145 to -154
if (${test_name} STREQUAL "quest_signed_distance_interface")
set(_numMPITasks 1)
endif()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines +229 to 232
MPI_Comm intra_node_comm = MPI_COMM_NULL;
MPI_Comm inter_node_comm = MPI_COMM_NULL;
create_communicators(global_comm,
intra_node_comm,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need to create extra communicators w/ the new allocator? Or (hopefully) does the new Umpire shared memory allocator allow us to simplify this?

Copy link
Member

@kennyweiss kennyweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @BradWhitlock!

My previous reservations related to the build system flags. I think the new #defines are a lot cleaner.

It'd be nice to centralize the shared memory allocation logic to memory_management.*pp in a follow-up, but I don't think it's required for this PR.

Comment on lines -47 to -49
cmake_dependent_option(AXOM_ENABLE_MPI3 "Enables use of MPI-3 features" OFF "ENABLE_MPI" OFF)
mark_as_advanced(AXOM_ENABLE_MPI3)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines 61 to 78
# Check whether the Umpire defines symbols for shared memory
blt_check_code_compiles(CODE_COMPILES UMPIRE_SHARED_MEMORY
VERBOSE_OUTPUT OFF
DEPENDS_ON umpire
SOURCE_STRING "
#include <umpire/config.hpp>
#if defined(UMPIRE_ENABLE_IPC_SHARED_MEMORY) || defined(UMPIRE_ENABLE_MPI3_SHARED_MEMORY)
int main() { return 0; }
#else
#error Macros not defined
#endif
")
if (AXOM_ENABLE_MPI AND UMPIRE_SHARED_MEMORY)
set(AXOM_USE_UMPIRE_SHARED_MEMORY TRUE)
else()
set(AXOM_USE_UMPIRE_SHARED_MEMORY FALSE)
endif()
message(STATUS "Umpire supports shared memory: ${AXOM_USE_UMPIRE_SHARED_MEMORY}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 -- interesting solution!
I thought it might involve checking properties of the umpire target.

@kennyweiss kennyweiss added enhancement New feature or request TPL Issues related to Axom's third party libraries Build system Issues related to Axom's build system memory labels Sep 9, 2025
Copy link
Member

@kennyweiss kennyweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @BradWhitlock -- even better!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice test!

{
int allocator_id = INVALID_ALLOCATOR_ID;
#if defined(AXOM_USE_UMPIRE_SHARED_MEMORY)
const std::string allocatorName("axom_named_allocator");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to call this "axom_shared_allocator" ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or, looking a few lines down, should it be "axom_node_allocator"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Umpire examples, they first make a "node allocator" but then wrap it as a named allocator. From what I gather from their docs, this is to make the allocator compatible with their IPC shared memory logic. I renamed it to "axom_shared_allocator".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice functionality @BradWhitlock !

Copy link
Contributor

@gunney1 gunney1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @BradWhitlock

@kab163
Copy link

kab163 commented Oct 7, 2025

I just want to add a note of caution. Right now, it can be confusing if Umpire's IPC Shared Memory and/or Umpire's MPI3 Shared Memory are turned on since they can be both supported. If IPC is explicitly enabled but MPI3 is not, then MPI3 is turned off and IPC is the default shared memory resource. If MPI3 is explicitly enabled but IPC is not, then MPI3 is the default shared memory resource, but IPC is still turned on (will need to explicitly turn off if you don't want it). If IPC is explicitly enabled and MPI3 is explicitly enabled, then MPI3 is the default shared memory resource. (This all assumes MPI itself is enabled.)

The "default shared memory resource" is what is used when you only specify "SHARED".

To avoid confusion, I would highly recommend using auto traits = umpire::get_default_resource_traits("SHARED::MPI3"); and then auto mpi3_shm_allocator = rm.makeResource("SHARED::MPI3::axom_alloc", traits); (keeping mpi3 in the name) or instead auto traits = umpire::get_default_resource_traits("SHARED::POSIX"); and ...("SHARED::POSIX::axom_alloc", traits); keeping ipc or posix in the name... And/or, you could use a cmake message to tell you what the value of UMPIRE_DEFAULT_SHARED_MEMORY_RESOURCE ... couldn't hurt.

@kab163
Copy link

kab163 commented Oct 7, 2025

@kab163
Copy link

kab163 commented Oct 7, 2025

Specifically, I discovered today that our recipe_naming_shim example is broken. When I enable IPC and MPI3 shared memory, it keeps trying to use MPI3 even if I do auto traits = umpire::get_default_resource_traits("SHARED::POSIX");. The key is to also do auto node_allocator{rm.makeResource("SHARED::POSIX::node_allocator", traits)}; with emphasis on the SHARED::POSIX prefix.

So again, I encourage caution here and to be as explicit as possible.

@BradWhitlock BradWhitlock merged commit 3c96e95 into develop Oct 22, 2025
15 checks passed
@BradWhitlock BradWhitlock deleted the feature/whitlock/use_umpire_shared_memory branch October 22, 2025 00:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Build system Issues related to Axom's build system enhancement New feature or request memory Reviewed TPL Issues related to Axom's third party libraries

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Migrate shared memory capability to Umpire

7 participants