Rewriting annex - interoperability with other programming models #304
Merged
43 commits
e1d5095
Rewrite the interoperability annex
minsii 34c7e86
Update dynamic process creation subsection
minsii afdc69a
Typo fix and minor word adjustment
minsii 43c5379
Add more details in RMA semantics subsection
minsii c055ca4
Made a pass by English editor
minsii f8ebcfc
Fix function format
minsii 63eef1d
Change query API to shmem_ and move text into separate file
minsii 325957a
Add example code for pe mapping
minsii 348d60b
Minor text adjustment
minsii c089a75
Simplified version of dynamic process and rma sections
minsii d754e7a
Do not mention interference in first paragraph
minsii b179132
interop/mpmd: strong advice to not use dynamic process with shmem
minsii 56db04b
interop/rma: simply ask user to avoid using both RMA models
minsii ab532fb
interop/progress: mention query api to connect paragraphs
minsii c57d957
interop/threads: add restriction for mixed thread levels
minsii 1c6b197
interop/id: use sync_all instead of barrier_all in example
minsii 7e1508d
interop/progress: minor text adjustment
minsii e062450
interop: move interoperability to a separate file
minsii 6c2e7b6
interop/dynamic: delete MPMD in section title
minsii 9cb51e1
interop/threads: adjust text based on f2f meeting feedback
minsii 683f423
interop/id: fix example
minsii a21c855
interop/rma: adjust text based on f2f meeting feedback
minsii 09a6b84
interop/query: delete note to implementors
minsii 0b801f5
interop/progress: adjust note to implementor
minsii f8eebf6
interop/query: shorten overview example
minsii b937c0a
interop/query: add example with MPI progress support
minsii c4da312
interop: made a pass by English editor
gpieper 192710f
interop/query: header fix and use larger data size in example
minsii ba758db
interop/dynamic: use MPI to communicate rather than disallow
minsii 4e28a44
interop/threads: adjust text
minsii d69cf7f
interop/query: fix compiling issue
minsii 310afa7
interop/id: replace PE "identifier" with "number" for consistency
minsii 7723dfa
interop: adjust to use teams API
minsii fc7a90d
interop/rma: only disable concurrent access to the same location
minsii 2569471
interop: minor text adjustment
minsii 3367f44
interop/rma: clarify one-sided op and undefined behavior
minsii 5608cd0
interop: avoid using "user", use program instead.
minsii 7849b7f
interop: delete query API.
minsii e3a5aea
interop: use \ac{PE}
minsii ec2ca5f
interop: use \ac{MPI}
minsii 9d1a322
interop: add reference to section 4.1 (progress definition)
minsii 9f11fd4
interop: new mapping id example based on comm_split
minsii 816b271
interop: adjust text
minsii
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,174 @@ | ||
| \chapter{Interoperability with Other Programming Models}\label{sec:interoperability} | ||
|
|
||
| \openshmem routines may be used in conjunction with the routines of other | |
| communication libraries or parallel languages in the same program. This section | |
| describes interoperability with other programming models, clarifies the | |
| undefined behaviors caused by mixing different models, and offers advice to | |
| \openshmem library users and developers that may improve the portability | |
| and performance of hybrid programs. | |
|
|
||
|
|
||
| \section{\ac{MPI} Interoperability} | ||
|
|
||
| \openshmem and \ac{MPI} are two commonly used parallel programming models for | |
| distributed-memory systems. A program can use both models | |
| to support various communication patterns efficiently and easily. | |
|
|
||
| A vendor may implement the \openshmem and \ac{MPI} libraries in different ways. For | ||
| instance, one may implement both \openshmem and \ac{MPI} as standalone libraries, | ||
| each of which allocates and initializes fully isolated communication | ||
| resources. | ||
| Another approach | ||
| is to implement both \openshmem and \ac{MPI} interfaces within the | ||
| same software system in order to share a communication resource when possible. | ||
|
|
||
| To improve interoperability and portability in \openshmem + \ac{MPI} hybrid | ||
| programming, we clarify the relevant semantics in the following subsections. | ||
|
|
||
|
|
||
| \subsection{Initialization} | ||
| To ensure that a hybrid program runs portably across different vendor | |
| implementations, the \openshmem environment of the program must be initialized by | |
| a call to \FUNC{shmem\_init} or \FUNC{shmem\_init\_thread} and finalized by | |
| a call to \FUNC{shmem\_finalize}; the \ac{MPI} environment of the program must be initialized | |
| by a call to \FUNC{MPI\_Init} or \FUNC{MPI\_Init\_thread} and finalized by a | |
| call to \FUNC{MPI\_Finalize}. | |
|
|
||
| \apiimpnotes{ | ||
| Portable implementations of \openshmem and \ac{MPI} must ensure that the initialization | |
| calls can be made in an arbitrary order within a program; the same rule also | |
| applies to the finalization calls. A software runtime that utilizes a shared | |
| communication resource for \openshmem and \ac{MPI} communication may maintain an | |
| internal reference counter in order to ensure that the shared resource is | |
| initialized only once and is not released until the last | |
| finalization call is made. | |
| } | ||
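The reference-counting idea in the implementation note above can be sketched in plain C as follows. This is only an illustration under stated assumptions: the names `shared_resource_acquire`, `shared_resource_release`, `shared_refcount`, and `shared_resource_up` are hypothetical and do not come from any \openshmem or MPI implementation, and the sketch is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state of a communication resource shared by both libraries. */
static int  shared_refcount    = 0;     /* number of outstanding init calls */
static bool shared_resource_up = false; /* true while the resource is alive */

/* Assumed to be called from both the OpenSHMEM and the MPI
 * initialization paths, in either order. */
void shared_resource_acquire(void)
{
    if (shared_refcount++ == 0) {
        /* First initialization call: set the resource up exactly once. */
        shared_resource_up = true;
    }
}

/* Assumed to be called from both finalization paths, in either order. */
void shared_resource_release(void)
{
    assert(shared_refcount > 0); /* a release must match an earlier acquire */
    if (--shared_refcount == 0) {
        /* Last finalization call: only now is the resource torn down. */
        shared_resource_up = false;
    }
}
```

With this scheme, the resource stays alive after the first of the two finalization calls and is released only by the second one, regardless of which library is finalized first.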
|
|
||
|
|
||
| \subsection{Dynamic Process Creation} | ||
| \label{subsec:interoperability:mpmd} | ||
|
|
||
| \ac{MPI} defines a dynamic process model that allows creation of processes after | ||
| an \ac{MPI} application has started (e.g., by calling \FUNC{MPI\_Comm\_spawn}) and | ||
| connection to independent processes (e.g., through \FUNC{MPI\_Comm\_accept} | ||
| and \FUNC{MPI\_Comm\_connect}). | ||
| It provides a mechanism to establish communication | ||
| between the newly created processes and the existing \ac{MPI} application (see | ||
| \ac{MPI} standard version 3.1, Chapter 10). | ||
| Unlike \ac{MPI}, \openshmem starts all processes at once and requires all \acp{PE} to | ||
| collectively allocate and initialize resources (e.g., symmetric heap) used by | ||
| the \openshmem library before any other \openshmem routine may | ||
| be called. \openshmem does not support communication with dynamically created | ||
| or connected processes. In such a scenario, \ac{MPI} can be used to communicate | ||
| with these processes. | ||
|
|
||
|
|
||
| \subsection{Thread Safety} | ||
| \label{subsec:interoperability:thread} | ||
| Both \openshmem and \ac{MPI} define the interaction with user threads in a program | ||
| with routines that can be used for initializing and querying the thread | ||
| environment. A hybrid program may request different thread levels | |
| at the initialization calls of the \openshmem and \ac{MPI} environments; however, the | |
| support level returned by the \openshmem or \ac{MPI} library might differ | |
| from that returned in a non-hybrid program. For instance, the first | |
| initialization call in a hybrid program may initialize a resource with the | |
| requested thread level, but the supported level cannot be updated by a subsequent | |
| initialization call if the underlying software runtimes of \openshmem and \ac{MPI} | |
| share the same internal communication resource. | |
| The program should always check the \VAR{provided} thread level returned | ||
| at the corresponding initialization call or query the level of thread support | ||
| after initialization to portably ensure thread support in each communication | ||
| environment. | ||
|
|
||
| Both \openshmem and \ac{MPI} define similar thread levels, namely, \VAR{THREAD\_SINGLE}, | ||
| \VAR{THREAD\_FUNNELED}, \VAR{THREAD\_SERIALIZED}, and \VAR{THREAD\_MULTIPLE}. | ||
| When requesting threading support in a hybrid program, however, | ||
| the following additional rules are applied if the implementations of \openshmem | ||
| and \ac{MPI} share the same internal communication resource. | ||
| It is strongly recommended to always follow these rules to ensure program | ||
| portability. | ||
|
|
||
| \begin{itemize} | ||
| \item The \VAR{THREAD\_SINGLE} thread level requires a single-threaded program. | |
| Hence, a hybrid program should not request \VAR{THREAD\_SINGLE} at the initialization | |
| call of one model while requesting a different thread level at the | |
| initialization call of the other model. | |
|
|
||
| \item The \VAR{THREAD\_FUNNELED} thread level allows only the main thread to | ||
| make communication calls. A hybrid program using the \VAR{THREAD\_FUNNELED} | ||
| thread level in both \openshmem and \ac{MPI} should ensure that the same main thread | ||
| is used in both communication environments. | ||
|
|
||
| \item The \VAR{THREAD\_SERIALIZED} thread level requires the program to ensure | ||
| that communication calls are not made concurrently by multiple threads. If a | ||
| hybrid program uses \VAR{THREAD\_SERIALIZED} in one communication environment | ||
| and \VAR{THREAD\_SERIALIZED} or \VAR{THREAD\_FUNNELED} in the other one, it | ||
| should also guarantee that the \openshmem and \ac{MPI} calls are not made concurrently | ||
| from two distinct threads. | ||
| \end{itemize} | ||
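The first and third rules above can be expressed as simple predicates over the pair of requested thread levels. The sketch below assumes a model-neutral enum, `thread_level_t`, and two helper functions that are hypothetical; a real program would first map the `SHMEM_THREAD_*` and `MPI_THREAD_*` constants onto it. The second rule (same main thread in both environments) depends on which thread makes the calls and is not captured here.

```c
#include <stdbool.h>

/* Hypothetical, model-neutral stand-ins for the four thread levels. */
typedef enum {
    LEVEL_SINGLE,
    LEVEL_FUNNELED,
    LEVEL_SERIALIZED,
    LEVEL_MULTIPLE
} thread_level_t;

/* Rule 1: THREAD_SINGLE in one model must not be paired with a
 * different level in the other model. */
bool hybrid_levels_portable(thread_level_t shmem_level,
                            thread_level_t mpi_level)
{
    bool shmem_single = (shmem_level == LEVEL_SINGLE);
    bool mpi_single   = (mpi_level == LEVEL_SINGLE);
    return shmem_single == mpi_single;
}

/* Rule 3: THREAD_SERIALIZED in one model combined with THREAD_SERIALIZED
 * or THREAD_FUNNELED in the other means the program must also keep
 * OpenSHMEM and MPI calls from running concurrently on distinct threads. */
bool hybrid_needs_cross_library_serialization(thread_level_t a,
                                              thread_level_t b)
{
    bool a_restricted = (a == LEVEL_SERIALIZED || a == LEVEL_FUNNELED);
    bool b_restricted = (b == LEVEL_SERIALIZED || b == LEVEL_FUNNELED);
    return (a == LEVEL_SERIALIZED && b_restricted) ||
           (b == LEVEL_SERIALIZED && a_restricted);
}
```

A program could evaluate such predicates on the levels it intends to request and fall back to a common level if the first one fails; it should still check the \VAR{provided} level returned by each initialization call.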
|
|
||
| \subsection{Mapping Process Identification Numbers} | ||
| \label{subsec:interoperability:id} | ||
|
|
||
| Similar to the \ac{PE} number in \openshmem, \ac{MPI} defines rank as the | |
| identification number of a process in a communicator. Both the \openshmem \ac{PE} number | |
| and the \ac{MPI} rank are unique integers ranging from zero to one less than the total | |
| number of processes. In a hybrid program, the \openshmem | |
| \ac{PE} number in \LibHandleRef{SHMEM\_TEAM\_WORLD} | |
| and the \ac{MPI} rank in \VAR{MPI\_COMM\_WORLD} of a process can be equal. | |
| This equality, however, may be provided by only some \openshmem and \ac{MPI} | |
| implementations (e.g., if both environments share the same underlying process | |
| manager) and is not portably guaranteed. A portable program should always | |
| use the standard functions in each model, namely, \FUNC{shmem\_team\_my\_pe} in \openshmem | |
| and \FUNC{MPI\_Comm\_rank} in \ac{MPI}, to query the process identification numbers | |
| in each communication environment and manage the mapping between these numbers in the | |
| program when necessary. | |
|
||
|
|
||
| \subsubsection*{Examples} | ||
| \label{subsubsec:interoperability:id:example} | ||
| The following example demonstrates how to manage the mapping between \openshmem | ||
| \ac{PE} numbers and \ac{MPI} ranks in \VAR{MPI\_COMM\_WORLD} in a hybrid \openshmem | ||
| and \ac{MPI} program. | ||
|
|
||
| \lstinputlisting[language={C}, tabsize=2, | ||
| basicstyle=\ttfamily\footnotesize] | ||
| {example_code/hybrid_mpi_mapping_id.c} | ||
|
|
||
| The following example demonstrates an alternative approach for managing the mapping | |
| of process identification numbers in a hybrid program. The program creates a | |
| new \ac{MPI} communicator, named \VAR{shmem\_comm}, that contains all | |
| processes in \VAR{MPI\_COMM\_WORLD} and in which each process has the same \ac{MPI} rank | |
| number as its \openshmem \ac{PE} number. | |
|
|
||
| \lstinputlisting[language={C}, tabsize=2, | ||
| basicstyle=\ttfamily\footnotesize] | ||
| {example_code/hybrid_mpi_mapping_id_shmem_comm.c} | ||
|
|
||
| \subsection{RMA Programming Models} | ||
| \label{subsec:interoperability:rma} | ||
|
|
||
| \openshmem and \ac{MPI} each define similar one-sided communication models; | ||
| however, a portable program should not assume interoperability between these | ||
| models. | ||
| For instance, \openshmem guarantees atomicity only among concurrent \openshmem atomic memory operations | |
| that operate on symmetric data with the same datatype. Accessing the same symmetric | |
| object with \ac{MPI} atomic operations, such as \FUNC{MPI\_Fetch\_and\_op}, may | |
| produce an undefined result. A hybrid program should avoid situations where \ac{MPI} and | |
| \openshmem one-sided operations perform concurrent accesses to the same memory | |
| location; otherwise, the behavior is undefined. | |
|
|
||
| \subsection{Communication Progress} | ||
| \label{subsec:interoperability:progress} | ||
|
|
||
| \openshmem guarantees that communication progresses both with and without | |
| \openshmem calls and requires a software progress mechanism in the implementation | |
| (e.g., a progress thread) when the hardware does not provide asynchronous communication | |
| capabilities (see Section \ref{subsec:progress}). | |
| In \ac{MPI}, however, weak progress semantics apply. That is, | |
| an \ac{MPI} communication call is guaranteed only to complete in finite time. For | |
|
||
| instance, an \FUNC{MPI\_Put} may complete only when the remote process makes an \ac{MPI} | |
| call that internally triggers \ac{MPI} progress, if the underlying hardware | |
| does not support asynchronous communication. A hybrid program | |
| should not assume that the \openshmem library also makes progress for \ac{MPI}. | |
| Instead, the program can explicitly manage the asynchronous communication of \ac{MPI} in | |
| order to prevent deadlock or performance degradation. | |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| #include <stdio.h> | ||
| #include <shmem.h> | ||
| #include <mpi.h> | ||
|
|
||
| int main(int argc, char *argv[]) | ||
| { | ||
| MPI_Init(&argc, &argv); | ||
| shmem_init(); | ||
|
|
||
| int mype = shmem_team_my_pe(SHMEM_TEAM_WORLD); | ||
| int npes = shmem_team_n_pes(SHMEM_TEAM_WORLD); | ||
|
|
||
| /* static: the source of a collective must be a symmetric data object. */ | |
| static int myrank; | |
| MPI_Comm_rank(MPI_COMM_WORLD, &myrank); | |
|
|
||
| int *mpi_ranks = shmem_calloc(npes, sizeof(int)); | ||
|
|
||
| /* Gather every PE's MPI rank; mpi_ranks[i] holds the rank of PE i. */ | |
| shmem_int_collect(SHMEM_TEAM_WORLD, mpi_ranks, &myrank, 1); | |
| if (mype == 0) { | |
|   for (int i = 0; i < npes; i++) | |
|     printf("PE %d's MPI rank is %d\n", i, mpi_ranks[i]); | |
| } | |
|
|
||
| shmem_free(mpi_ranks); | ||
|
|
||
| shmem_finalize(); | ||
| MPI_Finalize(); | ||
|
|
||
| return 0; | ||
| } |
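The table gathered in the example above maps a PE number to an MPI rank. When the opposite lookup is needed, the table can be inverted locally. The following is a minimal sketch in plain C; the function name `build_pe_of_rank` is hypothetical, and the sketch assumes the input is a permutation of 0..npes-1, as it would be for ranks in \VAR{MPI\_COMM\_WORLD}.

```c
#include <stdlib.h>

/* Given mpi_ranks[pe] (the MPI rank of each PE), build the inverse table
 * pe_of_rank[rank]. Returns NULL on allocation failure or if the input is
 * not a permutation of 0..npes-1. The caller frees the result. */
int *build_pe_of_rank(const int *mpi_ranks, int npes)
{
    if (mpi_ranks == NULL || npes <= 0)
        return NULL;

    int *pe_of_rank = malloc((size_t)npes * sizeof *pe_of_rank);
    if (pe_of_rank == NULL)
        return NULL;

    /* Mark every rank as unassigned before filling the table. */
    for (int r = 0; r < npes; r++)
        pe_of_rank[r] = -1;

    for (int pe = 0; pe < npes; pe++) {
        int r = mpi_ranks[pe];
        /* Reject out-of-range or duplicate ranks. */
        if (r < 0 || r >= npes || pe_of_rank[r] != -1) {
            free(pe_of_rank);
            return NULL;
        }
        pe_of_rank[r] = pe;
    }
    return pe_of_rank;
}
```

In the hybrid program, the input would be the `mpi_ranks` array filled by the collect operation; the result is ordinary local memory, so it is released with `free` rather than `shmem_free`.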
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| #include <stdio.h> | ||
| #include <shmem.h> | ||
| #include <mpi.h> | ||
|
|
||
| int main(int argc, char *argv[]) | ||
| { | ||
| MPI_Init(&argc, &argv); | ||
| shmem_init(); | ||
|
|
||
| int mype = shmem_my_pe(); | ||
|
|
||
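| /* Same color for all processes; using the PE number as the key | |
|  * orders the ranks in shmem_comm by PE number. */ | |
| MPI_Comm shmem_comm; | |
| MPI_Comm_split(MPI_COMM_WORLD, 0, mype, &shmem_comm); | |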
|
|
||
| int myrank; | ||
| MPI_Comm_rank(shmem_comm, &myrank); | ||
| printf("PE %d's MPI rank is %d\n", mype, myrank); | ||
|
|
||
| MPI_Comm_free(&shmem_comm); | ||
| shmem_finalize(); | ||
| MPI_Finalize(); | ||
|
|
||
| return 0; | ||
| } |