Merged
43 commits
e1d5095
Rewrite the interoperability annex
minsii Jan 29, 2019
34c7e86
Update dynamic process creation subsection
minsii Apr 1, 2019
afdc69a
Typo fix and minor word adjustment
minsii Apr 1, 2019
43c5379
Add more details in RMA semantics subsection
minsii Apr 2, 2019
c055ca4
Made a pass by English editor
minsii May 20, 2019
f8ebcfc
Fix function format
minsii Sep 9, 2019
63eef1d
Change query API to shmem_ and move text into separate file
minsii Sep 9, 2019
325957a
Add example code for pe mapping
minsii Sep 10, 2019
348d60b
Minor text adjustment
minsii Sep 10, 2019
c089a75
Simplified version of dynamic process and rma sections
minsii Sep 10, 2019
d754e7a
Do not mention interference in first paragraph
minsii Sep 12, 2019
b179132
interop/mpmd: strong advice to not use dynamic process with shmem
minsii Sep 12, 2019
56db04b
interop/rma: simply ask user to avoid using both RMA models
minsii Sep 25, 2019
ab532fb
interop/progress: mention query api to connect paragraphs
minsii Sep 25, 2019
c57d957
interop/threads: add restriction for mixed thread levels
minsii Sep 25, 2019
1c6b197
interop/id: use sync_all instead of barrier_all in example
minsii Sep 25, 2019
7e1508d
interop/progress: minor text adjustment
minsii Sep 25, 2019
e062450
interop: move interoperability to a separate file
minsii Oct 21, 2019
6c2e7b6
interop/dynamic: delete MPMD in section title
minsii Oct 21, 2019
9cb51e1
interop/threads: adjust text based on f2f meeting feedback
minsii Oct 21, 2019
683f423
interop/id: fix example
minsii Oct 21, 2019
a21c855
interop/rma: adjust text based on f2f meeting feedback
minsii Oct 21, 2019
09a6b84
interop/query: delete note to implementors
minsii Oct 21, 2019
0b801f5
interop/progress: adjust note to implementor
minsii Oct 22, 2019
f8eebf6
interop/query: shorten overview example
minsii Oct 22, 2019
b937c0a
interop/query: add example with MPI progress support
minsii Oct 22, 2019
c4da312
interop: made a pass by English editor
gpieper Oct 22, 2019
192710f
interop/query: header fix and use larger data size in example
minsii Oct 28, 2019
ba758db
interop/dynamic: use MPI to communicate rather than disallow
minsii Oct 28, 2019
4e28a44
interop/threads: adjust text
minsii Oct 28, 2019
d69cf7f
interop/query: fix compiling issue
minsii Oct 28, 2019
310afa7
interop/id: replace PE "identifier" with "number" for consistency
minsii Oct 28, 2019
7723dfa
interop: adjust to use teams API
minsii Oct 28, 2019
fc7a90d
interop/rma: only disable concurrent access to the same location
minsii Oct 28, 2019
2569471
interop: minor text adjustment
minsii Oct 29, 2019
3367f44
interop/rma: clarify one-sided op and undefined behavior
minsii Oct 29, 2019
5608cd0
interop: avoid using "user", use program instead.
minsii Oct 29, 2019
7849b7f
interop: delete query API.
minsii Oct 29, 2019
e3a5aea
interop: use \ac{PE}
minsii Oct 29, 2019
ec2ca5f
interop: use \ac{MPI}
minsii Oct 29, 2019
9d1a322
interop: add reference to section 4.1 (progress definition)
minsii Dec 6, 2019
9f11fd4
interop: new mapping id example based on comm_split
minsii Dec 6, 2019
816b271
interop: adjust text
minsii Dec 9, 2019
4 changes: 3 additions & 1 deletion content/backmatter.tex
Original file line number Diff line number Diff line change
@@ -184,7 +184,9 @@ \chapter{Undefined Behavior in OpenSHMEM}\label{sec:undefined}
\end{longtable}



\color{ForestGreen}
\input{content/interoperability}
\color{black}

\chapter{History of OpenSHMEM}\label{sec:openshmem_history}

174 changes: 174 additions & 0 deletions content/interoperability.tex
@@ -0,0 +1,174 @@
\chapter{Interoperability with Other Programming Models}\label{sec:interoperability}

\openshmem routines may be used in conjunction with the routines of other
communication libraries or parallel languages in the same program. This chapter
describes interoperability with other programming models, clarifies the
undefined behaviors caused by mixed use of different models,
and offers advice to \openshmem library users and developers that may improve the portability
and performance of hybrid programs.


\section{\ac{MPI} Interoperability}

\openshmem and \ac{MPI} are two commonly used parallel programming models for
distributed-memory systems. A program can use both models together
to support various communication patterns efficiently.

A vendor may implement the \openshmem and \ac{MPI} libraries in different ways. For
instance, one may implement both \openshmem and \ac{MPI} as standalone libraries,
each of which allocates and initializes fully isolated communication
resources.
Another approach
is to implement both \openshmem and \ac{MPI} interfaces within the
same software system in order to share a communication resource when possible.

To improve interoperability and portability in \openshmem + \ac{MPI} hybrid
programming, we clarify the relevant semantics in the following subsections.


\subsection{Initialization}
To ensure that a hybrid program runs portably across different vendor
implementations, the \openshmem environment of the program must be initialized by
a call to \FUNC{shmem\_init} or \FUNC{shmem\_init\_thread} and finalized by
a call to \FUNC{shmem\_finalize}; the \ac{MPI} environment of the program must be initialized
by a call to \FUNC{MPI\_Init} or \FUNC{MPI\_Init\_thread} and finalized by a
call to \FUNC{MPI\_Finalize}.

\apiimpnotes{
Portable implementations of \openshmem and \ac{MPI} must ensure that the initialization
calls can be made in an arbitrary order within a program; the same rule also
applies to the finalization calls. A software runtime that utilizes a shared
communication resource for \openshmem and \ac{MPI} communication may maintain an
internal reference counter in order to ensure that the shared resource is
initialized only once and that no shared resource is released until the last
finalization call is made.
}
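
The reference counting mentioned in the note above could be sketched as follows. This is only an illustration under stated assumptions: the names \texttt{shared\_resource\_acquire}, \texttt{shared\_resource\_release}, and \texttt{shared\_resource\_is\_up} are hypothetical and are not defined by \openshmem or \ac{MPI}, and the sketch ignores the locking a real runtime would need if initialization calls could race.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical runtime-internal state. None of these names come from the
 * OpenSHMEM or MPI specifications. */
static int  shared_refcount    = 0;     /* initialization calls still active */
static bool shared_resource_up = false; /* is the shared resource set up?    */

/* Assumed to be called by both the OpenSHMEM and the MPI initialization
 * routines. Returns true only for the call that really sets up the resource. */
bool shared_resource_acquire(void)
{
    if (shared_refcount++ == 0) {
        shared_resource_up = true;      /* first initialization call */
        return true;
    }
    return false;                       /* already initialized: only count */
}

/* Assumed to be called by both finalization routines.
 * Returns true only for the call that really releases the resource. */
bool shared_resource_release(void)
{
    assert(shared_refcount > 0);        /* finalize without a matching init */
    if (--shared_refcount == 0) {
        shared_resource_up = false;     /* last finalization call */
        return true;
    }
    return false;                       /* the other model still needs it */
}

/* Accessor so that callers do not touch the state directly. */
bool shared_resource_is_up(void)
{
    return shared_resource_up;
}
```

Because only the count matters, the sketch behaves the same whichever model is initialized or finalized first, which is the ordering freedom the note asks for.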


\subsection{Dynamic Process Creation}
\label{subsec:interoperability:mpmd}

\ac{MPI} defines a dynamic process model that allows creation of processes after
an \ac{MPI} application has started (e.g., by calling \FUNC{MPI\_Comm\_spawn}) and
connection to independent processes (e.g., through \FUNC{MPI\_Comm\_accept}
and \FUNC{MPI\_Comm\_connect}).
It provides a mechanism to establish communication
between the newly created processes and the existing \ac{MPI} application (see
\ac{MPI} standard version 3.1, Chapter 10).
Unlike \ac{MPI}, \openshmem starts all processes at once and requires all \acp{PE} to
collectively allocate and initialize resources (e.g., symmetric heap) used by
the \openshmem library before any other \openshmem routine may
be called. \openshmem does not support communication with dynamically created
or connected processes. In such a scenario, \ac{MPI} can be used to communicate
with these processes.


\subsection{Thread Safety}
\label{subsec:interoperability:thread}
Both \openshmem and \ac{MPI} define the interaction with user threads in a program
with routines that can be used for initializing and querying the thread
environment. A hybrid program may request different thread levels
at the initialization calls of the \openshmem and \ac{MPI} environments; however, the
support level returned by the \openshmem or \ac{MPI} library might differ
from that returned in a non-hybrid program. For instance, the first
initialization call in a hybrid program may initialize a resource with the
requested thread level, and a subsequent initialization call cannot update
the supported level if the underlying software runtime of \openshmem and \ac{MPI}
shares the same internal communication resource.
The program should always check the \VAR{provided} thread level returned
at the corresponding initialization call or query the level of thread support
after initialization to portably ensure thread support in each communication
environment.

Both \openshmem and \ac{MPI} define similar thread levels, namely, \VAR{THREAD\_SINGLE},
\VAR{THREAD\_FUNNELED}, \VAR{THREAD\_SERIALIZED}, and \VAR{THREAD\_MULTIPLE}.
When requesting threading support in a hybrid program, however,
the following additional rules are applied if the implementations of \openshmem
and \ac{MPI} share the same internal communication resource.
It is strongly recommended to always follow these rules to ensure program
portability.

\begin{itemize}
\item The \VAR{THREAD\_SINGLE} thread level requires a single-threaded program.
Hence, a hybrid program should not request \VAR{THREAD\_SINGLE} at the initialization
call of one model (\openshmem or \ac{MPI}) while requesting a different thread level at the
initialization call of the other model.

\item The \VAR{THREAD\_FUNNELED} thread level allows only the main thread to
make communication calls. A hybrid program using the \VAR{THREAD\_FUNNELED}
thread level in both \openshmem and \ac{MPI} should ensure that the same main thread
is used in both communication environments.

\item The \VAR{THREAD\_SERIALIZED} thread level requires the program to ensure
that communication calls are not made concurrently by multiple threads. If a
hybrid program uses \VAR{THREAD\_SERIALIZED} in one communication environment
and \VAR{THREAD\_SERIALIZED} or \VAR{THREAD\_FUNNELED} in the other one, it
should also guarantee that the \openshmem and \ac{MPI} calls are not made concurrently
from two distinct threads.
\end{itemize}
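
The rules in the list above can be summarized as a small lookup, sketched below. The enumerations and the function \texttt{hybrid\_thread\_rule} are hypothetical stand-ins introduced for illustration only; a real program would compare the thread-level constants of each library. Combinations involving \VAR{THREAD\_MULTIPLE} are reported as having no listed rule because the text above does not state one for them.

```c
#include <assert.h>

/* Hypothetical stand-ins for the thread levels of either model. */
enum thread_level {
    LEVEL_SINGLE,
    LEVEL_FUNNELED,
    LEVEL_SERIALIZED,
    LEVEL_MULTIPLE
};

/* What the program has to guarantee for a given pair of levels. */
enum hybrid_rule {
    RULE_NOT_RECOMMENDED,     /* SINGLE mixed with another level       */
    RULE_SINGLE_THREADED,     /* both SINGLE: program is one thread    */
    RULE_SAME_MAIN_THREAD,    /* both FUNNELED: share the main thread  */
    RULE_NO_CONCURRENT_CALLS, /* SERIALIZED with SERIALIZED/FUNNELED   */
    RULE_NONE_LISTED          /* no additional rule stated in the text */
};

/* Map the levels used for OpenSHMEM and MPI to the rule listed above. */
enum hybrid_rule hybrid_thread_rule(enum thread_level shmem_level,
                                    enum thread_level mpi_level)
{
    assert(shmem_level >= LEVEL_SINGLE && shmem_level <= LEVEL_MULTIPLE);
    assert(mpi_level >= LEVEL_SINGLE && mpi_level <= LEVEL_MULTIPLE);

    if (shmem_level == LEVEL_SINGLE || mpi_level == LEVEL_SINGLE)
        return (shmem_level == mpi_level) ? RULE_SINGLE_THREADED
                                          : RULE_NOT_RECOMMENDED;

    if (shmem_level == LEVEL_FUNNELED && mpi_level == LEVEL_FUNNELED)
        return RULE_SAME_MAIN_THREAD;

    /* Neither level is SINGLE here, and the FUNNELED/FUNNELED pair is
     * already handled, so at least one level is SERIALIZED whenever
     * neither level is MULTIPLE. */
    if (shmem_level != LEVEL_MULTIPLE && mpi_level != LEVEL_MULTIPLE)
        return RULE_NO_CONCURRENT_CALLS;

    return RULE_NONE_LISTED;
}
```

A program could call such a helper once after both initialization calls, passing the provided (not the requested) levels, and abort early on \texttt{RULE\_NOT\_RECOMMENDED}.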

\subsection{Mapping Process Identification Numbers}
\label{subsec:interoperability:id}

Similar to the \ac{PE} number in \openshmem, \ac{MPI} defines rank as the
identification number of a process in a communicator. Both the \openshmem \ac{PE}
and the \ac{MPI} rank are unique integers assigned from zero to one less than the total
number of processes. In a hybrid program, the \openshmem
\ac{PE} number in \LibHandleRef{SHMEM\_TEAM\_WORLD}
and the \ac{MPI} rank in \VAR{MPI\_COMM\_WORLD} of a process can be equal.
This feature, however, may be provided by only some of the \openshmem and \ac{MPI}
implementations (e.g., if both environments share the same underlying process
manager) and is not portably guaranteed. A portable program should always
use the standard functions in each model, namely, \FUNC{shmem\_team\_my\_pe} in \openshmem
and \FUNC{MPI\_Comm\_rank} in \ac{MPI}, to query the process identification numbers
in each communication environment and manage the mapping between these numbers in the
program when necessary.

\subsubsection*{Examples}
\label{subsubsec:interoperability:id:example}
The following example demonstrates how to manage the mapping between \openshmem
\ac{PE} numbers and \ac{MPI} ranks in \VAR{MPI\_COMM\_WORLD} in a hybrid \openshmem
and \ac{MPI} program.

\lstinputlisting[language={C}, tabsize=2,
basicstyle=\ttfamily\footnotesize]
{example_code/hybrid_mpi_mapping_id.c}
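
The table collected in the example above maps a \ac{PE} number to an \ac{MPI} rank. When the opposite lookup is needed, it can be derived locally from the same table. The helper below is a minimal sketch; the name \texttt{build\_pe\_of\_rank} is hypothetical, and it assumes that the table holds each rank from zero to one less than the number of \acp{PE} exactly once.

```c
#include <assert.h>
#include <stddef.h>

/* Build the inverse of the PE -> MPI rank table.
 *
 * mpi_ranks[pe] is the MPI rank of PE `pe` (as gathered by the collect
 * in the example above). On success, pe_of_rank[r] is the PE number of
 * the process whose MPI rank is r.
 *
 * Returns 0 on success and -1 if mpi_ranks is not a permutation of
 * 0 .. npes-1 (out-of-range or duplicated rank). */
int build_pe_of_rank(const int *mpi_ranks, int *pe_of_rank, int npes)
{
    assert(mpi_ranks != NULL && pe_of_rank != NULL && npes > 0);

    for (int r = 0; r < npes; r++)
        pe_of_rank[r] = -1;                 /* mark every rank as unseen */

    for (int pe = 0; pe < npes; pe++) {
        int r = mpi_ranks[pe];
        if (r < 0 || r >= npes || pe_of_rank[r] != -1)
            return -1;                      /* invalid or duplicated rank */
        pe_of_rank[r] = pe;
    }
    return 0;
}
```

Since the collect delivers the full table to every \ac{PE}, each process can build the inverse table without further communication.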

The following example demonstrates an alternative approach for managing the mapping
of process identification numbers in a hybrid program. The program creates a
new \ac{MPI} communicator, named \VAR{shmem\_comm}, that contains all
processes in \VAR{MPI\_COMM\_WORLD} and in which each process has the same \ac{MPI} rank
as its \openshmem \ac{PE} number.

\lstinputlisting[language={C}, tabsize=2,
basicstyle=\ttfamily\footnotesize]
{example_code/hybrid_mpi_mapping_id_shmem_comm.c}

\subsection{RMA Programming Models}
\label{subsec:interoperability:rma}

\openshmem and \ac{MPI} each define similar one-sided communication models;
however, a portable program should not assume interoperability between these
models.
For instance, \openshmem guarantees atomicity only among concurrent \openshmem atomic memory operations
that operate on symmetric data with the same datatype. Accessing the same symmetric
object with \ac{MPI} atomic operations, such as \FUNC{MPI\_Fetch\_and\_op}, may
produce an undefined result. A hybrid program should avoid situations where \ac{MPI} and
\openshmem one-sided operations perform concurrent accesses to the same memory
location; otherwise, the behavior is undefined.

\subsection{Communication Progress}
\label{subsec:interoperability:progress}

\openshmem guarantees the progress of communication both with and without
\openshmem calls and requires a software progress mechanism in the implementation
(e.g., a progress thread) when the hardware does not provide asynchronous communication
capabilities (see Section \ref{subsec:progress}).
\ac{MPI}, however, applies weak progress semantics. That is,
an \ac{MPI} communication call is guaranteed only to complete in finite time. For
instance, if the underlying hardware does not support asynchronous communication,
an \FUNC{MPI\_Put} may complete only when the remote process makes an \ac{MPI}
call that internally triggers the progress of \ac{MPI}. A hybrid program
should not assume that the \openshmem library also makes progress for \ac{MPI}.
Instead, the program can explicitly manage the asynchronous communication of \ac{MPI} in
order to prevent deadlock or performance degradation.
29 changes: 29 additions & 0 deletions example_code/hybrid_mpi_mapping_id.c
@@ -0,0 +1,29 @@
#include <stdio.h>
#include <shmem.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);
  shmem_init();

  int mype = shmem_team_my_pe(SHMEM_TEAM_WORLD);
  int npes = shmem_team_n_pes(SHMEM_TEAM_WORLD);

  /* static: the source buffer of a collect must be symmetric */
  static int myrank;
  MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

  /* mpi_ranks[i] will hold the MPI rank of PE i */
  int *mpi_ranks = shmem_calloc(npes, sizeof(int));

  shmem_int_collect(SHMEM_TEAM_WORLD, mpi_ranks, &myrank, 1);
  if (mype == 0)
    for (int i = 0; i < npes; i++)
      printf("PE %d's MPI rank is %d\n", i, mpi_ranks[i]);

  shmem_free(mpi_ranks);

  shmem_finalize();
  MPI_Finalize();

  return 0;
}
24 changes: 24 additions & 0 deletions example_code/hybrid_mpi_mapping_id_shmem_comm.c
@@ -0,0 +1,24 @@
#include <stdio.h>
#include <shmem.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);
  shmem_init();

  int mype = shmem_my_pe();

  /* Same color (0) keeps all processes in one communicator;
   * using mype as the key orders the new ranks by PE number. */
  MPI_Comm shmem_comm;
  MPI_Comm_split(MPI_COMM_WORLD, 0, mype, &shmem_comm);

  int myrank;
  MPI_Comm_rank(shmem_comm, &myrank);
  printf("PE %d's MPI rank is %d\n", mype, myrank);

  MPI_Comm_free(&shmem_comm);
  shmem_finalize();
  MPI_Finalize();

  return 0;
}