Merged
2 changes: 0 additions & 2 deletions CMakeLists.txt
@@ -80,9 +80,7 @@ endif ()
#------------------------------------------------------------------------------
# Build options
#------------------------------------------------------------------------------
# option(ENABLE_DYAD_DEBUG "Include debugging prints and logging" OFF) # This is not needed as we have CMAKE_BUILD_TYPE
# This is verbose, maybe an alternate might help simplify
#option(BUILD_URPC "Build DYAD's URPC code" OFF)
#option(ENABLE_PERFFLOW "Build with PerfFlow Aspect support" OFF)
#option(ENABLE_UCX_DTL "Build DYAD's UCX data transport layer" OFF)

27 changes: 21 additions & 6 deletions README.md
@@ -1,11 +1,26 @@
DYAD: DYnamic and Asynchronous Data Streamliner

DYAD aims to help sharing data files between producer and consumer job elements,
especially within an ensemble or between co-scheduled ensembles.
DYAD provides the service by two components: a FLUX module and a I/O wraper set.
DYAD transparently synchronizes file I/O between producer and consumer, and
transfers data from the producer location to the consumer location managed by the service.
Users only need to use the file path that is under the directory managed by the service.
DYAD aims to facilitate data file sharing between producer and consumer job elements, particularly within an ensemble or across co-scheduled ensembles.

DYAD delivers this functionality through two components: a FLUX module that provides the service and a set of I/O wrappers for client-side integration.

DYAD transparently synchronizes file access at the file level (rather than the byte level) between producers and consumers, and manages data transfer from the producer’s location to the consumer’s location.

Users simply access files via paths located under the directory managed by the DYAD service.

### Documentation
For further information, build the documentation under `docs` and refer to it:

```
cd docs
python3 -m venv .venv
source .venv/bin/activate
pip install "Sphinx<7.0.0" myst-parser rst2pdf
make html
make pdf
```
Then open `index.html` under `_build/html` or `DYAD.pdf` under `_build/pdf`.


### License

4 changes: 2 additions & 2 deletions cmake/modules/SetupCompiler.cmake
@@ -96,11 +96,11 @@ endmacro()

dyad_add_cxx_flags(CMAKE_CXX_FLAGS
-Wall -Wextra -pedantic -Wno-unused-parameter -Wnon-virtual-dtor
-Wno-deprecated-declarations)
-Wno-deprecated-declarations -Wno-nonnull-compare)

dyad_add_c_flags(CMAKE_C_FLAGS
-Wall -Wextra -pedantic -Wno-unused-parameter
-Wno-deprecated-declarations)
-Wno-deprecated-declarations -Wno-nonnull-compare)

if (${GLIBC_VERSION} VERSION_GREATER_EQUAL "2.19")
# to suppress usleep() warning
11 changes: 11 additions & 0 deletions docs/SCA-HPCAsia26_tutorial.rst
@@ -0,0 +1,11 @@
**************************************************************************
SCA/HPCAsia 2026 · The SupercomputingAsia (SCA) Tutorial: January 26, 2026
**************************************************************************


.. toctree::
:maxdepth: 1

demos/SCA26/instruction

Material for the DYAD tutorial at SCA26 can be found under ``docs/demos/SCA26``.
Binary file added docs/_static/Paper_2024_SBACPAD_DYAD.pdf
Binary file not shown.
8 changes: 8 additions & 0 deletions docs/conf.py
@@ -29,8 +29,16 @@
# ones.
extensions = [
"sphinx.ext.autosectionlabel",
'myst_parser',
'rst2pdf.pdfbuilder',
]

# This line explicitly tells Sphinx which parser to use for each extension
source_suffix = {
'.rst': 'restructuredtext',
'.md': 'markdown',
}

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

96 changes: 96 additions & 0 deletions docs/debugging.rst
@@ -0,0 +1,96 @@
====================================
Common Tips for Debugging with DYAD
====================================

Debugging the distributed operation of multiple coordinated jobs under a batch system is quite challenging. Here are several tips.


Build DYAD for Debugging
========================

To facilitate debugging, DYAD provides several CMake options that can be enabled
at build time.

- **For users:** Enable DYAD logging support:

::

-DDYAD_LOGGER=FLUX|CPP_LOGGER -DDYAD_LOGGER_LEVEL=Debug

- **For developers:** Treat all compiler warnings as errors:

::

-DDYAD_WARNINGS_AS_ERRORS=ON

- **For developers:** Use Clang with AddressSanitizer as needed:

::

-DCMAKE_C_COMPILER=clang
-DCMAKE_CXX_COMPILER=clang++
-DCMAKE_BUILD_TYPE=Debug


Runtime Logging
===============

Enable Flux logging when starting an instance to capture DYAD logs:

::

flux start -v -o,-S,log-filename=out.txt


Controlling Job Standard I/O
============================

Flux job-related options can be used to control standard I/O behavior (see
`flux-run <https://flux-framework.readthedocs.io/projects/flux-core/en/latest/man1/flux-run.html>`_):

- Disable output buffering:

::

-u, --unbuffered

- Label output by rank:

::

-l, --label-io

- Redirect job output streams:

::

--output=, --error=, --log=, --log-stderr=

- Use
`mustache templates <https://flux-framework.readthedocs.io/projects/flux-core/en/latest/man1/flux-submit.html#mustache-templates>`_
for fine-grained control of output.


Simulated Multi-Node Debugging
==============================

Use a single node with a simulated multi-node setup via
``flux start --test-size=N``. In this configuration, DYAD should use different
managed paths to mimic operations on distinct nodes.
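
A minimal sketch of preparing distinct managed paths for a single-node test follows. It only creates one directory per role and prints matching environment settings. The helper names, the directory layout, and the variable names ``DYAD_PATH_PRODUCER`` and ``DYAD_PATH_CONSUMER`` are assumptions made for illustration and should be checked against the DYAD documentation; only ``DYAD_KVS_NAMESPACE`` appears in this guide.

```python
import os
import tempfile


def make_managed_dirs(base_dir, roles=("producer", "consumer")):
    """Create one directory per role under base_dir and return {role: path}."""
    paths = {}
    for role in roles:
        path = os.path.join(base_dir, role)
        os.makedirs(path, exist_ok=True)
        paths[role] = path
    return paths


def dyad_env(paths, namespace):
    """Build environment settings for a test run.

    DYAD_PATH_PRODUCER and DYAD_PATH_CONSUMER are assumed names, not
    confirmed by this guide; adjust them to your DYAD version.
    """
    return {
        "DYAD_KVS_NAMESPACE": namespace,
        "DYAD_PATH_PRODUCER": paths["producer"],
        "DYAD_PATH_CONSUMER": paths["consumer"],
    }


if __name__ == "__main__":
    # Hypothetical base directory and namespace name for a local test.
    base = tempfile.mkdtemp(prefix="dyad_test_")
    settings = dyad_env(make_managed_dirs(base), "dyad_debug")
    for name, value in settings.items():
        print(f"export {name}={value}")
```

Keeping the producer and consumer directories separate means that a file appearing on the consumer side had to be transferred there, rather than being visible simply because both sides share one local directory.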


Common Debugging Steps
======================

When isolating errors in DYAD-enabled applications, the following steps are
recommended:

- Verify environment variable propagation by running a script that prints all
DYAD-related environment variables in place of a DYAD job.
- Ensure environment variables are set consistently between producers and consumers.
- Confirm that ``DYAD_KVS_NAMESPACE`` is set and that the namespace exists in the KVS, for example with ``flux kvs namespace list``.
- Clear any namespaces or files left over from previous runs.
- Inspect logging output to identify where a DYAD consumer may be hanging or where
a DYAD job may have crashed.
- Inspect `KVS <https://flux-framework.readthedocs.io/projects/flux-core/en/latest/man1/flux-kvs.html>`_ entries at both the producer and consumer as needed, for example with ``flux kvs dir -N ${DYAD_KVS_NAMESPACE} [key]``.
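
The first two steps above can be sketched as a small script that is launched in place of the DYAD job. It assumes that DYAD-related variables share the ``DYAD_`` prefix, which is an assumption generalized from ``DYAD_KVS_NAMESPACE``; the function names are hypothetical.

```python
import os


def dyad_variables(environ=None, prefix="DYAD_"):
    """Return the variables whose names start with prefix, sorted by name.

    The DYAD_ prefix is an assumption; adjust it if your setup differs.
    """
    env = os.environ if environ is None else environ
    return {k: v for k, v in sorted(env.items()) if k.startswith(prefix)}


def report(environ=None):
    """Return printable lines, with a warning if the namespace is unset."""
    found = dyad_variables(environ)
    lines = [f"{name}={value}" for name, value in found.items()]
    if "DYAD_KVS_NAMESPACE" not in found:
        lines.append("WARNING: DYAD_KVS_NAMESPACE is not set")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Running the script once as the producer and once as the consumer, with the same launch options as the real job, and comparing the two outputs shows whether the settings reached both sides consistently.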
