diff --git a/README.md b/README.md
index 28c92bf7a..ed472d009 100644
--- a/README.md
+++ b/README.md
@@ -1,34 +1,37 @@
-# CAPIO
+# CAPIO: Cross-Application Programmable I/O
 
-CAPIO (Cross-Application Programmable I/O), is a middleware aimed at injecting streaming capabilities to workflow steps
-without changing the application codebase. It has been proven to work with C/C++ binaries, Fortran Binaries, JAVA,
-python and bash.
+CAPIO is a middleware aimed at injecting streaming capabilities into workflow steps
+without changing the application codebase. It has been proven to work with C/C++ binaries, Fortran, Java, Python, and
+Bash.
 
-[![codecov](https://codecov.io/gh/High-Performance-IO/capio/graph/badge.svg?token=6ATRB5VJO3)](https://codecov.io/gh/High-Performance-IO/capio)
-![CI-Tests](https://github.com/High-Performance-IO/capio/actions/workflows/ci-tests.yaml/badge.svg)
-[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://raw.githubusercontent.com/High-Performance-IO/capio/master/LICENSE)
+[![codecov](https://codecov.io/gh/High-Performance-IO/capio/graph/badge.svg?token=6ATRB5VJO3)](https://codecov.io/gh/High-Performance-IO/capio) ![CI-Tests](https://github.com/High-Performance-IO/capio/actions/workflows/ci-tests.yaml/badge.svg) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://raw.githubusercontent.com/High-Performance-IO/capio/master/LICENSE)
 
-> [!IMPORTANT]
-> This version of CAPIO does not support writes to memory.
-> If you need it please refer to releases/v1.0.0
+> [!TIP]
+> CAPIO is now multibackend and dynamic by nature: you do not need MPI to benefit from the in-memory IO improvements!
+> Just use an MTCL-provided backend if you want in-memory IO, or fall back to the file system backend (default) if
+> you just want to coordinate IO operations between workflow steps!
 
-## Build and run tests
+
+---
+
+## 🔧 Build and Run Tests
 
 ### Dependencies
 
-CAPIO depends on the following software that needs to be manually installed:
+**Required manually:**
 
-- `cmake >=3.15`
-- `c++17` or newer
+- `cmake >= 3.15`
+- `C++17`
 - `pthreads`
 
-The following dependencies are automatically fetched during cmake configuration phase, and compiled when required.
+**Fetched/compiled during configuration:**
 
-- [syscall_intercept](https://github.com/pmem/syscall_intercept) to intercept syscalls
-- [Taywee/args](https://github.com/Taywee/args) to parse server command line inputs
-- [simdjson/simdjson](https://github.com/simdjson/simdjson) to parse json configuration files
+- [syscall_intercept](https://github.com/pmem/syscall_intercept) - Intercepts and handles Linux system calls
+- [Taywee/args](https://github.com/Taywee/args) - Parses user input arguments
+- [simdjson/simdjson](https://github.com/simdjson/simdjson) - Fast JSON parsing
+- [MTCL](https://github.com/ParaGroup/MTCL) - Provides abstractions over multiple communication backends
 
-### Compile capio
+### Compile CAPIO
 
 ```bash
 git clone https://github.com/High-Performance-IO/capio.git capio && cd capio
@@ -38,103 +41,118 @@ cmake --build . -j$(nproc)
 sudo cmake --install .
 ```
 
-It is also possible to enable log in CAPIO, by defining `-DCAPIO_LOG=TRUE`.
+To enable logging support, pass `-DCAPIO_LOG=TRUE` during the CMake configuration phase.
+
+---
+
+## 🧑‍💻 Using CAPIO in Your Code
+
+Good news! You **don’t need to modify your application code**. Just follow these steps:
+
+### 1. Create a Configuration File *(optional but recommended)*
+
+Write a CAPIO-CL configuration file to inject streaming into your workflow. Refer to
+the [CAPIO-CL Docs](https://capio.hpc4ai.it/docs/coord-language/) for details.
+
+### 2. Launch the Workflow with CAPIO
 
-## Use CAPIO in your code
+To launch your workflow with CAPIO, you can follow one of two routes:
 
-Good news! You don't need to modify your code to benefit from the features of CAPIO. You have only to do three steps (
-the first is optional).
+#### A) Use `capiorun` for simplified operations
 
-1) Write a configuration file for injecting streaming capabilities to your workflow
+You can simplify the execution of workflow steps with CAPIO using the `capiorun` utility. See the
+[`capiorun` documentation](capiorun/readme.md) for usage and examples. `capiorun` provides an easier way to manage
+daemon startup and environment preparation, so users do not need to prepare the environment manually.
 
-2) Launch the CAPIO daemons with MPI passing the (eventual) configuration file as argument on the machines in which you
-   want to execute your program (one daemon for each node). If you desire to specify a custom folder
-   for capio, set `CAPIO_DIR` as a environment variable.
-   ```bash
-   [CAPIO_DIR=your_capiodir] capio_server -c conf.json
-   ```
+#### B) Manually launch CAPIO
 
-> [!NOTE]
-> if `CAPIO_DIR` is not specified when launching capio_server, it will default to the current working directory of
-> capio_server.
+Launch the CAPIO daemons, starting one daemon per node. Optionally set `CAPIO_DIR` to define the CAPIO mount point:
 
-3) Launch your programs preloading the CAPIO shared library like this:
-   ```bash
-   CAPIO_DIR=your_capiodir \
-   CAPIO_WORKFLOW_NAME=wfname \
-   CAPIO_APP_NAME=appname \
-   LD_PRELOAD=libcapio_posix.so \
-   ./your_app
-   ```
+```bash
+[CAPIO_DIR=your_capiodir] capio_server -c conf.json
+```
+
+> [!CAUTION]
+> If `CAPIO_DIR` is not set, it defaults to the current working directory.
+
+You can now start your application. Set the required environment variables and set `LD_PRELOAD` to the
+`libcapio_posix.so` intercepting library:
+
+```bash
+CAPIO_DIR=your_capiodir \
+CAPIO_WORKFLOW_NAME=wfname \
+CAPIO_APP_NAME=appname \
+LD_PRELOAD=libcapio_posix.so \
+./your_app
+```
 
-> [!WARNING]
-> `CAPIO_DIR` must be specified when launching a program with the CAPIO library. if `CAPIO_DIR` is not specified, CAPIO
-> will not intercept syscalls.
+> [!CAUTION]
+> If `CAPIO_APP_NAME` and `CAPIO_WORKFLOW_NAME` are not set (or are set but do not match the values present in the
+> CAPIO-CL configuration file), CAPIO will not be able to operate correctly!
 
-### Available environment variables
+---
 
-CAPIO can be controlled through the usage of environment variables. The available variables are listed below:
+## ⚙️ Environment Variables
 
-#### Global environment variable
+### 🔄 Global
 
-- `CAPIO_DIR` This environment variable tells to both server and application the mount point of capio;
-- `CAPIO_LOG_LEVEL` this environment tells both server and application the log level to use. This variable works only
-  if `-DCAPIO_LOG=TRUE` was specified during cmake phase;
-- `CAPIO_LOG_PREFIX` This environment variable is defined only for capio_posix applications and specifies the prefix of
-  the logfile name to which capio will log to. The default value is `posix_thread_`, which means that capio will log by
-  default to a set of files called `posix_thread_*.log`. An equivalent behaviour can be set on the capio server using
-  the `-l` option;
-- `CAPIO_LOG_DIR` This environment variable is defined only for capio_posix applications and specifies the directory
-  name to which capio will be created. If this variable is not defined, capio will log by default to `capio_logs`. An
-  equivalent behaviour can be set on the capio server using the `-d` option;
-- `CAPIO_CACHE_LINE_SIZE`: This environment variable controls the size of a single cache line. defaults to 256KB;
+| Variable                | Description                                        |
+|-------------------------|----------------------------------------------------|
+| `CAPIO_DIR`             | Shared mount point for server and application      |
+| `CAPIO_LOG_LEVEL`       | Logging level (requires `-DCAPIO_LOG=TRUE`)        |
+| `CAPIO_LOG_PREFIX`      | Log file name prefix (default: `posix_thread_`)    |
+| `CAPIO_LOG_DIR`         | Directory for log files (default: `capio_logs`)    |
+| `CAPIO_CACHE_LINE_SIZE` | Size of a single CAPIO cache line (default: 256KB) |
 
-#### Server only environment variable
+### 🖥️ Server-Only
 
-- `CAPIO_METADATA_DIR`: This environmental variable controls the location of the metadata files used by CAPIO. it
-  defaults to CAPIO_DIR. BE CAREFUL to put this folder on a path that is accessible by all instances of the running
-  CAPIO servers.
+| Variable             | Description                                                                                              |
+|----------------------|----------------------------------------------------------------------------------------------------------|
+| `CAPIO_METADATA_DIR` | Directory for metadata files. Defaults to `CAPIO_DIR`. Must be accessible by all CAPIO server instances. |
 
-#### Posix only environment variable
+### 📁 POSIX-Only (Mandatory)
 
-> [!WARNING]
-> The following variables are mandatory. If not provided to a posix, application, CAPIO will not be able to correctly
-> handle the
-> application, according to the specifications given from the json configuration file!
+> ⚠️ These are required by CAPIO-POSIX. Without them, your app will not behave as configured in the JSON file.
 
-- `CAPIO_WORKFLOW_NAME`: This environment variable is used to define the scope of a workflow for a given step. Needs to
-  be the same one as the field `"name"` inside the json configuration file;
-- `CAPIO_APP_NAME`: This environment variable defines the app name within a workflow for a given step;
+| Variable              | Description                                     |
+|-----------------------|-------------------------------------------------|
+| `CAPIO_WORKFLOW_NAME` | Must match `"name"` field in your configuration |
+| `CAPIO_APP_NAME`      | Name of the step within your workflow           |
 
-## How to inject streaming capabilities into your workflow
+---
 
-You can find documentation as well as examples, on the official documentation page at
+## 📖 Extended Documentation
 
-[Official documentation website](https://capio.hpc4ai.it/docs)
+Documentation and examples are available on the official site:
+🌐 [https://capio.hpc4ai.it/docs](https://capio.hpc4ai.it/docs)
 
-## Report bugs + get help
+---
 
-[Create a new issue](https://github.com/High-Performance-IO/capio/issues/new)
+## 🐞 Report Bugs & Get Help
 
-[Get help](capio.hpc4ai.it/docs)
+- [Create an issue](https://github.com/High-Performance-IO/capio/issues/new)
+- [Official Documentation](https://capio.hpc4ai.it/docs)
 
+---
 
-## CAPIO Team
+## 👥 CAPIO Team
 
-Made with :heart: by:
+Made with ❤️ by:
 
-Marco Edoardo Santimaria (Designer and maintainer) \
-Iacopo Colonnelli (Workflows expert and maintainer) \
-Massimo Torquati (Designer) \
-Marco Aldinucci (Designer)
+- Marco Edoardo Santimaria – (Designer & Maintainer)
+- Iacopo Colonnelli – (Workflow Support & Maintainer)
+- Massimo Torquati – (Designer)
+- Marco Aldinucci – (Designer)
 
-Former members:
+**Former Members:**
 
-Alberto Riccardo Martinelli (designer and maintainer)
+- Alberto Riccardo Martinelli – (Designer & Maintainer)
 
-## Papers
+---
 
-[![CAPIO](https://img.shields.io/badge/CAPIO-10.1109/HiPC58850.2023.00031-red)]([https://arxiv.org/abs/2206.10048](https://dx.doi.org/10.1109/HiPC58850.2023.00031))
+## 📚 Publications
+[![CAPIO](https://img.shields.io/badge/CAPIO-10.1109/HiPC58850.2023.00031-red)](https://dx.doi.org/10.1109/HiPC58850.2023.00031)
+[![CAPIO-CL](https://img.shields.io/badge/CAPIO--CL-10.1007%2Fs10766--025--00789--0-green?style=flat&logo=readthedocs)](https://doi.org/10.1007/s10766-025-00789-0)
\ No newline at end of file
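
Review note on the manual launch step: the `CAPIO_*` variables and `LD_PRELOAD` only reach `./your_app` when they are part of the same command (or are exported). The sketch below illustrates the difference, using `env` as a stand-in for a real application. The values `/tmp/capio`, `wfname`, and `appname` are placeholders, and `LD_PRELOAD` is deliberately left out so the snippet runs on a machine without CAPIO installed.

```shell
# Placeholder values only; nothing here comes from a real CAPIO configuration.
unset CAPIO_DIR CAPIO_WORKFLOW_NAME CAPIO_APP_NAME

# 1) An assignment on its own line creates a plain shell variable;
#    child processes do not inherit it unless it is exported.
CAPIO_DIR=/tmp/capio
separate=$(env | grep -c '^CAPIO_DIR=' || true)

# 2) Assignments prefixed to a command are placed in that command's environment.
prefixed=$(CAPIO_DIR=/tmp/capio CAPIO_WORKFLOW_NAME=wfname CAPIO_APP_NAME=appname \
    env | grep -c -E '^CAPIO_(DIR|WORKFLOW_NAME|APP_NAME)=')

echo "separate lines: $separate variable(s) visible to the child"
echo "same command:   $prefixed variable(s) visible to the child"
```

In a real launch, `env` would be replaced by the application binary and `LD_PRELOAD=libcapio_posix.so` added to the same command prefix, as the README describes.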