diff --git a/docs/guides/storage.md b/docs/guides/storage.md
index 5a0b4ee2..864e80ff 100644
--- a/docs/guides/storage.md
+++ b/docs/guides/storage.md
@@ -111,7 +111,7 @@ To set up a default so all newly created folders and dirs inside or your desired
 ```
 
 !!! info
-    For more information read the `setfacl` man page: `man setfacl`.
+    For more information read the `setfacl` man page: [`man setfacl`](https://linux.die.net/man/1/setfacl).
 
 [](){#ref-guides-storage-lustre}
 ## Lustre tuning
@@ -127,7 +127,10 @@ The data itself is subdivided in blocks of size `` and is stored by O
 The block size and number of OSTs to use is defined by the striping settings, which are applied to a path, with new files and directories inheriting them from their parent directory.
 The `lfs getstripe ` command can be used to get information on the stripe settings of a path.
 For directories and empty files `lfs setstripe --stripe-count --stripe-size ` can be used to set the layout.
-The simplest way to have the correct layout is to copy to a directory with the correct layout
+
+Striping settings on a directory are only applied to files added after the command is run.
+Existing files retain their original layout unless explicitly changed using `lfs migrate `, which takes the same arguments as `lfs setstripe`.
+The simplest way to have the correct layout is to copy to a directory with the correct layout.
 
 !!! tip "A block size of 4MB gives good throughput, without being overly big..."
     ... so it is a good choice when reading a file sequentially or in large chunks, but if one reads shorter chunks in random order it might be better to reduce the size, the performance will be smaller, but the performance of your application might actually increase.
@@ -135,6 +138,7 @@ The simplest way to have the correct layout is to copy to a directory with the c
 
 !!! example "Settings for large files"
+    *Remember:* Settings only apply to files added to the directory after this command.
     ```console
     lfs setstripe --stripe-count -1 --stripe-size 4M `
     ```
diff --git a/docs/software/communication/nccl-assets/config_v226.sh b/docs/software/communication/nccl-assets/config_v226.sh
new file mode 100644
index 00000000..732c952e
--- /dev/null
+++ b/docs/software/communication/nccl-assets/config_v226.sh
@@ -0,0 +1,48 @@
+export NCCL_VERSION=v2.26.2-1
+export AWS_OFI_NCCL_VERSION=v1.14.1
+export LIBFABRIC_VERSION=v2.2.0
+export NCCL_TEST_VERSION=v2.17.1
+
+# ---------------------------------------------------------------------------
+#
+# Critical Values
+#
+# ---------------------------------------------------------------------------
+
+export NCCL_NET="AWS Libfabric"
+export NCCL_NET_GDR_LEVEL=PHB
+export FI_MR_CACHE_MONITOR=userfaultfd
+export MPICH_GPU_SUPPORT_ENABLED=0
+
+# Enable the "alternative rendezvous configuration" of Slingshot to avoid
+# sporadic, catastrophic drops in performance
+export FI_CXI_RDZV_PROTO=alt_read
+export SBATCH_NETWORK=disable_rdzv_get
+
+# ---------------------------------------------------------------------------
+#
+# Recommended Values
+#
+# ---------------------------------------------------------------------------
+
+export FI_CXI_DEFAULT_CQ_SIZE=131072
+export FI_CXI_DEFAULT_TX_SIZE=32768
+export FI_CXI_DISABLE_HOST_REGISTER=1
+export FI_CXI_RDZV_EAGER_SIZE=0
+
+# ---------------------------------------------------------------------------
+#
+# Debugging Values
+#
+# ---------------------------------------------------------------------------
+
+export NCCL_DEBUG=INFO
+export NCCL_DEBUG_SUBSYS=INIT,BOOTSTRAP,ENV,TUNING
+
+# ---------------------------------------------------------------------------
+#
+# Enable CSCS NCCL Tuning Plugin
+#
+# ---------------------------------------------------------------------------
+
+export NCCL_TUNER_PLUGIN=cscs
diff --git a/docs/software/communication/nccl-assets/nccl-plots-226.pdf b/docs/software/communication/nccl-assets/nccl-plots-226.pdf
new file mode 100644
index 00000000..cf0e37f8
Binary files /dev/null and b/docs/software/communication/nccl-assets/nccl-plots-226.pdf differ
diff --git a/docs/software/communication/nccl-assets/nccl-plots-226.png b/docs/software/communication/nccl-assets/nccl-plots-226.png
new file mode 100644
index 00000000..a22977e6
Binary files /dev/null and b/docs/software/communication/nccl-assets/nccl-plots-226.png differ
diff --git a/docs/software/communication/nccl-assets/nccl_tuner_v226.conf b/docs/software/communication/nccl-assets/nccl_tuner_v226.conf
new file mode 100644
index 00000000..8feff41e
--- /dev/null
+++ b/docs/software/communication/nccl-assets/nccl_tuner_v226.conf
@@ -0,0 +1,19 @@
+all_reduce,0,4194303,tree,simple,-1,2,8
+all_reduce,4194304,33554431,ring,simple,-1,2,8
+all_reduce,33554432,4294967295,tree,simple,-1,2,8
+#
+all_reduce,0,4194303,tree,simple,-1,4,16
+all_reduce,4194304,4294967295,ring,simple,-1,4,16
+#
+all_reduce,0,33554431,tree,simple,-1,8,32
+all_reduce,33554432,4294967295,ring,simple,-1,8,32
+#
+all_reduce,0,67108863,tree,simple,-1,16,64
+all_reduce,67108864,4294967295,ring,simple,-1,16,64
+#
+all_reduce,0,268435455,tree,simple,-1,32,128
+all_reduce,268435456,4294967295,ring,simple,-1,32,128
+#
+all_reduce,536870912,4294967295,ring,simple,-1,64,256
+all_reduce,1073741824,4294967295,ring,simple,-1,128,512
+all_reduce,2147483648,4294967295,ring,simple,-1,256,1024
diff --git a/docs/software/communication/nccl.md b/docs/software/communication/nccl.md
index 6a0068ad..78252109 100644
--- a/docs/software/communication/nccl.md
+++ b/docs/software/communication/nccl.md
@@ -67,3 +67,51 @@ While the container engine sets these automatically when using the NCCL hook, th
 ```
 
 If you only set `NCCL_NET="ofi"`, NCCL may silently fail to load the plugin but fall back to the default implementation.
+
+## Expected performance
+
+This section covers the expected performance behavior of the [NCCL Tests benchmark](https://github.com/NVIDIA/nccl-tests) suite on Alps.
+This information can be used as a reference for comparing with application behavior.
+The [NCCL Stack Constellation Benchmarks](https://github.com/jpcoles-cscs/nccl-stack-constellation-benchmarks) can be used to reproduce this information and also build and run the tests within a user's own environment.
+
+=== "NCCL v2.26"
+    === "Plots"
+        [Download PDF](nccl-assets/nccl-plots-226.pdf)
+        ![NCCL v2.26 benchmark performance](nccl-assets/nccl-plots-226.png)
+    === "Environment Settings"
+        [Download settings](nccl-assets/config_v226.sh)
+        ```bash
+        --8<-- "docs/software/communication/nccl-assets/config_v226.sh"
+        ```
+    === "Tuner parameters"
+        [Download parameters](nccl-assets/nccl_tuner_v226.conf)
+        ```
+        --8<-- "docs/software/communication/nccl-assets/nccl_tuner_v226.conf"
+        ```
+
+=== "NCCL v2.27"
+=== "NCCL v2.28"
+
+## NCCL Tuner Plugin
+
+NCCL has internal logic to choose the most performant communication algorithm for a given collective, message size, number of ranks, and other system characteristics.
+This logic has been optimized for InfiniBand networks and can perform suboptimally on the Slingshot network of Alps.
+
+To achieve the best results, it is necessary to use the NCCL Tuner Plugin alongside a tuner configuration file.
+A modified tuner plugin for Alps is included in a [forked version of NCCL](https://github.com/jpcoles-cscs/nccl).
+The forked repository is only needed for building the tuner and is compatible with versions of NCCL >= 2.24 that support the `ncclTunerPlugin_v4` data structure.
+CSCS has prepared example configuration files for these benchmarks, which can be used as a reference point for application-specific tuning.
+
+To use the CSCS tuner, first download, build, and copy the library to a preferred location:
+```console
+git clone --branch 2.27.7-1-cscs-tuner git@github.com:jpcoles-cscs/nccl.git nccl-tuner-cscs/nccl
+cd nccl-tuner-cscs/nccl/ext-tuner/example
+make
+cp libnccl-tuner-example.so $INSTALL_DIR/libnccl-tuner-cscs.so
+```
+Then point NCCL to the tuner library:
+```bash
+export NCCL_TUNER_PLUGIN=$INSTALL_DIR/libnccl-tuner-cscs.so
+```
+
+
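Note on reading `nccl_tuner_v226.conf`: the rules added in this change can be treated as a lookup table keyed on job size and message size. The sketch below is illustrative only and is not part of the patch, NCCL, or the CSCS tuner. It assumes the columns are `collective,min_bytes,max_bytes,algorithm,protocol,channels,nNodes,nRanks`, an interpretation suggested by the 4-ranks-per-node pattern in the file rather than stated in this diff. The helper name `tuner_lookup` and the `no-match` marker are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "<algorithm> <protocol>" for the first rule that
# matches a collective, node count, rank count and message size in bytes.
# ASSUMED column order (not confirmed by the diff):
#   collective,min_bytes,max_bytes,algorithm,protocol,channels,nNodes,nRanks
tuner_lookup() {
    local conf="$1" coll="$2" nnodes="$3" nranks="$4" bytes="$5"
    awk -F, -v coll="$coll" -v nn="$nnodes" -v nr="$nranks" -v b="$bytes" '
        /^[ \t]*(#|$)/ { next }                 # skip comment and blank lines
        $1 == coll && $7+0 == nn+0 && $8+0 == nr+0 &&
        b+0 >= $2+0 && b+0 <= $3+0 {            # inclusive byte range
            print $4, $5
            found = 1
            exit
        }
        END { if (!found) print "no-match" }    # no rule covers this case
    ' "$conf"
}

# Excerpt of the rules from this change, written to a temporary file.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
all_reduce,0,4194303,tree,simple,-1,2,8
all_reduce,4194304,33554431,ring,simple,-1,2,8
all_reduce,33554432,4294967295,tree,simple,-1,2,8
#
all_reduce,0,4194303,tree,simple,-1,4,16
all_reduce,4194304,4294967295,ring,simple,-1,4,16
EOF

tuner_lookup "$conf" all_reduce 2 8 1048576      # 1 MiB on 2 nodes, 8 ranks
tuner_lookup "$conf" all_reduce 2 8 8388608      # 8 MiB on 2 nodes, 8 ranks
tuner_lookup "$conf" all_reduce 64 256 1048576   # not covered by the excerpt
```

Under the stated column assumption, a lookup like this can help check which rule an application's typical message sizes would hit before comparing against the benchmark plots.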