
Commit b9e0a91

Idate96 committed

Split complete guide into separate sections with improved navigation

- Created getting-started.md (sections 1-3)
- Created data-management.md (section 4)
- Created python-environments.md (sections 5 & 8)
- Created computing-guide.md (sections 6-7)
- Updated mkdocs.yml with hierarchical navigation
- Redesigned index.md with quick access to all sections

1 parent 6ef1da3, commit b9e0a91

13 files changed: +2800 / -23 lines

docs/complete-guide.md

Lines changed: 1331 additions & 3 deletions
Large diffs are not rendered by default.

docs/computing-guide.md

Lines changed: 622 additions & 0 deletions
Large diffs are not rendered by default.

docs/data-management.md

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
# Data Management on Euler

Effective data management is critical when working on the Euler Cluster, particularly for machine learning workflows that involve large datasets and model outputs. This section explains the available storage options and their proper usage.

---

## 📁 Home Directory (`/cluster/home/$USER`)

- **Quota**: 45 GB
- **Inodes**: ~450,000 files
- **Persistence**: Permanent (not purged)
- **Use Case**: Ideal for storing source code, small configuration files, scripts, and lightweight development tools.

---
## ⚡ Scratch Directory (`/cluster/scratch/$USER` or `$SCRATCH`)

- **Quota**: 2.5 TB
- **Inodes**: 1 M
- **Persistence**: Temporary (data is deleted if not accessed for ~15 days)
- **Use Case**: For storing datasets and temporary training outputs.
- **Recommended dataset storage format**: Use **tar/zip/[HDF5](https://www.hdfgroup.org/solutions/hdf5/)/[WebDataset](https://github.com/webdataset/webdataset)** so that a dataset occupies only a few inodes (see the packing sketch below).
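
For example, many small files can be packed into a single archive before being copied to `$SCRATCH`. A minimal sketch, assuming a local dataset directory named `my_dataset/` (the directory name and username are placeholders):

```bash
# Pack the dataset into one archive so it occupies a single inode on scratch
tar -czf my_dataset.tar.gz my_dataset/

# Copy the archive to your scratch space on Euler
scp my_dataset.tar.gz <your_nethz_username>@euler.ethz.ch:/cluster/scratch/<your_nethz_username>/
```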
---
## 📦 Project Directory (`/cluster/project/rsl/$USER`)

- **Quota**: ≤ 75 GB
- **Inodes**: ~300,000
- **Use Case**: Conda environments, software packages

---

## 📂 Work Directory (`/cluster/work/rsl/$USER`)

- **Quota**: ≤ 150 GB
- **Inodes**: ~30,000
- **Use Case**: Saving results, large output files, tar files, Singularity images. Avoid storing too many small files.

> In exceptional cases we can approve more storage space. For this, ask your supervisor to contact `patelm@ethz.ch`.

## 📂 Local Scratch Directory (`$TMPDIR`)

- **Quota**: up to 800 GB
- **Inodes**: Very high
- **Use Case**: Datasets and containers for a training run.

## ❗ Quota Violations

- You will receive an email if you exceed any of the above limits.
- You can type `lquota` in the terminal to check your used storage space for the `Home` and `Scratch` directories.
- To check your usage of the `Project` and `Work` directories, run:

```bash
# Print the report header plus the line for your own user
(head -n 5 && grep -w $USER) < /cluster/work/rsl/.rsl_user_data_usage.txt
(head -n 5 && grep -w $USER) < /cluster/project/rsl/.rsl_user_data_usage.txt
```

Note: this does not show the per-user quota limits enforced by RSL! Refer to the table below for those limits.

### 🎯 FAQ: What is the difference between the `Project` and `Work` directories, and why is it necessary to use both?

Both `Project` and `Work` are persistent storage (the data is not deleted automatically), but their use cases differ. When you have many small files, for example conda environments, store them in the `Project` directory, which allows a higher number of inodes. When you have larger files, such as model checkpoints, Singularity containers, and results, store them in the `Work` directory, which offers more storage capacity.
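
As an illustration only (a minimal sketch; the [Python Environments](python-environments.md) section describes the recommended setup, and the environment name `myenv` is a placeholder), a conda environment can be placed under `Project` with the `--prefix` option:

```bash
# Create the environment under the Project directory instead of the home directory
conda create --prefix /cluster/project/rsl/$USER/envs/myenv python=3.10

# Activate it by its full path
conda activate /cluster/project/rsl/$USER/envs/myenv
```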

### 🎯 FAQ: What is the Local Scratch Directory (`$TMPDIR`)?

Whenever you run a compute job, you can also request a certain amount of local scratch space (`$TMPDIR`), which allocates space on a local hard drive. The main advantage of local scratch is that it sits directly inside the compute node rather than being attached via the network. It is therefore highly recommended to copy your Singularity container and datasets to `$TMPDIR` and use them from there during training. Detailed training workflows are provided later in this guide.
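
As a rough illustration only (the training workflows later in this guide are the reference; the archive name, resource values, and `train.py` are placeholders), a batch job can request local scratch with SLURM's `--tmp` option and stage its data there before training:

```bash
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --gpus=1
#SBATCH --time=04:00:00
#SBATCH --tmp=100G   # request ~100 GB of node-local scratch, exposed as $TMPDIR

# Stage the dataset from network scratch to the node-local disk
cp $SCRATCH/my_dataset.tar.gz $TMPDIR/
tar -xzf $TMPDIR/my_dataset.tar.gz -C $TMPDIR

# Train while reading data from the fast local disk
python train.py --data_dir $TMPDIR/my_dataset
```

Such a script would be submitted with `sbatch`; the [Computing guide](computing-guide.md) covers GPU selection and complete job scripts.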
---

## 📊 Summary Table of Storage Locations

| Storage Location | Max Inodes | Max Size per User | Purged | Recommended Use Case |
|------------------|------------|-------------------|--------|----------------------|
| `/cluster/home/$USER` | ~450,000 | 45 GB | No | Code, config, small files |
| `/cluster/scratch/$USER` | 1 M | 2.5 TB | Yes (not accessed for ~15 days) | Datasets, training data, temporary usage |
| `/cluster/project/rsl/$USER` | ~300,000 | 75 GB | No | Conda envs, software packages |
| `/cluster/work/rsl/$USER` | ~30,000 | 150 GB | No | Large result files, model checkpoints, Singularity containers |
| `$TMPDIR` | Very high | Up to 800 GB | Yes (at end of job) | Training datasets, Singularity images |

---

## 💡 Best Practices

1. **Use the right storage for the right purpose** - Don't waste home directory space on large files
2. **Compress datasets** - Use tar/zip to reduce inode usage
3. **Clean up regularly** - Remove old data from scratch before it's auto-deleted (see the sketch below)
4. **Monitor your usage** - Check quotas regularly with `lquota`
5. **Use `$TMPDIR` for active jobs** - Copy data to local scratch for faster I/O during computation
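
A small sketch for points 3 and 4, assuming the ~15-day purge window described above (the 10-day threshold is only an example):

```bash
# Check your current home and scratch usage against the quotas
lquota

# List files on scratch that have not been accessed for more than 10 days
# (candidates for archiving or deletion before the ~15-day purge)
find $SCRATCH -type f -atime +10
```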

docs/getting-started.md

Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
# Getting Started with Euler

This guide helps new users access and begin working on the **Euler Cluster** at ETH Zurich, specifically for members of the **RSL group (es_hutter)**.

## 📌 Table of Contents

1. [Access Requirements](#access-requirements)
2. [Connecting to Euler via SSH](#connecting-to-euler-via-ssh)
    - [Basic Login](#basic-login)
    - [Setting Up SSH Keys](#setting-up-ssh-keys-recommended)
    - [Using an SSH Config File](#using-an-ssh-config-file)
3. [Verifying Access to the RSL Shareholder Group](#verifying-access-to-the-rsl-shareholder-group)

---

## ✅ Access Requirements

To get access to the cluster, please fill out the following [form](https://forms.gle/UsiGkXUmo9YyNHsH8). If you are a member of RSL, message Manthan Patel directly to be added to the cluster. Access is approved twice a week (on Tuesdays and Fridays).

Before proceeding, make sure you have:

- A valid **nethz username and password** (ETH Zurich credentials)
- Access to a **terminal** (Linux/macOS or Git Bash on Windows)
- (Optional) Some familiarity with command-line tools

---

## 🔐 Connecting to Euler via SSH

You'll connect to Euler using the Secure Shell (SSH) protocol. This allows you to log into a remote machine securely from your local computer.

---

### Basic Login

To log into the Euler cluster, open a terminal and type:

```bash
ssh <your_nethz_username>@euler.ethz.ch
```

Replace `<your_nethz_username>` with your actual ETH Zurich login.

You will be asked to enter your ETH Zurich password. If the login is successful, you'll be connected to a login node on the Euler cluster.

---

### Setting Up SSH Keys (Recommended)

To avoid typing your password every time and to increase security, it is recommended to use SSH key-based authentication.

#### Step-by-Step Instructions:

1. **Generate an SSH key pair** on your local machine (if not already created):

    ```bash
    ssh-keygen -t ed25519 -C "<your_email>@ethz.ch"
    ```

    - Press Enter to accept the default file location (usually `~/.ssh/id_ed25519`).
    - When prompted for a passphrase, you can choose to set one or leave it empty (if you set one, see the ssh-agent sketch after these steps).

2. **Copy your public key to Euler** using this command:

    ```bash
    ssh-copy-id <your_nethz_username>@euler.ethz.ch
    ```

    - You'll be asked to enter your ETH password one last time.
    - This command installs your public key in the `~/.ssh/authorized_keys` file on Euler.

Now you should be able to log in without typing your password.
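
If you did set a passphrase, an SSH agent saves you from retyping it on every connection. A minimal sketch using standard OpenSSH commands (adjust the key path if you chose a different one):

```bash
# Start an agent for the current shell session (if one is not already running)
eval "$(ssh-agent -s)"

# Add your key once; the passphrase is cached for the rest of the session
ssh-add ~/.ssh/id_ed25519
```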
---

### Using an SSH Config File

To make your SSH workflow easier, especially if you frequently access Euler, create or edit the `~/.ssh/config` file on your local machine.

#### Example Configuration:

```sshconfig
Host euler
    HostName euler.ethz.ch
    User <your_nethz_username>
    Compression yes
    ForwardX11 yes
    IdentityFile ~/.ssh/id_ed25519
```

- Replace `<your_nethz_username>` with your actual ETH username.
- Save and close the file.

Now, instead of typing the full SSH command, you can simply connect using:

```bash
ssh euler
```

---

## 🧾 Verifying Access to the RSL Shareholder Group

Once you are logged into the Euler cluster, it's important to confirm that you have been added to the appropriate shareholder group. This ensures you can access the computing resources allocated to your research group (in this case, the RSL group).

---

### 🔍 How to Check Your Group Membership

1. While connected to Euler (after logging in via SSH), run the following command in the terminal:

    ```bash
    my_share_info
    ```

2. If everything is correctly set up, you should see output similar to the following:

    ```
    You are a member of the es_hutter shareholder group on Euler.
    ```

3. This message confirms that you are part of the `es_hutter` group, which is the shareholder group for the RSL lab.

4. Create your user directories for storage using the following commands:

    ```bash
    mkdir /cluster/project/rsl/$USER
    mkdir /cluster/work/rsl/$USER
    ```

---

### ❗ If You Do NOT See This Message:

- Double-check with your supervisor whether you've been added to the group.
- It may take a few hours after being added for the change to propagate.

---

## Next Steps

Once you have verified your access:

- Learn about [Data Management](data-management.md) on Euler
- Set up [Python Environments](python-environments.md)
- Start [Computing](computing-guide.md) with interactive sessions or batch jobs

docs/index.md

Lines changed: 34 additions & 19 deletions
@@ -1,24 +1,39 @@
**Unchanged:**

# RSL Euler Cluster Guide

**Removed:**

## 📚 Documentation

### Complete Guide
**[Open Complete Guide →](complete-guide.md)**

The complete guide contains:
- Access Requirements
- Connecting to Euler via SSH
- Verifying RSL Group Membership
- Data Management on Euler
- Setting Up Miniconda Environments
- Interactive Sessions
- Sample Sbatch Scripts
- Sample Training Workflow
- Container Workflow
- Useful Links

### Other Resources
- **[Container Workflow](container-workflow.md)** - Docker/Singularity detailed guide

**Added:**

## 🚀 Quick Access to All Sections

### 1. Getting Started
**[Access Requirements, SSH Setup, Verification →](getting-started.md)**
- Getting cluster access
- Setting up SSH connection
- Verifying RSL group membership

### 2. Data Management
**[Storage Locations and Quotas →](data-management.md)**
- Home, Scratch, Project, Work directories
- Storage quotas and best practices
- Using local scratch ($TMPDIR)

### 3. Python Environments & ML Training
**[Miniconda Setup and Training Workflows →](python-environments.md)**
- Installing and managing Miniconda
- Creating conda environments
- Complete ML training workflow

### 4. Computing on Euler
**[Interactive Sessions and Batch Jobs →](computing-guide.md)**
- Requesting interactive sessions
- Writing and submitting SLURM job scripts
- GPU selection and multi-GPU training

### 5. Container Workflow
**[Docker/Singularity Guide →](container-workflow.md)**
- Building Docker containers
- Converting to Singularity
- Running containerized jobs

### 📚 Additional Resources
- **[Complete Reference Guide](complete-guide.md)** - All sections in one document

**Unchanged:**

- **[Scripts Library](scripts.md)** - Ready-to-use job scripts
- **[Troubleshooting](troubleshooting.md)** - Common issues and solutions
