From d7f62bb588d6dd0100e2ba46af8414a56589c35c Mon Sep 17 00:00:00 2001
From: Jenny
Date: Wed, 15 Apr 2026 18:18:32 -0400
Subject: [PATCH 1/3] Update getting started page with pixi-based setup

Rewrite the getting started notebook to provide a step-by-step guide for
new users (install SoS, install pixi, clone the repo, run a pipeline).
Replace the Singularity/Docker container workflow with pixi-managed
environments via the StatFunGen pixi-setup installer, and link to the
Wang Lab pixi documentation. Keep the content platform-agnostic rather
than tied to a specific HPC cluster.

Co-Authored-By: Claude Opus 4.6
---
 code/xqtl_protocol_demo.ipynb | 127 +++++++++++++++++++++++++++++++---
 1 file changed, 116 insertions(+), 11 deletions(-)

diff --git a/code/xqtl_protocol_demo.ipynb b/code/xqtl_protocol_demo.ipynb
index 8b1bc7a6c..9c28720ce 100644
--- a/code/xqtl_protocol_demo.ipynb
+++ b/code/xqtl_protocol_demo.ipynb
@@ -10,7 +10,98 @@
    "source": [
     "# Illustration of xQTL protocol\n",
     "\n",
-    "This notebook illustrates the computational protocols available from this repository for the detection and analysis of molecular QTLs (xQTLs). A minimal toy data-set consisting of 49 de-identified samples are used for the analysis."
+    "This notebook illustrates the computational protocols available from this repository for the detection and analysis of molecular QTLs (xQTLs). A minimal toy data-set consisting of 49 de-identified samples is used for the analysis.\n",
+    "\n",
+    "The sections below provide a **step-by-step guide** for new users to set up the software environment, clone the protocol repository, and run their first xQTL analysis (for example, generating xQTLs with TensorQTL). These instructions are intended to be portable — they work on a personal Linux/macOS workstation as well as on any HPC cluster (SLURM, LSF, SGE, etc.). 
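As a quick check that a fresh shell meets the baseline these steps assume (git and curl, plus conda for the SoS install), a small illustrative loop (not part of the protocol itself) can report what is already available:

```bash
# Report which of the assumed baseline tools are already on PATH.
for tool in git curl conda; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool (install it before continuing)"
  fi
done
```

Anything reported missing should be installed, or loaded through your cluster's module system, before moving on.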
Nothing below is tied to a specific institution's cluster.\n",
+    "\n",
+    "## Step-by-step getting started guide\n",
+    "\n",
+    "The workflow below gets a new user from a fresh shell to running a pipeline in four steps:\n",
+    "\n",
+    "1. **Install SoS** (Script of Scripts) in a conda environment — this is the workflow engine that drives the xQTL pipelines.\n",
+    "2. **Install the xQTL project's software stack with pixi** — this replaces the previous Singularity/Docker container workflow.\n",
+    "3. **Clone the `xqtl-protocol` repository**.\n",
+    "4. **Run an example pipeline** (e.g. TensorQTL) using the toy dataset.\n",
+    "\n",
+    "### 1. Install SoS in a conda environment\n",
+    "\n",
+    "Follow the official SoS instructions for a conda-based installation: <https://vatlab.github.io/sos-docs/>.\n",
+    "\n",
+    "A typical minimal install looks like:\n",
+    "\n",
+    "```bash\n",
+    "conda create -n sos -c conda-forge sos sos-pbs sos-bash sos-python sos-r sos-notebook jupyterlab-sos -y\n",
+    "conda activate sos\n",
+    "```\n",
+    "\n",
+    "You only need to do this once per machine / user account.\n",
+    "\n",
+    "### 2. Install the xQTL software stack with pixi\n",
+    "\n",
+    "We no longer use Singularity/Docker containers as the default software environment. Instead, the xQTL protocol's dependencies (R, Python, TensorQTL, bcftools, plink, etc.) are installed through [pixi](https://pixi.sh/) using the StatFunGen `pixi-setup` installer. This gives you a self-contained, reproducible environment that works on any Linux/macOS system.\n",
+    "\n",
+    "For the full, up-to-date installation guide (including options, troubleshooting, and advanced configuration), see the Wang Lab documentation:\n",
+    "\n",
+    "- *Advanced Software Setup with Pixi* (the pixi guide on [wanggroup.org](https://wanggroup.org/))\n",
+    "- <https://wanggroup.org/hpc/docs/software-setup-conda> — *Software environment setup overview*\n",
+    "\n",
+    "**Choose an install location with enough free space.** On shared systems (HPC, lab servers), home directories often have small quotas and inode limits that a full pixi environment (≈7–35 GB, 100k–350k files) will exhaust. 
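To vet a candidate location before installing, check both free space and free inodes at the path. The helper below is an illustrative sketch; the name `check_pixi_space` and the 40 GB default are assumptions based on the sizes quoted above, not part of the installer:

```bash
# Print free space and free inodes at a path, and warn when free space
# is below what a full pixi install (~35 GB) comfortably needs.
check_pixi_space() {
  local path="$1" need_gb="${2:-40}"
  local avail_kb avail_inodes avail_gb
  avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')     # KB available
  avail_inodes=$(df -Pi "$path" | awk 'NR==2 {print $4}') # inodes free
  avail_gb=$((avail_kb / 1024 / 1024))
  echo "free space: ${avail_gb} GB, free inodes: ${avail_inodes}"
  if [ "$avail_gb" -lt "$need_gb" ]; then
    echo "warning: ${path} has less than ${need_gb} GB free"
  fi
}

check_pixi_space "$HOME"
```

`df -P` gives POSIX-stable columns; some filesystems (e.g. btrfs) report inode counts as 0, in which case only the space figure is meaningful.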
Pick a path on a larger filesystem — a lab/project directory or scratch space. On a personal laptop, the default `$HOME/.pixi` is fine.\n",
+    "\n",
+    "**Point pixi at that location** by temporarily overriding `HOME` and prepending pixi to `PATH` before running the installer. Replace `/your_pixi_install_path` with the directory you chose:\n",
+    "\n",
+    "```bash\n",
+    "# Direct pixi to install into a location with enough space\n",
+    "export HOME=\"/your_pixi_install_path\"\n",
+    "\n",
+    "# Make the pixi binary discoverable for this shell\n",
+    "export PATH=\"/your_pixi_install_path/.pixi/bin:$PATH\"\n",
+    "```\n",
+    "\n",
+    "**On HPC clusters**, the installer can be memory-intensive — run it from an interactive compute node (for example, request ≥50 GB of memory) rather than the login node.\n",
+    "\n",
+    "**Run the installer.** This downloads and installs pixi along with the StatFunGen-curated set of environments used by the xQTL protocol:\n",
+    "\n",
+    "```bash\n",
+    "curl -fsSL https://raw.githubusercontent.com/StatFunGen/pixi-setup/refs/heads/main/pixi-setup.sh | bash\n",
+    "```\n",
+    "\n",
+    "The installer will prompt you for the installation path and for an installation type (`minimal` vs. `full`). Choose `full` if you plan to run the complete xQTL protocol, since it includes TensorQTL, bcftools, plink, Seurat, and the Bioconductor packages the pipeline relies on.\n",
+    "\n",
+    "After installation, restart your shell (or `source ~/.bashrc`) so that `pixi` is on your `PATH`.\n",
+    "\n",
+    "To install additional packages later:\n",
+    "\n",
+    "```bash\n",
+    "# Add a Python package\n",
+    "pixi global install -c conda-forge --environment python <package_name>\n",
+    "\n",
+    "# Add an R package\n",
+    "pixi global install -c conda-forge --environment r-base r-<package_name>\n",
+    "```\n",
+    "\n",
+    "### 3. 
Clone the xqtl-protocol repository\n",
+    "\n",
+    "```bash\n",
+    "git clone https://github.com/StatFunGen/xqtl-protocol.git\n",
+    "cd xqtl-protocol\n",
+    "```\n",
+    "\n",
+    "The `code/` directory contains the SoS notebooks that implement each mini-protocol. The `pipeline/` directory contains the underlying SoS workflows called by those notebooks.\n",
+    "\n",
+    "### 4. Run an example pipeline\n",
+    "\n",
+    "With the `sos` conda environment activated and `pixi` on your `PATH`, you can invoke any of the mini-protocols documented on this website. For example, to run TensorQTL on the toy dataset:\n",
+    "\n",
+    "```bash\n",
+    "conda activate sos\n",
+    "sos run pipeline/TensorQTL.ipynb cis \\\n",
+    "    --genotype-file /path/to/genotype.bed \\\n",
+    "    --phenotype-file /path/to/phenotype.bed.gz \\\n",
+    "    --covariate-file /path/to/covariates.tsv \\\n",
+    "    --cwd output/\n",
+    "```\n",
+    "\n",
+    "See the [Command Generator](https://statfungen.github.io/xqtl-protocol/code/commands_generator/eQTL_analysis_commands.html) for a ready-to-run set of commands that covers the full pipeline end-to-end.\""
   ]
  },
  {
@@ -106,13 +197,22 @@
    "id": "complete-extent",
    "metadata": {},
    "source": [
-    "## Software environment: use Singularity containers\n",
+    "## Software environment: pixi-managed environments\n",
+    "\n",
+    "Analysis documented on this website uses software installed through [pixi](https://pixi.sh/) via the StatFunGen [`pixi-setup`](https://github.com/StatFunGen/pixi-setup) installer. See the *Step-by-step getting started guide* above for installation, and the Wang Lab documentation for more detail: <https://wanggroup.org/hpc/docs/software-setup-conda>.\n",
+    "\n",
-    "Analysis documented on this website are best performed using containers we provide either through `singularity` (recommended) or `docker`, via the `--container` option pointing to a container image file. 
For example, `--container oras://ghcr.io/statfungen/tensorqtl_apptainer:latest` uses a singularity image to perform analysis for QTL association mapping via software `TensorQTL`. If you drop the `--container` option then you will rely on software installed on your computer to perform the analysis. \n",
+    "Once `pixi` is installed and on your `PATH`, the pipelines will pick up the tools they need (TensorQTL, bcftools, plink, R packages, etc.) directly — there is no `--container` flag to pass and no Singularity/Docker image to download. If you previously used the `--container oras://ghcr.io/statfungen/...` option, it is no longer required.\n",
+    "\n",
-    "#### Troubleshooting\n",
+    "### Troubleshooting\n",
+    "\n",
-    "If you run into errors relating to R libraries while including the `--container` option then you may need to unload your R packages locally before running the sos commands. For example, this error:\n",
+    "If you run into errors loading R libraries when the pipeline calls into R (for example, a locally installed R package shadowing the pixi-managed one), unset the user R library paths before invoking `sos`:\n",
+    "\n",
+    "```bash\n",
+    "export R_LIBS=\"\"\n",
+    "export R_LIBS_USER=\"\"\n",
+    "```\n",
+    "\n",
+    "For example, an error such as:\n",
+    "\n",
     "```\n",
     "Error in dyn.load(file, DLLpath = DLLPath, ...):\n",
@@ -120,16 +220,21 @@
     "libicui18n.so.63: cannot open shared object file: No such file or directory\n",
     "```\n",
     "\n",
-    "May be fixed by running this before the sos commands are run:\n",
+    "is typically fixed by the two `export` statements above — they force R to use only the libraries provided by the pixi `r-base` environment.\n",
     "\n",
-    "```\n",
-    "export R_LIBS=\"\"\n",
-    "export R_LIBS_USER=\"\"\n",
+    "If a required package is missing from the pixi environment, install it with:\n",
+    "\n",
+    "```bash\n",
+    "# Python package\n",
+    "pixi global install -c conda-forge --environment python <package_name>\n",
+    "\n",
+    "# R package\n",
+    "pixi global install -c conda-forge --environment r-base r-<package_name>\n",
     "```\n",
     "\n",
     "## Analyses on High Performance Computing clusters\n",
     "\n",
-    "The protocol example shown above performs analysis on a desktop workstation, as a demonstration. Typically the analyses should be performed on HPC cluster environments. This can be achieved via [SoS Remote Tasks](https://vatlab.github.io/sos-docs/doc/user_guide/task_statement.html) on [configured host computers](https://vatlab.github.io/sos-docs/doc/user_guide/host_setup.html). We provide this [toy example for running SoS pipeline on a typical HPC cluster environment](https://github.com/statfungen/xqtl-protocol/blob/main/code/misc/Job_Example.ipynb). First time users are encouraged to try it out in order to help setting up the computational environment necessary to run the analysis in this protocol."
+    "The protocol example shown above performs analysis on a desktop workstation, as a demonstration. Typically the analyses should be performed on HPC cluster environments. This can be achieved via [SoS Remote Tasks](https://vatlab.github.io/sos-docs/doc/user_guide/task_statement.html) on [configured host computers](https://vatlab.github.io/sos-docs/doc/user_guide/host_setup.html). The mechanism is cluster-agnostic and works with SLURM, LSF, SGE, PBS/Torque, and other common schedulers. 
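To make the host-setup step concrete, a SLURM host entry in `~/.sos/hosts.yml` can look roughly like the sketch below. This is an illustrative sketch only: the host name and resource values are placeholders, and the exact keys and template variables should be verified against the SoS host-setup guide linked above.

```yaml
hosts:
  my_cluster:                  # placeholder host name
    address: localhost         # or user@login-node for remote submission
    queue_type: pbs            # SoS's PBS-style interface, driven by the commands below
    max_running_jobs: 50
    submit_cmd: sbatch {job_file}
    status_cmd: squeue --job {job_id}
    kill_cmd: scancel {job_id}
    task_template: |
      #!/bin/bash
      #SBATCH --time={walltime}
      #SBATCH --cpus-per-task={cores}
      #SBATCH --mem={mem//10**9}G
      sos execute {task} -v {verbosity} -s {sig_mode}
```

With such an entry in place, adding `-q my_cluster` to a `sos run` invocation submits tasks through the scheduler instead of running them locally.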
We provide a [toy example for running SoS pipeline on a typical HPC cluster environment](https://github.com/statfungen/xqtl-protocol/blob/main/code/misc/Job_Example.ipynb); first-time users are encouraged to try it out to help set up the computational environment necessary to run the analyses in this protocol.\"" ] }, { @@ -167,4 +272,4 @@ }, "nbformat": 4, "nbformat_minor": 5 -} +} \ No newline at end of file From 43601adea4fa9f431906ee7180c7f010f527b182 Mon Sep 17 00:00:00 2001 From: Jenny Date: Wed, 15 Apr 2026 18:59:51 -0400 Subject: [PATCH 2/3] Rewrite getting-started page: cavatica-style layout, wanggroup.org SoS setup --- code/xqtl_protocol_demo.ipynb | 430 ++++++++++++++++++++++------------ 1 file changed, 284 insertions(+), 146 deletions(-) diff --git a/code/xqtl_protocol_demo.ipynb b/code/xqtl_protocol_demo.ipynb index 9c28720ce..7a4628c3e 100644 --- a/code/xqtl_protocol_demo.ipynb +++ b/code/xqtl_protocol_demo.ipynb @@ -2,245 +2,381 @@ "cells": [ { "cell_type": "markdown", - "id": "extensive-communication", - "metadata": { - "kernel": "SoS", - "tags": [] - }, + "metadata": {}, "source": [ - "# Illustration of xQTL protocol\n", + "# Getting Started with the FunGen-xQTL Protocol\n", + "\n", + "**A reproducible, end-to-end computational protocol for molecular quantitative trait loci (xQTL) analysis \u2014 bulk, single-cell, and multi-omic.**\n", + "\n", + "The FunGen-xQTL protocol brings together data preprocessing, QTL discovery, fine-mapping, multivariate and colocalization analyses, and integration with GWAS into a single reproducible workflow. Every step is implemented as a [Script of Scripts (SoS)](https://vatlab.github.io/sos-docs/) workflow that runs the same way on a laptop, a compute cluster, or the cloud.\n", + "\n", + "> **New here?** This page is a guided on-ramp. 
In about an hour you can install the environment, clone the repo, download a small demo dataset, and run your first pipeline.\n", + "\n", + "---\n", + "\n", + "## At a Glance\n", + "\n", + "| | |\n", + "|---|---|\n", + "| **Code repository** | [StatFunGen/xqtl-protocol](https://github.com/StatFunGen/xqtl-protocol) |\n", + "| **Project website** | [statfungen.github.io/xqtl-protocol](https://statfungen.github.io/xqtl-protocol/) |\n", + "| **Workflow engine** | [SoS (Script of Scripts)](https://vatlab.github.io/sos-docs/) |\n", + "| **Package manager** | [pixi](https://pixi.sh/) via [StatFunGen/pixi-setup](https://github.com/StatFunGen/pixi-setup) |\n", + "| **Lab environment guide** | [wanggroup.org \u2014 Software Setup](https://wanggroup.org/hpc/docs/software-setup-conda) |\n", + "| **Supported OS** | Linux, macOS 11+. Windows via [WSL](https://learn.microsoft.com/windows/wsl/install) |\n", + "| **HPC schedulers** | SLURM, LSF, SGE, PBS/Torque |\n", + "\n", + "---\n", + "\n", + "## How to Use This Page\n", + "\n", + "1. **Check prerequisites** \u2014 make sure you're on a supported OS and have enough disk/memory.\n", + "2. **Install the environment** \u2014 run the `pixi-setup.sh` installer, then add SoS.\n", + "3. **Get the code and data** \u2014 clone the protocol repo and download the demo dataset.\n", + "4. **Run your first workflow** \u2014 execute a small example to confirm everything works.\n", + "5. **Go further** \u2014 follow the links at the bottom of this page to dive into the pipelines that matter for your project.\n", + "\n", + "Each section below is self-contained; if something is already installed you can skip ahead.\n", + "\n", + "---\n", "\n", - "This notebook illustrates the computational protocols available from this repository for the detection and analysis of molecular QTLs (xQTLs). A minimal toy data-set consisting of 49 de-identified samples are used for the analysis.\n", + "## 1. 
Prerequisites\n", "\n", - "The sections below provide a **step-by-step guide** for new users to set up the software environment, clone the protocol repository, and run their first xQTL analysis (for example, generating xQTLs with TensorQTL). These instructions are intended to be portable — they work on a personal Linux/macOS workstation as well as on any HPC cluster (SLURM, LSF, SGE, etc.). Nothing below is tied to a specific institution's cluster.\n", + "Before you start, confirm the following:\n", "\n", - "## Step-by-step getting started guide\n", + "| Requirement | Minimum | Recommended |\n", + "|---|---|---|\n", + "| **Operating system** | Linux or macOS 11+ (Windows users: install [WSL2](https://learn.microsoft.com/windows/wsl/install) first) | Ubuntu 22.04+ or macOS 13+ |\n", + "| **Shell** | Bash or Zsh | Zsh on macOS, Bash on Linux/HPC |\n", + "| **Memory** | 16 GB | 50 GB+ for the pixi installer on HPC |\n", + "| **Disk space** | 10 GB (minimal install) | 50 GB+ (full bioinformatics stack + demo data) |\n", + "| **Network access** | GitHub, conda-forge, synapse.org | Same |\n", + "| **Git** | Any recent version | 2.30+ |\n", "\n", - "The workflow below gets a new user from a fresh shell to running a pipeline in four steps:\n", + "> **HPC tip:** Request an interactive compute node with at least 50 GB of memory before running the installer. Login nodes often kill large installs.\n", + ">\n", + "> ```bash\n", + "> # SLURM\n", + "> srun --mem=50G --pty bash\n", + ">\n", + "> # LSF\n", + "> bsub -Is -M 50000 -n 4 bash\n", + "> ```\n", "\n", - "1. **Install SoS** (Script of Scripts) in a conda environment — this is the workflow engine that drives the xQTL pipelines.\n", - "2. **Install the xQTL project's software stack with pixi** — this replaces the previous Singularity/Docker container workflow.\n", - "3. **Clone the `xqtl-protocol` repository**.\n", - "4. **Run an example pipeline** (e.g. TensorQTL) using the toy dataset.\n", + "---\n", "\n", - "### 1. 
Install SoS in a conda environment\n", + "## 2. Install the Computing Environment\n", "\n", - "Follow the official SoS instructions for a conda-based installation: .\n", + "We manage the entire software stack with [**pixi**](https://pixi.sh/), a fast, reproducible package manager for conda channels. The [StatFunGen/pixi-setup](https://github.com/StatFunGen/pixi-setup) installer wires everything up for you \u2014 Python, R, JupyterLab, bioinformatics tools, and more \u2014 and is the **same installer the Wang Lab uses**. The canonical walkthrough lives at [wanggroup.org/hpc/docs/software-setup-conda](https://wanggroup.org/hpc/docs/software-setup-conda); the steps below mirror it.\n", "\n", - "A typical minimal install looks like:\n", + "### 2.1 (Optional) Purge old conda/mamba installs\n", + "\n", + "If you already have conflicting `conda`, `mamba`, `micromamba`, or a prior `pixi` on the machine, start from a clean slate:\n", "\n", "```bash\n", - "conda create -n sos -c conda-forge sos sos-pbs sos-bash sos-python sos-r sos-notebook jupyterlab-sos -y\n", - "conda activate sos\n", + "rm -rf ~/.mamba ~/.conda ~/.anaconda ~/.pixi ~/.jupyter \\\n", + " ~/micromamba ~/.mambarc ~/.local/share/jupyter/\n", "```\n", "\n", - "You only need to do this once per machine / user account.\n", + "Skip this step if you are happy with your current setup or sharing the machine with others.\n", + "\n", + "### 2.2 Run the pixi-setup installer\n", + "\n", + "From an interactive shell (or a compute node on HPC):\n", "\n", - "### 2. Install the xQTL software stack with pixi\n", + "```bash\n", + "curl -fsSL https://raw.githubusercontent.com/StatFunGen/pixi-setup/refs/heads/main/pixi-setup.sh \\\n", + " -o pixi-setup.sh\n", + "bash pixi-setup.sh\n", + "```\n", "\n", - "We no longer use Singularity/Docker containers as the default software environment. Instead, the xQTL protocol's dependencies (R, Python, TensorQTL, bcftools, plink, etc.) 
are installed through [pixi](https://pixi.sh/) using the StatFunGen `pixi-setup` installer. This gives you a self-contained, reproducible environment that works on any Linux/macOS system.\n", + "You'll be prompted for two things.\n", "\n", - "For the full, up-to-date installation guide (including options, troubleshooting, and advanced configuration), see the Wang Lab documentation:\n", + "**Installation path** \u2014 where pixi stores environments and the package cache.\n", "\n", - "- — *Advanced Software Setup with Pixi*\n", - "- — *Software environment setup overview*\n", + "| Setting | When to use |\n", + "|---|---|\n", + "| `$HOME/.pixi` (default) | Laptops and workstations with plenty of `$HOME` space |\n", + "| `/lab/$USER/.pixi` or scratch | HPC systems with small `$HOME` quotas (recommended) |\n", "\n", - "**Choose an install location with enough free space.** On shared systems (HPC, lab servers), home directories often have small quotas and inode limits that a full pixi environment (≈7–35 GB, 100k–350k files) will exhaust. Pick a path on a larger filesystem — a lab/project directory or scratch space. On a personal laptop, the default `$HOME/.pixi` is fine.\n", + "**Installation type** \u2014 pick based on what you plan to do.\n", "\n", - "**Point pixi at that location** by temporarily overriding `HOME` and prepending pixi to `PATH` before running the installer. Replace `/your_pixi_install_path` with the directory you chose:\n", + "| Type | Size | Files | Includes |\n", + "|---|---|---|---|\n", + "| **1. minimal** | ~5 GB | ~100k | CLI tools, Python data-science stack, JupyterLab, base R (tidyverse, devtools, IRkernel, languageserver, \u2026) |\n", + "| **2. 
full** | ~35 GB | ~350k | Everything in minimal **plus** samtools, bcftools, plink/plink2, GATK4, STAR, RSEM, FastQC, bedtools, VEP, Seurat, tensorQTL, and Bioconductor packages |\n", "\n", - "```bash\n", - "# Direct pixi to install into a location with enough space\n", - "export HOME=\\\"/your_pixi_install_path\\\"\n", + "> **Rule of thumb:** pick **minimal** for xQTL runs where you pass in pre-processed data; pick **full** if you'll also do upstream QC, alignment, or single-cell preprocessing.\n", "\n", - "# Make the pixi binary discoverable for this shell\n", - "export PATH=\\\"/your_pixi_install_path/.pixi/bin:$PATH\\\"\n", + "After the installer finishes:\n", + "\n", + "```bash\n", + "source ~/.bashrc # or ~/.zshrc on macOS\n", + "pixi --version # should print a version number\n", "```\n", "\n", - "**On HPC clusters**, the installer can be memory-intensive — run it from an interactive compute node (for example, request ≥50 GB of memory) rather than the login node.\n", + "### 2.3 Add SoS on top of pixi (wanggroup.org approach)\n", "\n", - "**Run the installer.** This downloads and installs pixi along with the StatFunGen-curated set of environments used by the xQTL protocol:\n", + "Pixi gives you Python, R, and JupyterLab, but the xQTL protocol is written as **SoS workflows**, so we add the SoS suite as pixi global packages. 
This is exactly the Wang Lab convention from [wanggroup.org](https://wanggroup.org/orientation/jupyter-setup.html) \u2014 SoS lives next to pixi's `python` environment and is available from any shell.\n", "\n", "```bash\n", - "curl -fsSL https://raw.githubusercontent.com/StatFunGen/pixi-setup/refs/heads/main/pixi-setup.sh | bash\n", + "pixi global install \\\n", + " --environment python \\\n", + " -c conda-forge \\\n", + " sos sos-pbs sos-notebook jupyterlab-sos \\\n", + " sos-bash sos-python sos-r\n", + "\n", + "# Register the SoS Jupyter kernel\n", + "pixi run -e python python -m sos_notebook.install\n", "```\n", "\n", - "The installer will prompt you for the installation path and for an installation type (`minimal` vs. `full`). Choose `full` if you plan to run the complete xQTL protocol, since it includes TensorQTL, bcftools, plink, Seurat, and the Bioconductor packages the pipeline relies on.\n", + "Verify the install:\n", + "\n", + "```bash\n", + "sos --version # SoS workflow CLI\n", + "jupyter kernelspec list # should list 'sos' among the kernels\n", + "```\n", "\n", - "After installation, restart your shell (or `source ~/.bashrc`) so that `pixi` is on your `PATH`.\n", + "> **Why separate from the pixi installer?** The minimal and full pixi-setup bundles intentionally ship a general-purpose Python + R stack. Adding SoS as its own step keeps the base install portable and makes it easy to upgrade SoS independently.\n", "\n", - "To install additional packages later:\n", + "### 2.4 Install Additional Software (optional)\n", + "\n", + "Once pixi is configured, installing more tools is a one-liner. 
Consult [anaconda.org](https://anaconda.org/search) for exact package names.\n", "\n", "```bash\n", - "# Add a Python package\n", - "pixi global install -c conda-forge --environment python \n", + "# Bioinformatics CLI tools\n", + "pixi global install -c bioconda samtools bcftools plink2\n", + "\n", + "# R package (into the r-base environment)\n", + "pixi global install -c conda-forge --environment r-base r-pacman\n", "\n", - "# Add an R package\n", - "pixi global install -c conda-forge --environment r-base r-\n", + "# Python package (into the python environment)\n", + "pixi global install -c conda-forge --environment python seaborn\n", + "\n", + "# Update all packages in an environment\n", + "pixi global update r-base\n", + "pixi global update python\n", "```\n", "\n", - "### 3. Clone the xqtl-protocol repository\n", + "### 2.5 (Optional) Collaboration-friendly permissions\n", + "\n", + "If you share a lab directory, make new files group-writable by default. Add to `~/.bashrc`:\n", + "\n", + "```bash\n", + "umask 002\n", + "```\n", + "\n", + "---\n", + "\n", + "## 3. Get the Code\n", + "\n", + "Clone the protocol repository \u2014 this is where every pipeline, config, and example lives.\n", "\n", "```bash\n", "git clone https://github.com/StatFunGen/xqtl-protocol.git\n", "cd xqtl-protocol\n", "```\n", "\n", - "The `code/` directory contains the SoS notebooks that implement each mini-protocol. The `pipeline/` directory contains the underlying SoS workflows called by those notebooks.\n", + "### Repository layout\n", "\n", - "### 4. 
Run an example pipeline\n", + "| Path | What it is |\n", + "|---|---|\n", + "| `code/` | Notebook-based documentation (this page lives here) |\n", + "| `pipeline/` | SoS workflow entry points (symlinks into `code/`) \u2014 this is what you run |\n", + "| `website/` | JupyterBook sources for [statfungen.github.io/xqtl-protocol](https://statfungen.github.io/xqtl-protocol/) |\n", + "| `data/` | Small example inputs and configuration templates |\n", + "| `container/` | Legacy Singularity/Docker recipes (kept for reference) |\n", "\n", - "With the `sos` conda environment activated and `pixi` on your `PATH`, you can invoke any of the mini-protocols documented on this website. For example, to run TensorQTL on the toy dataset:\n", + "### The pipelines you'll use most\n", + "\n", + "| Entry point | Purpose |\n", + "|---|---|\n", + "| `pipeline/1_xqtl_association.ipynb` | End-to-end xQTL association pipeline |\n", + "| `pipeline/TensorQTL.ipynb` | TensorQTL cis/trans mapping |\n", + "| `pipeline/1_phenotype_preprocessing.ipynb` | Phenotype QC, normalization, covariate correction |\n", + "| `pipeline/2_genotype_preprocessing.ipynb` | Genotype QC, imputation, reference alignment |\n", + "| `pipeline/4_covariates_preprocessing.ipynb` | PEER / hidden-factor covariate generation |\n", + "| `pipeline/eQTL_analysis_commands.ipynb` | Copy-paste command reference for eQTL runs |\n", + "| `pipeline/Job_Example.ipynb` | Template for submitting pipelines to an HPC scheduler |\n", + "\n", + "Browse the full index on the [pipelines page of the website](https://statfungen.github.io/xqtl-protocol/).\n", + "\n", + "---\n", + "\n", + "## 4. 
Run Your First Workflow\n", + "\n", + "Once pixi + SoS are installed and the repo is cloned, confirm SoS can see the workflows:\n", "\n", "```bash\n", - "conda activate sos\n", - "sos run pipeline/TensorQTL.ipynb cis \\\\\n", - " --genotype-file /path/to/genotype.bed \\\\\n", - " --phenotype-file /path/to/phenotype.bed.gz \\\\\n", - " --covariate-file /path/to/covariates.tsv \\\\\n", - " --cwd output/\n", + "cd xqtl-protocol\n", + "sos run pipeline/1_xqtl_association.ipynb -h\n", "```\n", "\n", - "See the [Command Generator](https://statfungen.github.io/xqtl-protocol/code/commands_generator/eQTL_analysis_commands.html) for a ready-to-run set of commands that covers the full pipeline end-to-end.\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "4a3d482b-982f-47b2-b3ef-fc6775a74e33", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "physical-postage", - "metadata": { - "kernel": "SoS", - "tags": [] - }, - "source": [ - "## Analysis\n", + "You should see a list of available workflow steps and options. If you do, you're ready to launch real runs.\n", "\n", - "Please visit [the homepage of the protocol website](https://statfungen.github.io/xqtl-protocol/) for the general background on this resource, in particular the [How to use the resource](https://statfungen.github.io/xqtl-protocol/README.html#how-to-use-the-resource) section. To perform a complete analysis from molecular phenotype quantification to xQTL discovery, please conduct your analysis in the order listed below, each link contains a mini-protocol for a specific task. 
All commands documented in each mini-protocol should be executed in the command line environment.\n", + "A minimal cis-QTL smoke test (requires demo data \u2014 see next section):\n", "\n", - "### Molecular Phenotype Quantification\n", + "```bash\n", + "sos run pipeline/TensorQTL.ipynb cis \\\n", + " --genotype-file data/example/genotype.bed \\\n", + " --phenotype-file data/example/phenotype.bed.gz \\\n", + " --covariate-file data/example/covariates.tsv \\\n", + " --cwd output/demo_tensorqtl\n", + "```\n", "\n", - "Molecular phenotypic data is required for the generation of QTLs. We support bulk RNA-Seq, methylation and splicing phenotypes in our pipeline. Multiple [reference data](https://statfungen.github.io/xqtl-protocol/code/reference_data/reference_data.html#) files are required before molecular phenotypes are quantified in samples. These include, but are not limited to, reference genomes, gene annotations, variant annotations, linkage disequilibirum data and topologically associated domains. [Quantification of gene expression](https://statfungen.github.io/xqtl-protocol/code/molecular_phenotypes/bulk_expression.html) is conducted with either RNA-SeQC for gene-level counts, or RSEM for transcript-level counts. [Quantification of alternative splicing events](https://statfungen.github.io/xqtl-protocol/code/molecular_phenotypes/splicing.html) is conducted with leafcutter2 to identify alternatively excised introns. [Quantification of DNA methylation](https://statfungen.github.io/xqtl-protocol/code/molecular_phenotypes/methylation.html) is done using SeSAMe. Each of these molecular phenotypes then undergo phenotype specific quality control and normalization.\n", + "> **Tip:** Every pipeline supports `-h` and `--help`. 
SoS also prints the exact shell commands it runs, which is handy for debugging and for learning what the pipeline does under the hood.\n", "\n", - "### Data Pre-Processing\n", + "---\n", "\n", - "[Preprocessing of genotype data](https://statfungen.github.io/xqtl-protocol/code/data_preprocessing/genotype_preprocessing.html) begins with the application of variant filters using bcftools. VCF files are then converted to plink format so that kinship analyses may be performed to identify unrelated individuals. Genetic principal components are then generated for unrelated samples and genotype files are formatted for later generation of quantitative trait loci. \n", + "## 5. Where to Go Next\n", "\n", - "[Preprocessing of phenotypic data](https://statfungen.github.io/xqtl-protocol/code/data_preprocessing/phenotype_preprocessing.html) begins with annotation of features, if required. Missing entries may then be imputed using a variety of methods included in the pipeline. Last, the phenotypes are formatted for later generation of quantitative trait loci. \n", + "**Explore the protocol**\n", "\n", - "[Preprocessing of covariates](https://statfungen.github.io/xqtl-protocol/code/data_preprocessing/covariate_preprocessing.html) begins with the merging of phenotypic data with previously generated genetic principal components. The merged data is then used to calculate hidden factors which will later be used as additional covariates. 
\n", + "- [Full documentation site](https://statfungen.github.io/xqtl-protocol/) \u2014 browsable pipeline reference with examples\n", + "- [Pipeline index](https://statfungen.github.io/xqtl-protocol/pipeline/) \u2014 each step with inputs, outputs, and parameters\n", + "- [HPC quick start (Wang Lab)](https://wanggroup.org/hpc/docs/quick-start/) \u2014 how to request nodes and submit jobs\n", + "- [SoS documentation](https://vatlab.github.io/sos-docs/) \u2014 workflow engine reference and tutorials\n", "\n", - "### QTL Association Analysis\n", + "**Get help**\n", "\n", - "[QTL association analysis](https://statfungen.github.io/xqtl-protocol/code/association_scan/qtl_association_testing.html) is conducted with TensorQTL. We include options for cis or trans analysis, with options to include interaction terms. [Hierarchical multiple testing](https://statfungen.github.io/xqtl-protocol/code/association_scan/qtl_association_postprocessing.html) may then be applied to the results to adjust p-values. \n", + "- [Ask a question (Wang Lab guide)](https://wanggroup.org/orientation/questions.html) \u2014 how to file a good issue\n", + "- [GitHub issues](https://github.com/StatFunGen/xqtl-protocol/issues) \u2014 bugs, feature requests, protocol questions\n", + "- [pixi-setup issues](https://github.com/StatFunGen/pixi-setup/issues) \u2014 environment / install problems\n", "\n", - "### Integrative Analysis\n", + "**Contribute**\n", "\n", - "We include methods to conduct [TWAS](https://statfungen.github.io/xqtl-protocol/code/pecotmr_integration/twas_ctwas.html) in our pipeline to identify genes associated with complex traits. \n", + "- Fork [StatFunGen/xqtl-protocol](https://github.com/StatFunGen/xqtl-protocol), make changes on a feature branch, open a PR\n", + "- Follow the [reproducible research guide](https://wanggroup.org/orientation/reproducible-research.html) for notebook and commit conventions\n", "\n", - "Our pipeline includes multiple methods for fine-mapping of QTLs. 
[Univariate fine-mapping and TWAS with SuSiE](https://statfungen.github.io/xqtl-protocol/code/mnm_analysis/univariate_fine_mapping_twas_vignette.html) generates TWAS weights and credible sets using SuSiE. [Regression with summary statistics](https://statfungen.github.io/xqtl-protocol/code/mnm_analysis/summary_stats_finemapping_vignette.html) allows for the inclusion of summary statistics from GWAS in SuSiE finemapping. [Univariate fine-mapping of functional data](https://statfungen.github.io/xqtl-protocol/code/mnm_analysis/univariate_fine_mapping_fsusie_vignette.html) uses epigenomic data to fine-map with fSuSiE. \n", + "---\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 6. Analysis Overview\n", "\n", - "We also include method for [colocalization analysis](https://statfungen.github.io/xqtl-protocol/code/pecotmr_integration/SuSiE_enloc.html). This starts with the generation of prior probabilities followed by pairwise colocalization analysis of xQTL and GWAS fine-mapping results to identifies shared causal variants. We also include an alternative method, [colocboost](https://statfungen.github.io/xqtl-protocol/code/mnm_analysis/mnm_methods/colocboost.html), to identify shared genetic variants influencing multiple molecular traits. \n", + "The FunGen-xQTL protocol is modular. Each numbered pipeline is a self-contained SoS notebook that can run independently or as part of the full xQTL workflow. At a high level, the flow is:\n", "\n", - "We utilize an [excess of overlap](https://statfungen.github.io/xqtl-protocol/code/enrichment/eoo_enrichment.html) method to evaluate the enrichment of significant variants within specific genomic annotations. 
[Pathway enrichment analysis](https://statfungen.github.io/xqtl-protocol/code/enrichment/gsea.html) identifies biological pathways that are statistically overrepresented in a given gene set, giving information on potential biological functions, disease relevance, or regulatory mechanisms associated with the gene set. [Stratified LD Score Regression](https://statfungen.github.io/xqtl-protocol/code/enrichment/sldsc_enrichment.html) (S-LDSC) is used to quantify the contribution of different genomic functional annotations to the heritability of complex traits and assess their statistical significance. By integrating GWAS summary statistics with genome annotations, S-LDSC distinguishes true polygenic signals from confounding effects.\n", + "**Preprocess inputs \u2192 Discover QTLs \u2192 Fine-map & integrate \u2192 Report & share**\n", "\n", + "| Stage | Pipelines | What happens |\n", + "|---|---|---|\n", + "| **1. Phenotype preprocessing** | `1_phenotype_preprocessing.ipynb` | QC, normalization, batch correction for bulk RNA-seq, proteomics, methylation, or single-cell pseudo-bulk |\n", + "| **2. Genotype preprocessing** | `2_genotype_preprocessing.ipynb` | Variant-level QC, imputation, ancestry alignment, PCA |\n", + "| **3. Covariate preprocessing** | `4_covariates_preprocessing.ipynb` | Known + hidden covariates (PEER, surrogate variables) |\n", + "| **4. QTL discovery** | `TensorQTL.ipynb`, `APEX.ipynb`, `1_xqtl_association.ipynb` | Cis/trans scans, interaction QTLs, trans-eQTL screens |\n", + "| **5. Fine-mapping & multivariate** | `SuSiE.ipynb`, `mvSuSiE.ipynb`, `fSuSiE.ipynb` | Credible sets across contexts and tissues |\n", + "| **6. 
Colocalization & integration** | `coloc.ipynb`, `cTWAS.ipynb`, `GWAS_integration.ipynb` | Link xQTLs to GWAS and causal gene nomination |\n", "\n", - "\n" + "All pipelines share a common config layout, so once you've learned one you can read the others quickly.\n" ] }, { "cell_type": "markdown", - "id": "dietary-vector", - "metadata": { - "kernel": "SoS", - "tags": [] - }, + "metadata": {}, "source": [ - "## Data\n", + "## 7. Downloading Example Data\n", "\n", - "For record keeping: preparation of the demo dataset is documented [on this page](https://github.com/cumc/fungen-xqtl-analysis/tree/main/analysis/Wang_Columbia/ROSMAP/MWE) --- this is a private repository accessible to FunGen-xQTL analysis working group members.\n", + "A small demo dataset lives on [Synapse](https://www.synapse.org/#!Synapse:syn36416559). You'll need a free Synapse account, then install the client (already available via pixi):\n", "\n", - "For protocols listed in this page, downloaded required input data in [Synapse](https://www.synapse.org/#!Synapse:syn36416601). \n", - "* To be able downloading the data, first create user account on [Synapse Login](https://www.synapse.org/). Username and password will be required when downloading\n", - "* Downloading required installing of Synapse API Clients, type `pip install synapseclient` in terminal or Command Prompt to install the Python package. 
Details list [on this page](https://help.synapse.org/docs/Installing-Synapse-API-Clients.1985249668.html).\n", - "* Each folder in different level has unique Synapse ID, which allowing you to download only some folders or files within the entire folder.\n", + "```bash\n", + "# One-time install (uses pixi's python env)\n", + "pixi global install -c conda-forge --environment python synapseclient\n", "\n", - "To download the test data for section \"Bulk RNA-seq molecular phenotype quantification\", please use the following Python codes,\n", + "# Log in (prompts for your Synapse PAT)\n", + "synapse login -p\n", "\n", - "```\n", - "import synapseclient \n", - "import synapseutils \n", - "syn = synapseclient.Synapse()\n", - "syn.login(\"your username on synapse.org\",\"your password on synapse.org\")\n", - "files = synapseutils.syncFromSynapse(syn, 'syn53174239', path=\"./\")\n", + "# Download the demo bundle (~2 GB)\n", + "synapse get -r syn36416559 --downloadLocation data/example/\n", "```\n", "\n", - "To download the test data for section \"xQTL association analysis\", please use the following Python codes, \n", - "\n", - "```\n", - "import synapseclient \n", - "import synapseutils \n", - "syn = synapseclient.Synapse()\n", - "syn.login(\"your username on synapse.org\",\"your password on synapse.org\")\n", - "files = synapseutils.syncFromSynapse(syn, 'syn52369482', path=\"./\")\n", - "```" + "For the full ROSMAP / MSBB production data (controlled access), request access through [AD Knowledge Portal](https://adknowledgeportal.synapse.org/) first.\n" ] }, { "cell_type": "markdown", - "id": "complete-extent", "metadata": {}, "source": [ - "## Software environment: pixi-managed environments\n", - "\n", - "Analysis documented on this website uses software installed through [pixi](https://pixi.sh/) via the StatFunGen [`pixi-setup`](https://github.com/StatFunGen/pixi-setup) installer. 
See the *Step-by-step getting started guide* above for installation, and the Wang Lab documentation for more detail: .\n", + "## 8. Troubleshooting\n", "\n", - "Once `pixi` is installed and on your `PATH`, the pipelines will pick up the tools they need (TensorQTL, bcftools, plink, R packages, etc.) directly — there is no `--container` flag to pass and no Singularity/Docker image to download. If you previously used the `--container oras://ghcr.io/statfungen/...` option, it is no longer required.\n", + "A handful of issues come up often. If one of these matches, try the fix before opening an issue.\n", "\n", - "### Troubleshooting\n", - "\n", - "If you run into errors loading R libraries when the pipeline calls into R (for example, a locally installed R package shadowing the pixi-managed one), unset the user R library paths before invoking `sos`:\n", + "**`pixi: command not found` after install**\n", "\n", "```bash\n", - "export R_LIBS=\\\"\\\"\n", - "export R_LIBS_USER=\\\"\\\"\n", + "source ~/.bashrc # Linux / HPC\n", + "source ~/.zshrc # macOS\n", + "# or open a fresh terminal\n", "```\n", "\n", - "For example, an error such as:\n", + "**Installer gets killed on HPC**\n", + "Request more memory (\u2265 50 GB) and run on a compute node, not the login node:\n", "\n", - "```\n", - "Error in dyn.load(file, DLLpath = DLLPath, ...):\n", - "unable to load shared object '$PATH/R/x86_64-pc-linux-gnu-library/4.2/stringi/libs/stringi.so':\n", - "libicui18n.so.63: cannot open shared object file: No such file or directory\n", + "```bash\n", + "srun --mem=50G --pty bash\n", + "bash pixi-setup.sh\n", "```\n", "\n", - "is typically fixed by the two `export` statements above — they force R to use only the libraries provided by the pixi `r-base` environment.\n", + "**`sos: command not found`**\n", + "SoS was not installed on top of pixi. 
Re-run step 2.3.\n", "\n", "**R library conflicts**\n", "Conda-forge R packages do not mix well with `install.packages()` builds. Prefer `pixi global install --environment r-base r-<package>`. If you must use CRAN, stick to pure-R packages.\n", "\n", "**`ModuleNotFoundError` during a pipeline**\n", "Install the missing package into pixi's `python` env:\n", "\n", "```bash\n", "pixi global install -c conda-forge --environment python <package>\n", "```\n", "\n", "**GitHub is unreachable from your network**\n", "Use the Gitee mirror documented at [wanggroup.org/hpc/docs/software-setup-conda](https://wanggroup.org/hpc/docs/software-setup-conda) (see \"Users in China\").\n", "\n", "**File permissions on shared directories**\n", "Add `umask 002` to your `~/.bashrc` so new files are group-writable.\n", "\n", "---\n", "\n", "## 9. Running on HPC\n", "\n", "The protocol plays nicely with SoS's built-in task queue support for **SLURM**, **LSF**, **SGE**, and **PBS/Torque**. See `pipeline/Job_Example.ipynb` for a template and the [SoS task queue docs](https://vatlab.github.io/sos-docs/doc/user_guide/task_statement.html) for configuration.\n", "\n", "A typical SLURM submission looks like:\n", "\n", "```bash\n", "sos run pipeline/TensorQTL.ipynb cis \\\n", " --genotype-file ... --phenotype-file ... --covariate-file ... \\\n", " --cwd output/run01 \\\n", " -q slurm -c config/hpc_slurm.yml \\\n", " -J 20 # up to 20 concurrent jobs\n", "```\n", "\n", "The `config/hpc_slurm.yml` file controls partitions, walltime, and memory per task. Start from the template in the repo and adapt it to your cluster.\n", "\n", "---\n", "\n", "## 10. 
Citing the Protocol\n", + "\n", + "If you use the FunGen-xQTL protocol in a publication, please cite:\n", + "\n", + "> Cao *et al.*, *A computational protocol for molecular QTL analysis integrating GWAS.* See [CITATION.md](https://github.com/StatFunGen/xqtl-protocol/blob/main/CITATION.md) in the repository for the current preferred citation and BibTeX entry.\n", "\n", - "The protocol example shown above performs analysis on a desktop workstation, as a demonstration. Typically the analyses should be performed on HPC cluster environments. This can be achieved via [SoS Remote Tasks](https://vatlab.github.io/sos-docs/doc/user_guide/task_statement.html) on [configured host computers](https://vatlab.github.io/sos-docs/doc/user_guide/host_setup.html). The mechanism is cluster-agnostic and works with SLURM, LSF, SGE, PBS/Torque, and other common schedulers. We provide a [toy example for running SoS pipeline on a typical HPC cluster environment](https://github.com/statfungen/xqtl-protocol/blob/main/code/misc/Job_Example.ipynb); first-time users are encouraged to try it out to help set up the computational environment necessary to run the analyses in this protocol.\"" + "And drop us a line \u2014 we love hearing how the protocol is being used.\n" ] }, { "cell_type": "code", "execution_count": null, - "id": "17f0cb95-9663-4c3f-a911-caa5f2d130a4", "metadata": {}, "outputs": [], "source": [] @@ -248,15 +384,17 @@ ], "metadata": { "kernelspec": { - "display_name": "Bash", - "language": "bash", - "name": "bash" + "display_name": "SoS", + "language": "sos", + "name": "sos" }, "language_info": { - "codemirror_mode": "shell", - "file_extension": ".sh", - "mimetype": "text/x-sh", - "name": "bash" + "codemirror_mode": "sos", + "file_extension": ".sos", + "mimetype": "text/x-sos", + "name": "sos", + "nbconvert_exporter": "sos_notebook.converter.SoS_Exporter", + "pygments_lexer": "sos" }, "sos": { "kernels": [ @@ -267,9 +405,9 @@ "" ] ], - "version": "0.22.6" + "version": "0.24.4" } }, 
"nbformat": 4, "nbformat_minor": 5 -} \ No newline at end of file +} From 2b5dc4a2314c97fa6d99d74afea612a1ca19b392 Mon Sep 17 00:00:00 2001 From: Jenny Date: Wed, 15 Apr 2026 19:08:52 -0400 Subject: [PATCH 3/3] Rewrite getting-started page: cavatica-style layout, wanggroup.org SoS setup --- code/xqtl_protocol_demo.ipynb | 425 +++++++++++++++------------------- 1 file changed, 181 insertions(+), 244 deletions(-) diff --git a/code/xqtl_protocol_demo.ipynb b/code/xqtl_protocol_demo.ipynb index 7a4628c3e..3dbb2df65 100644 --- a/code/xqtl_protocol_demo.ipynb +++ b/code/xqtl_protocol_demo.ipynb @@ -4,374 +4,311 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Getting Started with the FunGen-xQTL Protocol\n", + "# Getting Started\n", "\n", - "**A reproducible, end-to-end computational protocol for molecular quantitative trait loci (xQTL) analysis \u2014 bulk, single-cell, and multi-omic.**\n", + "**A reproducible pipeline for molecular QTL analysis \u2014 from raw genotypes and phenotypes through discovery, fine-mapping, and integration with GWAS.**\n", "\n", - "The FunGen-xQTL protocol brings together data preprocessing, QTL discovery, fine-mapping, multivariate and colocalization analyses, and integration with GWAS into a single reproducible workflow. Every step is implemented as a [Script of Scripts (SoS)](https://vatlab.github.io/sos-docs/) workflow that runs the same way on a laptop, a compute cluster, or the cloud.\n", + "This guide takes you from a clean machine to your first successful run in about an hour.\n", "\n", - "> **New here?** This page is a guided on-ramp. 
In about an hour you can install the environment, clone the repo, download a small demo dataset, and run your first pipeline.\n", + "::::{grid} 1 1 3 3\n", + ":gutter: 3\n", "\n", - "---\n", - "\n", - "## At a Glance\n", + ":::{grid-item-card} \ud83d\udce6 Install\n", + ":link: #step-1-install-pixi\n", + "One installer for Python, R, JupyterLab, and bioinformatics tools via [pixi](https://pixi.sh/).\n", + ":::\n", "\n", - "| | |\n", - "|---|---|\n", - "| **Code repository** | [StatFunGen/xqtl-protocol](https://github.com/StatFunGen/xqtl-protocol) |\n", - "| **Project website** | [statfungen.github.io/xqtl-protocol](https://statfungen.github.io/xqtl-protocol/) |\n", - "| **Workflow engine** | [SoS (Script of Scripts)](https://vatlab.github.io/sos-docs/) |\n", - "| **Package manager** | [pixi](https://pixi.sh/) via [StatFunGen/pixi-setup](https://github.com/StatFunGen/pixi-setup) |\n", - "| **Lab environment guide** | [wanggroup.org \u2014 Software Setup](https://wanggroup.org/hpc/docs/software-setup-conda) |\n", - "| **Supported OS** | Linux, macOS 11+. Windows via [WSL](https://learn.microsoft.com/windows/wsl/install) |\n", - "| **HPC schedulers** | SLURM, LSF, SGE, PBS/Torque |\n", + ":::{grid-item-card} \ud83e\uddec Run\n", + ":link: #step-5-run-your-first-workflow\n", + "Clone the repo, grab demo data, and launch a cis-QTL scan.\n", + ":::\n", "\n", - "---\n", + ":::{grid-item-card} \ud83d\ude80 Go Further\n", + ":link: #what-to-do-next\n", + "Fine-mapping, multivariate analysis, GWAS integration, HPC templates.\n", + ":::\n", "\n", - "## How to Use This Page\n", + "::::\n", "\n", - "1. **Check prerequisites** \u2014 make sure you're on a supported OS and have enough disk/memory.\n", - "2. **Install the environment** \u2014 run the `pixi-setup.sh` installer, then add SoS.\n", - "3. **Get the code and data** \u2014 clone the protocol repo and download the demo dataset.\n", - "4. 
**Run your first workflow** \u2014 execute a small example to confirm everything works.\n", - "5. **Go further** \u2014 follow the links at the bottom of this page to dive into the pipelines that matter for your project.\n", - "\n", - "Each section below is self-contained; if something is already installed you can skip ahead.\n", "\n", "---\n", "\n", - "## 1. Prerequisites\n", + "## Before You Start\n", "\n", - "Before you start, confirm the following:\n", + "You'll need a Linux or macOS shell. Windows users: install [WSL2](https://learn.microsoft.com/windows/wsl/install) first.\n", "\n", "| Requirement | Minimum | Recommended |\n", "|---|---|---|\n", - "| **Operating system** | Linux or macOS 11+ (Windows users: install [WSL2](https://learn.microsoft.com/windows/wsl/install) first) | Ubuntu 22.04+ or macOS 13+ |\n", - "| **Shell** | Bash or Zsh | Zsh on macOS, Bash on Linux/HPC |\n", - "| **Memory** | 16 GB | 50 GB+ for the pixi installer on HPC |\n", - "| **Disk space** | 10 GB (minimal install) | 50 GB+ (full bioinformatics stack + demo data) |\n", - "| **Network access** | GitHub, conda-forge, synapse.org | Same |\n", - "| **Git** | Any recent version | 2.30+ |\n", - "\n", - "> **HPC tip:** Request an interactive compute node with at least 50 GB of memory before running the installer. Login nodes often kill large installs.\n", - ">\n", - "> ```bash\n", - "> # SLURM\n", - "> srun --mem=50G --pty bash\n", - ">\n", - "> # LSF\n", - "> bsub -Is -M 50000 -n 4 bash\n", - "> ```\n", - "\n", - "---\n", + "| Disk space | 10 GB (minimal install) | 40 GB (full bioinformatics stack) |\n", + "| Memory | 16 GB | 50 GB+ on HPC for the installer |\n", + "| Network | GitHub, conda-forge, synapse.org | Same |\n", + "| Git | Any recent version | 2.30+ |\n", "\n", - "## 2. Install the Computing Environment\n", - "\n", - "We manage the entire software stack with [**pixi**](https://pixi.sh/), a fast, reproducible package manager for conda channels. 
The [StatFunGen/pixi-setup](https://github.com/StatFunGen/pixi-setup) installer wires everything up for you \u2014 Python, R, JupyterLab, bioinformatics tools, and more \u2014 and is the **same installer the Wang Lab uses**. The canonical walkthrough lives at [wanggroup.org/hpc/docs/software-setup-conda](https://wanggroup.org/hpc/docs/software-setup-conda); the steps below mirror it.\n", - "\n", - "### 2.1 (Optional) Purge old conda/mamba installs\n", - "\n", - "If you already have conflicting `conda`, `mamba`, `micromamba`, or a prior `pixi` on the machine, start from a clean slate:\n", + ":::{admonition} On HPC? Start on a compute node.\n", + ":class: tip\n", + "The installer is memory-hungry and login nodes will kill it. Grab an interactive session first:\n", "\n", "```bash\n", - "rm -rf ~/.mamba ~/.conda ~/.anaconda ~/.pixi ~/.jupyter \\\n", - " ~/micromamba ~/.mambarc ~/.local/share/jupyter/\n", + "srun --mem=50G --pty bash # SLURM\n", + "bsub -Is -M 50000 -n 4 bash # LSF\n", "```\n", + ":::\n", "\n", - "Skip this step if you are happy with your current setup or sharing the machine with others.\n", "\n", - "### 2.2 Run the pixi-setup installer\n", + "---\n", "\n", - "From an interactive shell (or a compute node on HPC):\n", + "## Step 1 \u2014 Install pixi\n", + "\n", + "We manage every dependency \u2014 Python, R, JupyterLab, bioinformatics tools \u2014 with [pixi](https://pixi.sh/). One installer sets it all up.\n", "\n", "```bash\n", - "curl -fsSL https://raw.githubusercontent.com/StatFunGen/pixi-setup/refs/heads/main/pixi-setup.sh \\\n", - " -o pixi-setup.sh\n", + "curl -fsSL https://raw.githubusercontent.com/StatFunGen/pixi-setup/refs/heads/main/pixi-setup.sh -o pixi-setup.sh\n", "bash pixi-setup.sh\n", "```\n", "\n", - "You'll be prompted for two things.\n", + "The installer will prompt you for two choices:\n", "\n", - "**Installation path** \u2014 where pixi stores environments and the package cache.\n", + ":::{dropdown} 1. 
Installation path\n", + ":open:\n", + "Where pixi stores environments and the package cache.\n", "\n", "| Setting | When to use |\n", "|---|---|\n", - "| `$HOME/.pixi` (default) | Laptops and workstations with plenty of `$HOME` space |\n", - "| `/lab/$USER/.pixi` or scratch | HPC systems with small `$HOME` quotas (recommended) |\n", + "| `$HOME/.pixi` (default) | Laptops and workstations with plenty of home-directory space |\n", + "| `/lab/$USER/.pixi` or scratch | HPC systems with strict home-directory quotas |\n", + "\n", + ":::\n", "\n", - "**Installation type** \u2014 pick based on what you plan to do.\n", + ":::{dropdown} 2. Installation type\n", + ":open:\n", + "Pick based on what you plan to do.\n", "\n", "| Type | Size | Files | Includes |\n", "|---|---|---|---|\n", - "| **1. minimal** | ~5 GB | ~100k | CLI tools, Python data-science stack, JupyterLab, base R (tidyverse, devtools, IRkernel, languageserver, \u2026) |\n", - "| **2. full** | ~35 GB | ~350k | Everything in minimal **plus** samtools, bcftools, plink/plink2, GATK4, STAR, RSEM, FastQC, bedtools, VEP, Seurat, tensorQTL, and Bioconductor packages |\n", + "| **1. minimal** | ~5 GB | ~100k | CLI tools, Python data-science stack, JupyterLab, base R |\n", + "| **2. 
full** | ~35 GB | ~350k | Everything above, **plus** samtools, bcftools, plink2, GATK4, STAR, Seurat, Bioconductor |\n", "\n", - "> **Rule of thumb:** pick **minimal** for xQTL runs where you pass in pre-processed data; pick **full** if you'll also do upstream QC, alignment, or single-cell preprocessing.\n", + "Choose **minimal** for xQTL runs with pre-processed inputs; choose **full** if you'll also do upstream QC, alignment, or single-cell work.\n", + ":::\n", "\n", - "After the installer finishes:\n", + "**Activate and verify:**\n", "\n", "```bash\n", - "source ~/.bashrc # or ~/.zshrc on macOS\n", - "pixi --version # should print a version number\n", + "source ~/.bashrc # or ~/.zshrc on macOS\n", + "pixi --version\n", "```\n", "\n", - "### 2.3 Add SoS on top of pixi (wanggroup.org approach)\n", + "You should see a version number. If not, open a fresh terminal.\n", "\n", - "Pixi gives you Python, R, and JupyterLab, but the xQTL protocol is written as **SoS workflows**, so we add the SoS suite as pixi global packages. This is exactly the Wang Lab convention from [wanggroup.org](https://wanggroup.org/orientation/jupyter-setup.html) \u2014 SoS lives next to pixi's `python` environment and is available from any shell.\n", + "\n", + "---\n", + "\n", + "## Step 2 \u2014 Add SoS\n", + "\n", + "The protocol's pipelines are written as [SoS (Script of Scripts)](https://vatlab.github.io/sos-docs/) workflows. 
Install the SoS suite into pixi's Python environment:\n", "\n", "```bash\n", - "pixi global install \\\n", - " --environment python \\\n", - " -c conda-forge \\\n", + "pixi global install --environment python -c conda-forge \\\n", " sos sos-pbs sos-notebook jupyterlab-sos \\\n", " sos-bash sos-python sos-r\n", "\n", - "# Register the SoS Jupyter kernel\n", "pixi run -e python python -m sos_notebook.install\n", "```\n", "\n", - "Verify the install:\n", + "**Verify:**\n", "\n", "```bash\n", - "sos --version # SoS workflow CLI\n", - "jupyter kernelspec list # should list 'sos' among the kernels\n", + "sos --version\n", + "jupyter kernelspec list # should include 'sos'\n", "```\n", "\n", - "> **Why separate from the pixi installer?** The minimal and full pixi-setup bundles intentionally ship a general-purpose Python + R stack. Adding SoS as its own step keeps the base install portable and makes it easy to upgrade SoS independently.\n", "\n", - "### 2.4 Install Additional Software (optional)\n", + "---\n", "\n", - "Once pixi is configured, installing more tools is a one-liner. 
Consult [anaconda.org](https://anaconda.org/search) for exact package names.\n", + "## Step 3 \u2014 Clone the Protocol\n", "\n", "```bash\n", - "# Bioinformatics CLI tools\n", - "pixi global install -c bioconda samtools bcftools plink2\n", - "\n", - "# R package (into the r-base environment)\n", - "pixi global install -c conda-forge --environment r-base r-pacman\n", - "\n", - "# Python package (into the python environment)\n", - "pixi global install -c conda-forge --environment python seaborn\n", - "\n", - "# Update all packages in an environment\n", - "pixi global update r-base\n", - "pixi global update python\n", + "git clone https://github.com/StatFunGen/xqtl-protocol.git\n", + "cd xqtl-protocol\n", "```\n", "\n", - "### 2.5 (Optional) Collaboration-friendly permissions\n", + ":::{admonition} What's in the repo?\n", + ":class: note\n", "\n", - "If you share a lab directory, make new files group-writable by default. Add to `~/.bashrc`:\n", + "| Folder | Contents |\n", + "|---|---|\n", + "| `pipeline/` | The SoS workflows you'll run |\n", + "| `code/` | Notebook documentation (this page lives here) |\n", + "| `data/` | Small example inputs and configuration templates |\n", + "| `website/` | JupyterBook sources for the docs site |\n", + ":::\n", "\n", - "```bash\n", - "umask 002\n", - "```\n", "\n", "---\n", "\n", - "## 3. Get the Code\n", + "## Step 4 \u2014 Download the Demo Data\n", "\n", - "Clone the protocol repository \u2014 this is where every pipeline, config, and example lives.\n", + "The demo dataset lives on [Synapse](https://www.synapse.org/#!Synapse:syn36416559). 
Create a free account first, then:\n", "\n", "```bash\n", - "git clone https://github.com/StatFunGen/xqtl-protocol.git\n", - "cd xqtl-protocol\n", + "pixi global install -c conda-forge --environment python synapseclient\n", + "synapse login -p\n", + "synapse get -r syn36416559 --downloadLocation data/example/\n", "```\n", "\n", - "### Repository layout\n", - "\n", - "| Path | What it is |\n", - "|---|---|\n", - "| `code/` | Notebook-based documentation (this page lives here) |\n", - "| `pipeline/` | SoS workflow entry points (symlinks into `code/`) \u2014 this is what you run |\n", - "| `website/` | JupyterBook sources for [statfungen.github.io/xqtl-protocol](https://statfungen.github.io/xqtl-protocol/) |\n", - "| `data/` | Small example inputs and configuration templates |\n", - "| `container/` | Legacy Singularity/Docker recipes (kept for reference) |\n", - "\n", - "### The pipelines you'll use most\n", - "\n", - "| Entry point | Purpose |\n", - "|---|---|\n", - "| `pipeline/1_xqtl_association.ipynb` | End-to-end xQTL association pipeline |\n", - "| `pipeline/TensorQTL.ipynb` | TensorQTL cis/trans mapping |\n", - "| `pipeline/1_phenotype_preprocessing.ipynb` | Phenotype QC, normalization, covariate correction |\n", - "| `pipeline/2_genotype_preprocessing.ipynb` | Genotype QC, imputation, reference alignment |\n", - "| `pipeline/4_covariates_preprocessing.ipynb` | PEER / hidden-factor covariate generation |\n", - "| `pipeline/eQTL_analysis_commands.ipynb` | Copy-paste command reference for eQTL runs |\n", - "| `pipeline/Job_Example.ipynb` | Template for submitting pipelines to an HPC scheduler |\n", - "\n", - "Browse the full index on the [pipelines page of the website](https://statfungen.github.io/xqtl-protocol/).\n", "\n", "---\n", "\n", - "## 4. 
Run Your First Workflow\n", + "## Step 5 \u2014 Run Your First Workflow\n", "\n", - "Once pixi + SoS are installed and the repo is cloned, confirm SoS can see the workflows:\n", + "Confirm SoS can see the pipelines:\n", "\n", "```bash\n", - "cd xqtl-protocol\n", "sos run pipeline/1_xqtl_association.ipynb -h\n", "```\n", "\n", - "You should see a list of available workflow steps and options. If you do, you're ready to launch real runs.\n", - "\n", - "A minimal cis-QTL smoke test (requires demo data \u2014 see next section):\n", + "You should see a list of workflow options. Now run a minimal cis-QTL scan:\n", "\n", "```bash\n", "sos run pipeline/TensorQTL.ipynb cis \\\n", - " --genotype-file data/example/genotype.bed \\\n", - " --phenotype-file data/example/phenotype.bed.gz \\\n", - " --covariate-file data/example/covariates.tsv \\\n", + " --genotype-file data/example/genotype.bed \\\n", + " --phenotype-file data/example/phenotype.bed.gz \\\n", + " --covariate-file data/example/covariates.tsv \\\n", " --cwd output/demo_tensorqtl\n", "```\n", "\n", - "> **Tip:** Every pipeline supports `-h` and `--help`. SoS also prints the exact shell commands it runs, which is handy for debugging and for learning what the pipeline does under the hood.\n", - "\n", - "---\n", - "\n", - "## 5. 
Where to Go Next\n", - "\n", - "**Explore the protocol**\n", - "\n", - "- [Full documentation site](https://statfungen.github.io/xqtl-protocol/) \u2014 browsable pipeline reference with examples\n", - "- [Pipeline index](https://statfungen.github.io/xqtl-protocol/pipeline/) \u2014 each step with inputs, outputs, and parameters\n", - "- [HPC quick start (Wang Lab)](https://wanggroup.org/hpc/docs/quick-start/) \u2014 how to request nodes and submit jobs\n", - "- [SoS documentation](https://vatlab.github.io/sos-docs/) \u2014 workflow engine reference and tutorials\n", - "\n", - "**Get help**\n", + "Results land in `output/demo_tensorqtl/`.\n", "\n", - "- [Ask a question (Wang Lab guide)](https://wanggroup.org/orientation/questions.html) \u2014 how to file a good issue\n", - "- [GitHub issues](https://github.com/StatFunGen/xqtl-protocol/issues) \u2014 bugs, feature requests, protocol questions\n", - "- [pixi-setup issues](https://github.com/StatFunGen/pixi-setup/issues) \u2014 environment / install problems\n", + ":::{tip}\n", + "Every pipeline supports `-h` and prints the shell commands it runs under the hood \u2014 a great way to learn what's happening and debug failures.\n", + ":::\n", "\n", - "**Contribute**\n", "\n", - "- Fork [StatFunGen/xqtl-protocol](https://github.com/StatFunGen/xqtl-protocol), make changes on a feature branch, open a PR\n", - "- Follow the [reproducible research guide](https://wanggroup.org/orientation/reproducible-research.html) for notebook and commit conventions\n", + "---\n", "\n", - "---\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 6. Analysis Overview\n", + "## What to Do Next\n", "\n", - "The FunGen-xQTL protocol is modular. Each numbered pipeline is a self-contained SoS notebook that can run independently or as part of the full xQTL workflow. 
At a high level, the flow is:\n", + "::::{grid} 1 2 2 2\n", + ":gutter: 3\n", "\n", - "**Preprocess inputs \u2192 Discover QTLs \u2192 Fine-map & integrate \u2192 Report & share**\n", + ":::{grid-item-card} \ud83d\udd2c Preprocess your data\n", + "`1_phenotype_preprocessing.ipynb`\n", + "`2_genotype_preprocessing.ipynb`\n", + "`4_covariates_preprocessing.ipynb`\n", + ":::\n", "\n", - "| Stage | Pipelines | What happens |\n", - "|---|---|---|\n", - "| **1. Phenotype preprocessing** | `1_phenotype_preprocessing.ipynb` | QC, normalization, batch correction for bulk RNA-seq, proteomics, methylation, or single-cell pseudo-bulk |\n", - "| **2. Genotype preprocessing** | `2_genotype_preprocessing.ipynb` | Variant-level QC, imputation, ancestry alignment, PCA |\n", - "| **3. Covariate preprocessing** | `4_covariates_preprocessing.ipynb` | Known + hidden covariates (PEER, surrogate variables) |\n", - "| **4. QTL discovery** | `TensorQTL.ipynb`, `APEX.ipynb`, `1_xqtl_association.ipynb` | Cis/trans scans, interaction QTLs, trans-eQTL screens |\n", - "| **5. Fine-mapping & multivariate** | `SuSiE.ipynb`, `mvSuSiE.ipynb`, `fSuSiE.ipynb` | Credible sets across contexts and tissues |\n", - "| **6. Colocalization & integration** | `coloc.ipynb`, `cTWAS.ipynb`, `GWAS_integration.ipynb` | Link xQTLs to GWAS and causal gene nomination |\n", - "\n", - "All pipelines share a common config layout, so once you've learned one you can read the others quickly.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 7. Downloading Example Data\n", + ":::{grid-item-card} \ud83e\udded Discover QTLs\n", + "`TensorQTL.ipynb`\n", + "`1_xqtl_association.ipynb`\n", + "`APEX.ipynb`\n", + ":::\n", "\n", - "A small demo dataset lives on [Synapse](https://www.synapse.org/#!Synapse:syn36416559). 
You'll need a free Synapse account, then install the client (already available via pixi):\n", + ":::{grid-item-card} \ud83c\udfaf Fine-map\n", + "`SuSiE.ipynb`\n", + "`mvSuSiE.ipynb`\n", + "`fSuSiE.ipynb`\n", + ":::\n", "\n", - "```bash\n", - "# One-time install (uses pixi's python env)\n", - "pixi global install -c conda-forge --environment python synapseclient\n", + ":::{grid-item-card} \ud83d\udd17 Integrate with GWAS\n", + "`coloc.ipynb`\n", + "`cTWAS.ipynb`\n", + "`GWAS_integration.ipynb`\n", + ":::\n", "\n", - "# Log in (prompts for your Synapse PAT)\n", - "synapse login -p\n", + "::::\n", "\n", - "# Download the demo bundle (~2 GB)\n", - "synapse get -r syn36416559 --downloadLocation data/example/\n", - "```\n", + "Full documentation: [statfungen.github.io/xqtl-protocol](https://statfungen.github.io/xqtl-protocol/).\n", "\n", - "For the full ROSMAP / MSBB production data (controlled access), request access through [AD Knowledge Portal](https://adknowledgeportal.synapse.org/) first.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 8. Troubleshooting\n", "\n", - "A handful of issues come up often. If one of these matches, try the fix before opening an issue.\n", + "---\n", "\n", - "**`pixi: command not found` after install**\n", + "## Troubleshooting\n", "\n", + ":::{dropdown} `pixi: command not found` after install\n", + "Open a new terminal, or re-source your shell rc file:\n", "```bash\n", "source ~/.bashrc # Linux / HPC\n", "source ~/.zshrc # macOS\n", - "# or open a fresh terminal\n", "```\n", + ":::\n", "\n", - "**Installer gets killed on HPC**\n", - "Request more memory (\u2265 50 GB) and run on a compute node, not the login node:\n", - "\n", + ":::{dropdown} Installer killed on HPC\n", + "You're running on a login node. 
Request a compute node with at least 50 GB of memory and re-run the installer:\n", "```bash\n", "srun --mem=50G --pty bash\n", "bash pixi-setup.sh\n", "```\n", + ":::\n", "\n", - "**`sos: command not found`**\n", - "SoS was not installed on top of pixi. Re-run step 2.3.\n", - "\n", - "**R library conflicts**\n", - "Conda-forge R packages do not mix well with `install.packages()` builds. Prefer `pixi global install --environment r-base r-<package>`. If you must use CRAN, stick to pure-R packages.\n", - "\n", - "**`ModuleNotFoundError` during a pipeline**\n", - "Install the missing package into pixi's `python` env:\n", + ":::{dropdown} `sos: command not found`\n", + "Step 2 didn't complete. Re-run the `pixi global install` command and make sure `jupyter kernelspec list` shows the `sos` kernel.\n", + ":::\n", "\n", + ":::{dropdown} `ModuleNotFoundError` during a pipeline\n", + "Install the missing package into pixi's python environment:\n", "```bash\n", "pixi global install -c conda-forge --environment python <package>\n", "```\n", + ":::\n", "\n", - "**GitHub is unreachable from your network**\n", - "Use the Gitee mirror documented at [wanggroup.org/hpc/docs/software-setup-conda](https://wanggroup.org/hpc/docs/software-setup-conda) (see \"Users in China\").\n", - "\n", - "**File permissions on shared directories**\n", - "Add `umask 002` to your `~/.bashrc` so new files are group-writable.\n", - "\n", - "---\n", - "\n", - "## 9. Running on HPC\n", + ":::{dropdown} R package conflicts or install failures\n", + "Prefer conda-forge R packages over `install.packages()`:\n", + "```bash\n", + "pixi global install --environment r-base r-<package>\n", + "```\n", + "Mixing CRAN builds with conda R leads to ABI mismatches \u2014 avoid it.\n", + ":::\n", "\n", - "The protocol plays nicely with SoS's built-in task queue support for **SLURM**, **LSF**, **SGE**, and **PBS/Torque**. 
See `pipeline/Job_Example.ipynb` for a template and the [SoS task queue docs](https://vatlab.github.io/sos-docs/doc/user_guide/task_statement.html) for configuration.\n", + ":::{dropdown} Still stuck?\n", + "[Open an issue](https://github.com/StatFunGen/xqtl-protocol/issues) with the command you ran and the full error output.\n", + ":::\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Analysis Overview\n", "\n", - "A typical SLURM submission looks like:\n", + "The protocol is modular. Each numbered pipeline is a self-contained SoS notebook that can run independently or chained together.\n", "\n", - "```bash\n", - "sos run pipeline/TensorQTL.ipynb cis \\\n", - " --genotype-file ... --phenotype-file ... --covariate-file ... \\\n", - " --cwd output/run01 \\\n", - " -q slurm -c config/hpc_slurm.yml \\\n", - " -J 20 # up to 20 concurrent jobs\n", - "```\n", + "::::{grid} 1 2 2 2\n", + ":gutter: 3\n", "\n", - "The `config/hpc_slurm.yml` file controls partitions, walltime, and memory per task. Start from the template in the repo and adapt it to your cluster.\n", + ":::{grid-item-card} 1. Preprocess\n", + "`1_phenotype_preprocessing.ipynb` \u2014 QC, normalization\n", + "`2_genotype_preprocessing.ipynb` \u2014 variant QC, imputation\n", + "`4_covariates_preprocessing.ipynb` \u2014 PEER / hidden factors\n", + ":::\n", "\n", - "---\n", + ":::{grid-item-card} 2. Discover\n", + "`TensorQTL.ipynb` \u2014 cis/trans scans\n", + "`APEX.ipynb` \u2014 interaction QTLs\n", + "`1_xqtl_association.ipynb` \u2014 end-to-end wrapper\n", + ":::\n", "\n", - "## 10. Citing the Protocol\n", + ":::{grid-item-card} 3. 
Fine-map\n", + "`SuSiE.ipynb` \u2014 single-context credible sets\n", + "`mvSuSiE.ipynb` \u2014 multi-context\n", + "`fSuSiE.ipynb` \u2014 functional annotations\n", + ":::\n", "\n", - "If you use the FunGen-xQTL protocol in a publication, please cite:\n", + ":::{grid-item-card} 4. Integrate\n", + "`coloc.ipynb` \u2014 colocalization with GWAS\n", + "`cTWAS.ipynb` \u2014 causal TWAS\n", + "`GWAS_integration.ipynb` \u2014 joint reporting\n", + ":::\n", "\n", - "> Cao *et al.*, *A computational protocol for molecular QTL analysis integrating GWAS.* See [CITATION.md](https://github.com/StatFunGen/xqtl-protocol/blob/main/CITATION.md) in the repository for the current preferred citation and BibTeX entry.\n", + "::::\n", "\n", - "And drop us a line \u2014 we love hearing how the protocol is being used.\n" + "All pipelines share a common config layout, so once you know one you can read the rest.\n" ] }, {
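Because every pipeline follows the same `sos run pipeline/<notebook>.ipynb <workflow> [options]` invocation pattern, a thin shell wrapper can make exploratory runs less error-prone. The sketch below is illustrative only: `run_step` is a hypothetical helper, not part of the repository, and the `--genotype-file` value is a placeholder path; the `cis` workflow name and the `--cwd`/`output/` convention mirror the TensorQTL example elsewhere in this protocol.

```bash
# Hypothetical convenience wrapper (not part of the repo) illustrating the
# uniform invocation pattern shared by all pipelines:
#   sos run pipeline/<notebook>.ipynb <workflow> [options]
run_step() {
    local notebook="$1" workflow="$2"
    shift 2
    # Rebuild the positional parameters as the full command line
    set -- sos run "pipeline/${notebook}.ipynb" "$workflow" --cwd "output/${notebook}" "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"   # preview the command without needing sos installed
    else
        "$@"        # actually run it
    fi
}

# Preview a TensorQTL cis scan (paths are placeholders)
DRY_RUN=1 run_step TensorQTL cis --genotype-file data/geno.bed
# prints: sos run pipeline/TensorQTL.ipynb cis --cwd output/TensorQTL --genotype-file data/geno.bed
```

Task-queue options from the SoS documentation (for example `-q slurm -c config/hpc_slurm.yml -J 20`) can be appended through the same trailing arguments.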