From ce19699c8b54189e9d52b4d9407983ec06436207 Mon Sep 17 00:00:00 2001
From: Maxime Grenu
Date: Wed, 18 Mar 2026 12:24:36 +0100
Subject: [PATCH 1/4] docs: update spark-install guide with real-world findings

- Remove WIP notice (content is complete and tested)
- Add one-command install via nvidia.com/nemoclaw.sh
- Clarify Spark hardware details (aarch64, Grace + GB10)
- Add known issues from real deployments: pip system packages,
  port 3000 conflict with AI Workbench, network policy for
  NVIDIA cloud API
- Add section on running local LLMs with llama.cpp on GB10
- Note that NIM containers are amd64-only on Spark
- Update architecture diagram with unified memory and local inference

Signed-off-by: Maxime Grenu
---
 spark-install.md | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/spark-install.md b/spark-install.md
index 7974cba04..7c9006b00 100644
--- a/spark-install.md
+++ b/spark-install.md
@@ -12,10 +12,10 @@
 
 ## Quick Start
 
 ```bash
-# Install OpenShell:
-curl -LsSf https://raw.githubusercontent.com/NVIDIA/OpenShell/main/install.sh | sh
+# One-command install
+curl -fsSL https://nvidia.com/nemoclaw.sh | sudo bash
 
-# Clone NemoClaw:
+# Or clone and install manually
 git clone https://github.com/NVIDIA/NemoClaw.git
 cd NemoClaw
@@ -31,7 +31,7 @@ curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
 
 ## What's Different on Spark
 
-DGX Spark ships **Ubuntu 24.04 + Docker 28.x** but no k8s/k3s. OpenShell embeds k3s inside a Docker container, which hits two problems on Spark:
+DGX Spark ships **Ubuntu 24.04 (Noble) + Docker 28.x/29.x** on **aarch64 (Grace CPU + GB10 GPU)** but no k8s/k3s. OpenShell embeds k3s inside a Docker container, which hits two problems on Spark:
 
 ### 1. Docker permissions
 
@@ -99,6 +99,9 @@ nemoclaw onboard
 | CoreDNS CrashLoop after setup | Fixed in `fix-coredns.sh` | Uses container gateway IP, not 127.0.0.11 |
 | Image pull failure (k3s can't find built image) | OpenShell bug | `openshell gateway destroy && openshell gateway start`, re-run setup |
 | GPU passthrough | Untested on Spark | Should work with `--gpu` flag if NVIDIA Container Toolkit is configured |
+| `pip install` fails with system packages | Known | Use `--break-system-packages` or a venv for Python packages inside the sandbox |
+| Port 3000 conflict with AI Workbench | Known | AI Workbench Traefik proxy uses port 3000 (and 10000); use a different port for other services |
+| Network policy blocks NVIDIA cloud API | By design | Ensure `integrate.api.nvidia.com` is in the sandbox network policy if using cloud inference |
 
 ## Verifying Your Install
 
@@ -116,13 +119,33 @@ nemoclaw-start
 openclaw agent --agent main --local -m 'hello' --session-id test
 openshell term
 ```
+
+## Using Local LLMs
+
+DGX Spark has 128 GB unified memory shared between CPU and GPU. You can run local models alongside the sandbox:
+
+```bash
+# Build llama.cpp for GB10 (sm_121)
+git clone https://github.com/ggml-org/llama.cpp.git
+cd llama.cpp
+PATH=/usr/local/cuda/bin:$PATH cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=121
+cmake --build build --config Release -j$(nproc)
+
+# Run a model (e.g. Nemotron-3-Super-120B Q4_K_M ~78 GB)
+./build/bin/llama-server --model <model.gguf> --host 0.0.0.0 --port 8000 \
+  --n-gpu-layers 999 --ctx-size 32768
+```
+
+Then configure your sandbox to use the local model by updating `~/.openclaw/openclaw.json` inside the sandbox with the local provider URL.
+
+> **Note**: NIM containers for most models are amd64-only and will not run on the Spark's aarch64 architecture. Use GGUF models with llama.cpp instead.
+
 ## Architecture Notes
 
 ```text
-DGX Spark (Ubuntu 24.04, cgroup v2)
-  └── Docker (28.x, cgroupns=host)
-        └── OpenShell gateway container
-              └── k3s (embedded)
-                    └── nemoclaw sandbox pod
-                          └── OpenClaw agent + NemoClaw plugin
+DGX Spark (Ubuntu 24.04, aarch64, cgroup v2, 128 GB unified memory)
+  ├── Docker (28.x/29.x, cgroupns=host)
+  │     └── OpenShell gateway container (k3s embedded)
+  │           └── nemoclaw sandbox pod
+  │                 └── OpenClaw agent + NemoClaw plugin
+  └── llama-server (optional, local inference on GB10 GPU)
 ```

From 9d629a2dc6f7b31f4a10a669632e4afbba7e0392 Mon Sep 17 00:00:00 2001
From: Maxime Grenu
Date: Wed, 18 Mar 2026 13:25:33 +0100
Subject: [PATCH 2/4] docs: correct NIM arm64 compatibility statement

Not all NIM containers are amd64-only. Some models (e.g.,
Nemotron-3-Super-120B-A12B) ship native arm64 images that run on
DGX Spark. Clarify the note to reflect this and recommend checking
image architecture before pulling.

Signed-off-by: Maxime Grenu
---
 spark-install.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spark-install.md b/spark-install.md
index 7c9006b00..a56b6eda0 100644
--- a/spark-install.md
+++ b/spark-install.md
@@ -137,7 +137,7 @@ cmake --build build --config Release -j$(nproc)
 
 Then configure your sandbox to use the local model by updating `~/.openclaw/openclaw.json` inside the sandbox with the local provider URL.
 
-> **Note**: NIM containers for most models are amd64-only and will not run on the Spark's aarch64 architecture. Use GGUF models with llama.cpp instead.
+> **Note**: Some NIM containers (e.g., Nemotron-3-Super-120B-A12B) ship native arm64 images and run on the Spark. However, many NIM images are amd64-only and will fail with `exec format error`. Check the image architecture before pulling. GGUF models with llama.cpp are a reliable alternative for models without arm64 NIM support.
 
 ## Architecture Notes
 

From 7c87489e63d358ec Mon Sep 17 00:00:00 2001
From: Maxime Grenu
Date: Wed, 18 Mar 2026 14:44:44 +0100
Subject: [PATCH 3/4] docs: address review feedback on spark-install

- Prioritise venv over --break-system-packages in known issues
- Add explicit JSON example for local provider config
- Add note about egress proxy and inference.local
- Add language identifier to architecture code block (MD040)

Signed-off-by: Maxime Grenu
---
 spark-install.md | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/spark-install.md b/spark-install.md
index a56b6eda0..5d3ff6f73 100644
--- a/spark-install.md
+++ b/spark-install.md
@@ -99,7 +99,7 @@ nemoclaw onboard
 | CoreDNS CrashLoop after setup | Fixed in `fix-coredns.sh` | Uses container gateway IP, not 127.0.0.11 |
 | Image pull failure (k3s can't find built image) | OpenShell bug | `openshell gateway destroy && openshell gateway start`, re-run setup |
 | GPU passthrough | Untested on Spark | Should work with `--gpu` flag if NVIDIA Container Toolkit is configured |
-| `pip install` fails with system packages | Known | Use `--break-system-packages` or a venv for Python packages inside the sandbox |
+| `pip install` fails with system packages | Known | Use a venv (recommended) or `--break-system-packages` (last resort, can break system tools) |
 | Port 3000 conflict with AI Workbench | Known | AI Workbench Traefik proxy uses port 3000 (and 10000); use a different port for other services |
 | Network policy blocks NVIDIA cloud API | By design | Ensure `integrate.api.nvidia.com` is in the sandbox network policy if using cloud inference |
 
@@ -135,7 +135,27 @@ cmake --build build --config Release -j$(nproc)
   --n-gpu-layers 999 --ctx-size 32768
 ```
 
-Then configure your sandbox to use the local model by updating `~/.openclaw/openclaw.json` inside the sandbox with the local provider URL.
+Then configure your sandbox to use the local model by updating `~/.openclaw/openclaw.json` inside the sandbox:
+
+```json
+{
+  "models": {
+    "providers": {
+      "local": {
+        "baseUrl": "http://host.containers.internal:8000/v1",
+        "apiKey": "not-needed",
+        "api": "openai-completions",
+        "models": [{ "id": "my-model", "name": "Local Model" }]
+      }
+    }
+  },
+  "agents": {
+    "defaults": { "model": { "primary": "local/my-model" } }
+  }
+}
+```
+
+> **Note**: The sandbox egress proxy blocks direct access to the host network. Use `inference.local` with `"apiKey": "openshell-managed"` if your model is configured via NIM or `nemoclaw setup-spark`.
 
 > **Note**: Some NIM containers (e.g., Nemotron-3-Super-120B-A12B) ship native arm64 images and run on the Spark. However, many NIM images are amd64-only and will fail with `exec format error`. Check the image architecture before pulling. GGUF models with llama.cpp are a reliable alternative for models without arm64 NIM support.
 

From c7d7cffa919cea97d165559f125410698f4c204b Mon Sep 17 00:00:00 2001
From: Maxime Grenu
Date: Wed, 18 Mar 2026 18:56:11 +0100
Subject: [PATCH 4/4] docs: address review feedback and add dashboard section

- Replace pipe-to-shell with download-then-review install flow
- Add Web Dashboard section with OpenClaw Control UI instructions
- Document 127.0.0.1 vs localhost origin requirement
- Note external dashboard limitation with link to upstream issue

Signed-off-by: Maxime Grenu
---
 spark-install.md | 18 ++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/spark-install.md b/spark-install.md
index 5d3ff6f73..e5b0d90da 100644
--- a/spark-install.md
+++ b/spark-install.md
@@ -12,8 +12,10 @@ ## Quick Start
 
 ```bash
-# One-command install
-curl -fsSL https://nvidia.com/nemoclaw.sh | sudo bash
+# Download and review the installer before running
+curl -fsSL https://nvidia.com/nemoclaw.sh -o nemoclaw-install.sh
+less nemoclaw-install.sh  # review the script
+sudo bash nemoclaw-install.sh
 
 # Or clone and install manually
 git clone https://github.com/NVIDIA/NemoClaw.git
 cd NemoClaw
@@ -119,6 +121,18 @@ nemoclaw-start
 openclaw agent --agent main --local -m 'hello' --session-id test
 openshell term
 ```
+
+## Web Dashboard
+
+The OpenClaw gateway includes a built-in web UI. Access it at:
+
+```
+http://127.0.0.1:18789/#token=<token>
+```
+
+Find your gateway token in `~/.openclaw/openclaw.json` under `gateway.auth.token` inside the sandbox.
+
+> **Important**: Use `127.0.0.1` (not `localhost`) — the gateway's origin check requires an exact match. External dashboards like Mission Control cannot currently connect due to the gateway resetting `controlUi.allowedOrigins` on every config reload (see [openclaw#49950](https://github.com/openclaw/openclaw/issues/49950)).
 
 ## Using Local LLMs
 
 DGX Spark has 128 GB unified memory shared between CPU and GPU. You can run local models alongside the sandbox: