
docs: add known issues, hardware details, and NIM warnings to spark-install guide #885

Merged
cv merged 2 commits into NVIDIA:main from cluster2600:docs/spark-known-issues on Mar 26, 2026



@cluster2600 cluster2600 commented Mar 25, 2026

Summary

Cherry-picks the documentation improvements from #304 that reviewers confirmed are worth merging:

  • Three known issues: pip system packages error, port 3000 AI Workbench conflict, NVIDIA cloud API egress
  • Web Dashboard access section
  • NIM arm64 compatibility warning
  • Hardware details (aarch64, 128 GB unified memory, Docker 29.x)
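
The port 3000 conflict listed above can be checked before an install is started. The following is a minimal sketch, assuming bash (it relies on the bash-only `/dev/tcp` pseudo-device); `port_in_use` is a hypothetical helper name, not something the guide defines. It only reports whether something accepts TCP connections on the port, not which process owns it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the guide): succeeds if something is
# accepting TCP connections on 127.0.0.1:<port>.
port_in_use() {
  local port="$1"
  # Opening /dev/tcp/<host>/<port> makes bash attempt a TCP connect.
  # The subshell closes the descriptor again as soon as it exits.
  (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null
}

if port_in_use 3000; then
  echo "port 3000 is already in use"
else
  echo "port 3000 looks free"
fi
```

A connect-based probe was chosen here because it needs no extra tools; identifying the owning process would require something like `ss` or `lsof`.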

Excludes the install-flow change (handled in #696) and manual openclaw.json editing (needs validation against current gateway config flow).

Supersedes #304.

Test plan

  • Documentation renders correctly
  • No broken links

Summary by CodeRabbit

  • Documentation
    • Updated DGX Spark hardware specifications and Docker version details.
    • Extended Known Issues section with troubleshooting for pip install failures, port 3000 conflicts, and network policy configurations.
    • Added Technical Reference section with OpenClaw gateway setup guidance.
    • Added NIM Compatibility notes for arm64 architecture limitations.
    • Updated architecture diagram to reflect current specifications.

…nstall guide

Cherry-pick documentation improvements confirmed by reviewers in NVIDIA#304:
- Three additional known issues (pip system packages, port 3000 AI Workbench
  conflict, NVIDIA cloud API egress)
- Web Dashboard access section
- NIM arm64 compatibility warning
- Hardware details (aarch64, 128 GB unified memory, Docker 29.x)

Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>

@prekshivyas prekshivyas left a comment


Good additions — the new known issues (pip, port 3000, network policy), web dashboard section, and NIM arm64 compatibility warning are all practical, hardware-tested findings.

This needs a rebase on main since #857 (a restructure of spark-install.md) was just merged. The new sections should slot into #857's structure:

  • Known issues rows → Troubleshooting section
  • Web dashboard + NIM arm64 → Technical Reference section

Slot PR NVIDIA#885 content into NVIDIA#857 structure per reviewer:
known-issue rows to Troubleshooting, Web Dashboard and
NIM arm64 to Technical Reference, hardware details to
Prerequisites and Architecture diagram.

coderabbitai bot commented Mar 26, 2026

Walkthrough

Updated spark-install.md to specify Docker versions 28.x/29.x, document DGX Spark hardware specifications (Ubuntu 24.04, aarch64, Grace CPU + GB10 GPU, 128 GB unified memory), and add new sections covering known issues, technical reference guidance, and NIM arm64 compatibility constraints.
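
The Docker 28.x/29.x detail in the walkthrough can be turned into a quick pre-flight check. This is a minimal sketch: `docker_major_supported` is a hypothetical helper, and the version strings in the loop are made-up samples rather than versions named in the guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a Docker version string has major
# version 28 or 29, the two release lines named in the walkthrough.
docker_major_supported() {
  local version="$1"
  local major="${version%%.*}"   # strip everything from the first dot
  case "$major" in
    28|29) return 0 ;;
    *)     return 1 ;;
  esac
}

# Usage on a real host (requires a running Docker daemon):
#   docker_major_supported "$(docker version --format '{{.Server.Version}}')"
for v in 29.1.0 28.0.4 27.5.1; do
  if docker_major_supported "$v"; then
    echo "$v: within 28.x/29.x"
  else
    echo "$v: outside 28.x/29.x"
  fi
done
```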

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation Updates: `spark-install.md` | Added Docker version specifications, extended Known Issues table with pip install failures, port 3000 conflicts, and network policy blocking; introduced Technical Reference section with OpenClaw gateway details and explicit IP guidance; added NIM Compatibility on arm64 section documenting amd64-only image failures and alternatives. |
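
The arm64 compatibility note summarized above comes down to whether the host architecture matches what an image publishes. A minimal sketch of the host-side check follows; `container_arch` is a hypothetical helper, and `IMAGE` in the trailing comment is a placeholder, not a real image name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map `uname -m` output to the platform names that
# container registries use (arm64 / amd64).
container_arch() {
  case "$1" in
    aarch64|arm64) echo "arm64" ;;
    x86_64|amd64)  echo "amd64" ;;
    *)             echo "unknown" ;;
  esac
}

host_arch="$(container_arch "$(uname -m)")"
echo "host container architecture: ${host_arch}"

if [ "$host_arch" = "arm64" ]; then
  echo "amd64-only images are expected to fail on this host"
fi

# To see which platforms an image publishes (IMAGE is a placeholder):
#   docker manifest inspect IMAGE | grep '"architecture"'
```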

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Poem

🐰 A doc update hops along,
With Docker specs and fixes strong,
DGX Spark shines in noble light,
arm64 paths made crystal bright! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title directly and accurately reflects the main changes: adding known issues, hardware details, and NIM compatibility warnings to the spark-install documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@spark-install.md`:
- Around line 211-213: The doc currently references "~/.openclaw/openclaw.json
inside the sandbox" without showing how to fetch the gateway token; update the
text to explicitly tell users how to retrieve gateway.auth.token by (a) running
jq against /sandbox/.openclaw/openclaw.json from inside the sandbox to extract
gateway.auth.token, or (b) from the host using the sandbox download command
(e.g. openshell sandbox download) to fetch /sandbox/.openclaw/openclaw.json and
then extract gateway.auth.token; retain the note to use 127.0.0.1 for origin
checks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fe549f0b-7f5b-49b1-9042-34666955a8a7

📥 Commits

Reviewing files that changed from the base of the PR and between 2804eae and c1cb961.

📒 Files selected for processing (1)
  • spark-install.md

Comment on lines +211 to +213
Find your gateway token in `~/.openclaw/openclaw.json` under `gateway.auth.token` inside the sandbox.

> **Important**: Use `127.0.0.1` (not `localhost`) — the gateway's origin check requires an exact match. External dashboards like Mission Control cannot currently connect due to the gateway resetting `controlUi.allowedOrigins` on every config reload (see [openclaw#49950](https://github.com/openclaw/openclaw/issues/49950)).
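
Because the origin check described in the note requires an exact match, a dashboard URL can be validated before it is opened. This is a minimal sketch: both helper names are hypothetical, the `http` scheme is an assumption, and port 18789 is taken from the search pattern used elsewhere in this review rather than from the quoted lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the host part of a URL.
url_host() {
  local rest="${1#*://}"   # drop the scheme
  rest="${rest%%/*}"       # drop the path
  echo "${rest%%:*}"       # drop the port
}

# Hypothetical helper: report whether the host is the literal 127.0.0.1.
check_dashboard_host() {
  if [ "$(url_host "$1")" = "127.0.0.1" ]; then
    echo "ok"
  else
    echo "use 127.0.0.1"
  fi
}

check_dashboard_host "http://127.0.0.1:18789/"   # prints: ok
check_dashboard_host "http://localhost:18789/"   # prints: use 127.0.0.1
```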

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify how token/config location is documented elsewhere in the repo.
# Expected: consistent guidance for openclaw.json location and token retrieval context.

rg -n -C2 'openclaw\.json|gateway\.auth\.token|127\.0\.0\.1:18789|Web Dashboard|Mission Control'

Repository: NVIDIA/NemoClaw

Length of output: 46323


Specify the command or method to retrieve the gateway token for host users.

Line 211 mixes host path notation (~/.openclaw/) with "inside the sandbox," leaving users unclear on how to actually obtain the token for the Web Dashboard URL. Replace the vague reference with an explicit retrieval method. For example:

To retrieve your gateway token, run this command inside the sandbox:
jq -r '.gateway.auth.token' /sandbox/.openclaw/openclaw.json

Or use openshell sandbox download /sandbox/.openclaw/openclaw.json from the host if the sandbox supports it.
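
The `jq` command suggested above can be wrapped so that a missing file or an absent token fails loudly instead of printing `null`. This is a minimal sketch, assuming `jq` is installed; `read_gateway_token` is a hypothetical helper, and the demo file and its token value are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print gateway.auth.token from a config file, failing
# if the file is unreadable or the field is absent.
read_gateway_token() {
  local config="$1"
  [ -r "$config" ] || { echo "cannot read $config" >&2; return 1; }
  # -e makes jq exit non-zero when the result is null or false.
  jq -e -r '.gateway.auth.token' "$config"
}

# Demo against a throwaway file; the token value is made up.
demo="$(mktemp)"
printf '{"gateway":{"auth":{"token":"example-token"}}}\n' > "$demo"
read_gateway_token "$demo"
rm -f "$demo"

# Inside the sandbox the call would be:
#   read_gateway_token /sandbox/.openclaw/openclaw.json
```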



Good point — updated to show the explicit jq command for extracting the token from inside the sandbox. The ~/.openclaw/ path was misleading since the config lives at /sandbox/.openclaw/openclaw.json inside the container.

@cv cv merged commit a1b86a5 into NVIDIA:main Mar 26, 2026
6 of 7 checks passed
@cluster2600 cluster2600 deleted the docs/spark-known-issues branch March 27, 2026 06:17