diff --git a/.gitignore b/.gitignore index 3ea6ac4..d20401b 100644 --- a/.gitignore +++ b/.gitignore @@ -8,6 +8,14 @@ build/ .eggs/ *.so .pytest_cache/ +.pytest_tmp/ +pytest_tmp*/ +.tmp-pytest*/ +.pytest-*/ +.codex-pytest-tmp/ +.codex-test-tmp*/ +pytest-of-*/ +tmp*/ .mypy_cache/ .ruff_cache/ *.db @@ -32,3 +40,4 @@ docs/DEVTO_ARTICLE.md docs/DEVTO_FINAL.md benchmark_results.md .coverage +/setup-*.png diff --git a/CHANGELOG.md b/CHANGELOG.md index 4388ebb..510efa4 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,49 @@ # Changelog +## [0.34.1] - 2026-04-28 + +### Added +- Music Producer Studio mission and `nvh studio --install music -y` bundle + with ACE-Step, Demucs/WhisperX/audio lab tooling, and Audacity/LMMS + AppImage helpers. +- Setup wizard mission cards for AI Starter, Graphics Creator Studio, Game + Dev Lab, Music Producer Studio, Agent Builder, Local LLM Lab, and Power + User Workstation. +- GPU detection diagnostics that distinguish CPU-only hosts from rootless + sessions where NVIDIA devices exist but NVML or `nvidia-smi` are blocked. +- Boot preflight tracking for GPU architecture, compute capability, framebuffer + memory, Node/npm versions, storage capacity, storage write probes, and the + selected PyTorch CUDA profile. + +### Changed +- Setup model recommendations now use the same GPU inventory path as the + system API, including aggregate multi-GPU framebuffer totals. +- Persistent storage checks now perform a real write/fsync/delete probe instead + of relying only on `os.access`. +- The setup wizard surfaces real software icons and shorter mission-first + language to reduce first-run wall-of-text fatigue. +- Setup wizard mission cards stay mission-first while storage, GPU, catalog, + and compatibility scans continue in the background. +- `nvh webui` now builds and starts the optimized production WebUI by default, + with `--dev` reserved for contributors editing the frontend. 
+- Pip/binary WebUI bootstrap now falls back to downloading the GitHub source + archive when `git` is missing, so fresh rootless desktops have one less + prerequisite. +- WebUI bootstrap now prefers the installed release tag before falling back to + `main`, keeping PyPI/binary installs aligned with their shipped version. +- `nvh workstation --all -y` now passes the non-interactive yes flag through to + the rootless Node installer instead of pausing for confirmation. + +### Fixed +- Sorted the generated fallback catalog import block so the Python CI lint + matrix can pass. +- Model fit reports now check the total recommended model queue against + available persistent storage, not only one model at a time. +- WebUI API health and storage preflight now retry during slow API startup + instead of leaving the setup page stuck offline until a manual reload. +- API CORS now allows localhost, 127.0.0.1, nvhive, and IPv6 loopback WebUI + fallback ports so dynamic local previews can reach the API. + ## [0.34.0] - 2026-04-28 ### Added diff --git a/README.md b/README.md index b418e5c..70a413d 100644 --- a/README.md +++ b/README.md @@ -1,16 +1,28 @@ # nvHive -**One command. Every AI model you have. Automatically assembled into the best team for each task.** +**A rootless NVIDIA AI lab for students, creators, agents, ComfyUI, and local models.** -![version](https://img.shields.io/badge/version-0.34.0-blue) ![python](https://img.shields.io/badge/python-3.11%2B-blue) ![license](https://img.shields.io/badge/license-MIT-green) ![ci](https://img.shields.io/badge/CI-Linux%20%7C%20Windows%20%7C%20macOS-blue) +![version](https://img.shields.io/badge/version-0.34.1-blue) ![python](https://img.shields.io/badge/python-3.11%2B-blue) ![license](https://img.shields.io/badge/license-MIT-green) ![ci](https://img.shields.io/badge/CI-Linux%20%7C%20Windows%20%7C%20macOS-blue) -```bash -nvh "What is a binary search tree?" 
# → answers (single best advisor) -nvh "Fix the timeout bug in council.py" # → auto-detects coding task → agent mode -nvh "Should we use Redis or Postgres?" # → auto-detects debate → council (3+ advisors) -nvh "take a screenshot and describe my desktop" # → desktop agent (vision + tools) -nvh "setup comfyui" # → agent installs, configures, launches -``` +nvHive turns a fresh cloud Linux GPU desktop into a ready-to-use AI workstation +without `sudo`: it finds persistent storage, installs into user-owned paths, +opens a setup wizard, recommends models for the detected GPU, and gives students +one-click paths for local LLMs, ComfyUI, agents, creative tools, game-dev tools, +and music production. + +What you get on the happy path: + +- A desktop launcher and WebUI setup wizard. +- Persistent `NVH_HOME` storage for models, ComfyUI, apps, logs, jobs, and config. +- GPU-aware model recommendations and disk estimates. +- Mission cards for AI Starter, Graphics Creator, Game Dev, Music Producer, and Agent Builder. +- Self-healing checks for storage, Python, Node, CUDA, drivers, boot drift, and install receipts. +- Redacted error reports with request IDs when something needs debugging. + +Release status: CI is green across Linux, Windows, and macOS. nvHive should be +treated as a production candidate until the +[target NVIDIA Linux VM checklist](docs/PRODUCTION_READINESS.md) passes on the +actual no-root GPU desktop.

nvHive CLI @@ -20,9 +32,39 @@ nvh "setup comfyui" # → agent installs, configure ## Install -Three ways to get nvHive — pick the one that matches your setup. No Docker, no container runtime, no root required. +Start with the Linux GPU desktop path if you are on a cloud workstation or +GeForce NOW-style session. Other install paths are below for existing Python +environments, local laptops, and single-file binary installs. No Docker, no +container runtime, and no root access are required for nvHive itself. + +For cloud desktops, large downloads should live on the persistent block-backed +mount, not the read-only OS disk. The launcher finds the best candidate +automatically. If you already know the mount path, set `NVH_HOME` first: -### Option 1 — One-line installer (recommended for GPU VMs) +```bash +export NVH_HOME=/mnt/persist/nvhive +``` + +### Recommended - Launch a Linux GPU desktop lab + +This is the easiest path for cloud Linux GPU sessions where only a mounted file +volume survives reconnects: + +```bash +curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvHive/main/start-linux.sh | bash +``` + +The launcher auto-detects a likely persistent block-backed mount, sets +`NVH_HOME`, installs nvHive rootlessly if needed, creates the desktop launcher, +starts the API/WebUI, and opens the setup wizard. If Python is missing, set +`NVH_USE_BINARY=1` and the same launcher downloads the single-file Linux binary +instead of creating a venv. + +```bash +curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvHive/main/start-linux.sh | NVH_USE_BINARY=1 bash +``` + +### General Linux installer ```bash curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvHive/main/install.sh | bash @@ -30,12 +72,14 @@ curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvHive/main/install.sh Works on any Linux box with no root. 
Installs to `NVH_HOME` when set, otherwise `~/.nvh/` for new installs, uses Python `venv` + `pip` by default, offers a rootless micromamba fallback only when the cloud image needs it, pulls Ollama if you have an NVIDIA GPU, and writes a sensible default config. +If `NVH_HOME` is not set, the installer now checks common persistent mount roots such as `/mnt`, `/media/$USER`, `/workspace`, `/data`, `/persistent`, and `/storage` before falling back to `~/.nvh`. + Windows: `iwr -useb https://raw.githubusercontent.com/thatcooperguy/nvHive/main/install.ps1 | iex` macOS: `curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvHive/main/install-mac.sh | bash` -### Option 2 — Single-file binary (no Python needed) +### Single-file binary (no Python needed) -Fully standalone. No Python install, no pip, no venv. Click your OS: +Fully standalone. No Python install, no pip, no venv. Download the asset, make it executable, and run `nvh workstation --launch`. Click your OS:

@@ -51,9 +95,21 @@ Fully standalone. No Python install, no pip, no venv. Click your OS:

-On Linux/macOS after download: `chmod +x nvh-* && ./nvh-*`. Full asset list (wheel, sdist, checksums) lives on the [Releases page](https://github.com/thatcooperguy/nvHive/releases/latest). +Linux terminal path: + +```bash +mkdir -p "$HOME/.local/bin" +curl -fL https://github.com/thatcooperguy/nvHive/releases/latest/download/nvh-linux-x86_64 -o "$HOME/.local/bin/nvh" +chmod +x "$HOME/.local/bin/nvh" +NVH_HOME=/mnt/persist/nvhive "$HOME/.local/bin/nvh" workstation --launch -y +``` -### Option 3 — pip from PyPI (for existing Python environments) +On Linux/macOS after a browser download: +`chmod +x nvh-* && NVH_HOME=/mnt/persist/nvhive ./nvh-* workstation --launch -y`. +Full asset list (wheel, sdist, checksums) lives on the +[Releases page](https://github.com/thatcooperguy/nvHive/releases/latest). + +### pip from PyPI (for existing Python environments) ```bash pip install nvhive # core @@ -62,14 +118,17 @@ pip install "nvhive[browser]" # + headless browser (playwright) pip install "nvhive[all]" # everything ``` +`pip install` installs the Python package only. For large local models, +ComfyUI, and Studio packs, launch the workstation with a persistent `NVH_HOME` +so assets survive reconnects. 
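Since checksums ship with each release, it is cheap to verify the binary before first run. A sketch of the pattern with a stand-in file; the binary and checksum filenames below are placeholders, so take the exact asset names from the Releases page:

```shell
# Checksum-verification pattern, demonstrated with a locally generated file.
# In real use, download the binary and the checksum asset from the Releases
# page instead of generating SHA256SUMS yourself.
cd "$(mktemp -d)"
printf 'placeholder binary' > nvh-linux-x86_64
sha256sum nvh-linux-x86_64 > SHA256SUMS   # normally a downloaded release asset
sha256sum -c SHA256SUMS                   # prints "nvh-linux-x86_64: OK"
```

`sha256sum -c` exits non-zero on any mismatch, so it is safe to chain with `&&` before `chmod +x`.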
+ ### First run ```bash -nvh # guided setup — GPU detect, provider keys, local model pulls +nvh # guided setup: GPU detect, provider keys, local model pulls nvh workstation --all -y # Linux GPU desktop: launcher + WebUI + ComfyUI + studio packs -nvh webui # Setup > Models lets you choose exact local downloads -nvh studio --install starter -y # rootless LLMs + agents + ComfyUI nodes + game-dev tools -nvh "your question" # just ask — nvHive figures out the rest +nvh webui # open the dashboard and setup wizard +nvh "your question" # just ask - nvHive figures out the rest ``` For a fresh Linux cloud desktop where only a mounted file volume persists, @@ -82,7 +141,7 @@ source "$NVH_HOME/nvh-env.sh" nvh workstation --home-dir "$NVH_HOME" --all -y ``` -`nvh workstation --all -y` creates a desktop launcher, starts the WebUI, prepares rootless local model tooling, installs ComfyUI with nvHive starter workflow examples, and adds AI Studio packs for LLMs, agents, ComfyUI nodes, and Linux game projects. +`nvh workstation --all -y` creates a desktop launcher, starts the WebUI, prepares rootless local model tooling, installs ComfyUI with nvHive starter workflow examples, and adds the beginner AI Starter packs. Creative, game, Claw, and music missions stay one click away in the WebUI or can be installed directly with `nvh studio --install creative|game|claw|music -y`. 
Use packs directly when you want a specific no-root lab: @@ -92,27 +151,73 @@ nvh studio --models nvh studio --install-models recommended -y nvh studio --install llms -y nvh studio --install agents -y +nvh studio --install claw -y # OpenClaw + NemoClaw when Docker/OpenShell is usable nvh studio --install comfy -y nvh studio --install game -y nvh studio --install creative -y # Blender LTS + game/asset workspace +nvh studio --install music -y # ACE-Step, Demucs, WhisperX, Audacity/LMMS AppImages nvh studio --install python-runtime-fallback -y # optional rescue pack, not the default path ``` -The WebUI setup wizard includes a model picker with GPU-fit badges, disk -estimates, installed status, and persistent install jobs saved under -`$NVH_HOME/jobs`. ComfyUI, AI Studio packs, and local model downloads keep -showing progress after a browser refresh and can be canceled from the wizard. -For pip installs, the downloaded WebUI, npm cache, and no-root Node fallback -also live under `$NVH_HOME`, so reconnecting to a cloud desktop does not mean -starting the frontend toolchain from scratch. -The wizard now includes a local setup helper that ranks the next storage, -runtime, model, ComfyUI, and creative-tool actions before any local LLM is -installed. -The ComfyUI step lets students select workflow examples and save a model -download plan with source links, folder targets, and a helper checklist script, -because many image/video weights are large or require upstream terms. - -On first run, `nvh` launches a guided 3-step setup — GPU detection, provider keys, local model pulls. Works immediately with local models (no signup needed). Every step is skippable. Run `nvh setup` anytime to reconfigure. +### Pick Your Mission + +The setup wizard starts with one simple question: **what do you want to make?** +Pick a mission and nvWizard handles storage, GPU checks, Python/Node runtime +checks, model recommendations, rootless installers, and background jobs. 
+ +The wizard does six things before heavy installs: + +1. Finds the best user-writable persistent storage path. +2. Checks GPU, driver, CUDA, VRAM, Python, Node, and npm health. +3. Recommends local models that fit the detected hardware and disk. +4. Installs selected tools into `NVH_HOME` without root. +5. Tracks long downloads as resumable jobs with logs and receipts. +6. Offers **Fix My Setup** and **Copy Error Report** when something breaks. + +| Mission | What it sets up | +| --- | --- | +| AI Starter | Local chat and coding models, Ollama, GitHub helper, and a local agent helper | +| Graphics Creator Studio | ComfyUI, Blender, image/video workflow examples, and model download plans | +| Game Dev Lab | Godot helpers, Blender assets, GitHub, Linux game tooling, and Unity/Unreal guidance | +| Music Producer Studio | AI music generation, stem splitting, transcription, audio apps, and notebooks | +| Agent Builder | OpenClaw by default, with NemoClaw unlocked only when Docker already works without sudo | + +Beginner Mode shows one recommended action, a **Fix My Setup** repair path, and +mission cards. Advanced Details stays available for storage, driver, CUDA, +Python, Node, logs, receipts, boot drift, and release-readiness diagnostics. + +ComfyUI, AI Studio packs, and local model downloads run as persistent jobs under +`$NVH_HOME/jobs`, so setup progress survives browser refreshes and cloud desktop +reconnects. The model picker shows GPU-fit badges, disk estimates, installed +status, and a selected download queue. + +When the VM image changes between sessions, nvWizard compares the new boot +fingerprint with `$NVH_HOME/config/boot-preflight.json` and recommends rootless +repairs before launching large installs. If something still fails, **Copy Error +Report** creates a redacted report with request IDs and log locations. See +[Production Readiness](docs/PRODUCTION_READINESS.md) for the release gates and +target NVIDIA Linux VM acceptance checklist. 
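The storage step above hinges on a real write probe: permission bits alone are not proof that a cloud mount accepts and persists writes. A minimal shell sketch of the write/fsync/delete idea (a hypothetical helper, not nvHive's actual implementation):

```shell
# Write/fsync/delete probe: read-only or network mounts can pass os.access-style
# permission checks, so prove the path really accepts a write and clean up.
probe_writable() {
  local f="$1/.nvh-probe-$$"
  ( printf 'nvh-probe' > "$f" && sync "$f" && rm -f "$f" ) 2>/dev/null
}

probe_writable "${NVH_HOME:-$HOME}" && echo "writable" || echo "not writable"
```

On GNU coreutils, `sync FILE` fsyncs that file; other platforms fall back to a whole-filesystem sync, which is still a stronger signal than a bare permission check.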
+ +If setup gets stuck: + +```bash +nvh webui # reopen the wizard +nvh doctor --storage --home-dir "$NVH_HOME" # verify the persistent mount +nvh doctor --fix # try safe local repairs +tail -n 80 "$NVH_HOME/logs/nvhive.log" # inspect rootless logs +``` + +When the local API is running, Advanced Details can copy the same redacted report +from the UI, or you can call `GET /v1/setup/diagnostics` directly. + +

+ Rootless NVIDIA cloud desktop layout +

+ +On first run, `nvh` launches a guided setup helper for GPU detection, +provider keys, local model pulls, and rootless app installs. Works immediately +with local models when available. Every advanced step is skippable. Run +`nvh setup` anytime to reconfigure.

nvHive 3-Step Setup Flow @@ -120,13 +225,17 @@ On first run, `nvh` launches a guided 3-step setup — GPU detection, provider k ### WebUI -`nvh webui` launches a full-screen dashboard at `localhost:3000` — chat, council mode, advisor status, analytics, system stats, and setup flows. NVIDIA corporate theme, keyboard-first (Ctrl+K command palette, Ctrl+B collapse sidebar). +`nvh webui` launches the local dashboard at `localhost:3000`: setup wizard, +chat, council mode, advisor status, analytics, and system health. First run +installs frontend dependencies under persistent `NVH_HOME`; later launches use +the production server for faster startup. Use `nvh webui --dev` only when +editing the frontend.

nvHive WebUI walkthrough

-**GPU tier → model recommendations:** +**GPU tier model recommendations:** | VRAM | Text Model | Vision Model | Behavior | |------|-----------|-------------|----------| @@ -302,8 +411,11 @@ Results vary by hardware and workload — run `nvh bench` to measure on your set | Guide | Description | |-------|-------------| -| [Getting Started](docs/GETTING_STARTED.md) | First-time setup | | [Student GPU Cloud / Linux Desktop](docs/LINUX_DESKTOP.md) | No-root NVIDIA Linux workstation and ComfyUI guide | +| [Production Readiness](docs/PRODUCTION_READINESS.md) | Release gates and target NVIDIA Linux VM acceptance checklist | +| [Deploy Without Root](docs/DEPLOY_NO_ROOT.md) | No-root install on servers | +| [Windows Troubleshooting](docs/TROUBLESHOOTING_WINDOWS.md) | Encoding, segfaults, port issues | +| [Getting Started](docs/GETTING_STARTED.md) | General CLI/provider setup after the no-root workstation path | | [Commands](docs/COMMANDS.md) | Full CLI reference (50+ commands) | | [Providers](docs/PROVIDERS.md) | 23 providers, rate limits, free tiers | | [Council System](docs/COUNCIL.md) | Multi-LLM consensus with confidence scoring | @@ -313,8 +425,6 @@ Results vary by hardware and workload — run `nvh bench` to measure on your set | [Agent Tools](docs/TOOLS.md) | Agent tools and capabilities | | [Configuration](docs/CONFIGURATION.md) | Configuration reference | | [Web UI](docs/WEBUI.md) | Web dashboard | -| [Deploy Without Root](docs/DEPLOY_NO_ROOT.md) | No-root install on servers | -| [Windows Troubleshooting](docs/TROUBLESHOOTING_WINDOWS.md) | Encoding, segfaults, port issues | | [Releasing](docs/RELEASING.md) | Release runbook | --- diff --git a/docs/DEPLOY_NO_ROOT.md b/docs/DEPLOY_NO_ROOT.md index 0d97c8d..26c6a61 100644 --- a/docs/DEPLOY_NO_ROOT.md +++ b/docs/DEPLOY_NO_ROOT.md @@ -37,6 +37,11 @@ nvh nvidia # nvHive's GPU detection ## Ollama Without Root +For cloud desktops with persistent block storage, prefer the nvHive workstation +or Studio pack flow so model state 
is tied to `NVH_HOME`. The manual Ollama path +below is useful on generic no-root servers, but the default `~/.ollama` location +may be ephemeral on some managed desktops. + The standard Ollama installer (`curl | sh`) needs root. User-space alternative: @@ -62,7 +67,9 @@ ollama pull qwen2.5-coder:32b # ~34 GB, reviewer nvh agent --setup ``` -Models are stored in `~/.ollama/models/` — no root needed. +Models are stored in `~/.ollama/models/` by default, no root needed. On cloud +desktops, confirm that `$HOME` is persistent or set the equivalent model path to +your durable `NVH_HOME` volume before pulling large models. ## API Keys Without Keyring diff --git a/docs/GETTING_STARTED.md b/docs/GETTING_STARTED.md index ce4c91d..6044b8d 100644 --- a/docs/GETTING_STARTED.md +++ b/docs/GETTING_STARTED.md @@ -6,7 +6,12 @@ NVHive is a multi-LLM orchestration platform that routes queries to the best AI ## Quick Start (3 minutes) -### Option A: Docker (recommended for Ubuntu/Linux) +If you are on a no-root NVIDIA Linux cloud desktop or a session where only a +mounted block volume survives reconnects, start with the +[Student GPU Cloud / Linux Desktop guide](LINUX_DESKTOP.md) instead. That is +the rootless workstation path for local models, ComfyUI, and Studio packs. + +### Option A: Docker (local development / managed workstations) ```bash # Clone the repo @@ -27,13 +32,15 @@ open http://localhost:3000 # macOS That's it. The web UI is at **http://localhost:3000** and the API is at **http://localhost:8000**. -### Option B: One-line setup (Ubuntu with NVIDIA GPU) +### Option B: One-line setup (Ubuntu with NVIDIA GPU and install privileges) ```bash curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvhive/main/scripts/setup.sh | bash ``` -This installs Docker (rootless, no root needed), starts NVHive, and pulls a local AI model. +This prepares a Docker-backed local stack, starts NVHive, and pulls a local AI +model. 
Use the Linux Desktop guide instead when Docker or system changes are not +allowed. ### Option C: pip install (CLI only) diff --git a/docs/LINUX_DESKTOP.md b/docs/LINUX_DESKTOP.md index a0a6497..9238f47 100644 --- a/docs/LINUX_DESKTOP.md +++ b/docs/LINUX_DESKTOP.md @@ -7,10 +7,30 @@ Target user journey: 1. Launch the Linux desktop instance. 2. Run one install command or `pip install nvhive`. 3. Click the NVHive AI Studio desktop icon, or run `nvh workstation --launch`. -4. Use local chat models, cloud/free advisors, ComfyUI examples, agent packs, and game-dev helpers from one WebUI. +4. Use local chat models, cloud/free advisors, ComfyUI examples, OpenClaw/NemoClaw agent packs, game-dev helpers, creative tools, and music production helpers from one WebUI. ## Quick Start +Easiest path: + +```bash +curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvHive/main/start-linux.sh | bash +``` + +That script chooses a likely persistent mount for `NVH_HOME`, installs nvHive +without root, creates the desktop launcher, and starts the WebUI setup wizard. +For the target cloud desktop shape, `NVH_HOME` should land on the writable +block-backed home/data volume that survives reconnects, ideally 200GB or +larger for local LLMs and ComfyUI assets. Avoid read-only CIFS/SMB mounts and +the ephemeral OS disk. +To force the no-Python binary path: + +```bash +curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvHive/main/start-linux.sh | NVH_USE_BINARY=1 bash +``` + +Manual path: + ```bash export NVH_HOME=/mnt/persist/nvhive curl -sSL https://raw.githubusercontent.com/thatcooperguy/nvHive/main/install.sh | bash @@ -33,29 +53,40 @@ nvh workstation nvh webui ``` +The setup wizard starts in Beginner Mode with one recommended action, a Fix My +Setup repair button, and Advanced Details for diagnostics. 
It is designed so a +student can pick a mission, then click through storage, models, ComfyUI, Claw +agents, creative tools, game engines, and music packs without typing manual +commands. + ## What `nvh workstation` Does -- Detects NVIDIA GPU availability with `nvidia-smi` -- Estimates VRAM and recommends local chat models +- Detects NVIDIA GPU availability with NVML/`nvidia-smi` and reports when a + rootless session can see NVIDIA device files but cannot query them +- Estimates framebuffer/VRAM, architecture, and storage capacity before + recommending local chat models - Creates `$NVH_HOME/bin/nvhive-ai-studio` - Creates a Linux desktop launcher named `NVHive AI Studio` - Shows a student-friendly setup checklist +- Runs nvWizard boot checks for storage, Python, CUDA/PyTorch, ComfyUI, models, and install receipts - With `--all`, ensures local AI, installs ComfyUI, installs the rootless starter pack, and launches WebUI - Uses user-space paths only under `NVH_HOME` for durable models, ComfyUI, packs, runtime fallback tools, apps, WebUI assets, cache, logs, and config ## Rootless AI Studio Packs -`nvh studio` installs optional packs without root access. It never calls `sudo`, `apt`, `dnf`, `pacman`, `systemctl`, or Docker. +`nvh studio` installs optional packs without root access. It never calls `sudo`, `apt`, `dnf`, `pacman`, or `systemctl`. NemoClaw is the exception that checks Docker because it is an OpenShell sandbox stack; the wizard blocks it unless Docker already works without sudo. 
| Bundle | Command | Installs | | --- | --- | --- | -| Starter lab | `nvh studio --install starter -y` | Rootless Ollama, top local LLMs, agent lab, ComfyUI power nodes, game-dev lab | +| AI Starter | `nvh studio --install starter -y` | Rootless Ollama, top local LLMs, agent lab, ComfyUI power nodes, game-dev lab | | Runtime fallback | `nvh studio --install python-runtime-fallback -y` | Optional micromamba binary under `$NVH_HOME` for cloud images where Python `venv` is broken | | LLMs | `nvh studio --install llms -y` | Gemma 3, Qwen 3, Llama 3.1, Qwen coder, DeepSeek reasoning, embeddings | -| Agents | `nvh studio --install agents -y` | LangGraph, CrewAI, AutoGen, JupyterLab, search/tool packages | +| Agents | `nvh studio --install agents -y` | LangGraph, CrewAI, AutoGen, JupyterLab, search/tool packages, OpenClaw | +| Claw agents | `nvh studio --install claw -y` | OpenClaw rootless workspace, plus NVIDIA NemoClaw when Docker/OpenShell is usable | | ComfyUI | `nvh studio --install comfy -y` | ComfyUI Manager, Impact Pack, ControlNet Aux, Video Helper Suite, GGUF, rgthree | | Games | `nvh studio --install game -y` | Pygame/Panda3D lab, asset helpers, Linux/Wine mod workspace | | Creative | `nvh studio --install creative -y` | Blender 4.5 LTS portable install, launcher, game/asset workspace | +| Music | `nvh studio --install music -y` | ACE-Step music generator, Demucs stems, WhisperX transcription, Audacity/LMMS AppImages, and a DAW helper workspace | Run `nvh studio --list` to see exact pack status and disk estimates. @@ -85,8 +116,18 @@ refresh the browser, reconnect to a cloud desktop, or cancel a long download without losing the setup state. The local setup helper endpoint, `/v1/setup/helper`, works offline. It ranks the -next storage, runtime, model, ComfyUI, and creative-tool actions before any local -LLM is installed. +next storage, runtime, model, ComfyUI, OpenClaw/NemoClaw, creative-tool, and +music-tool actions before any local LLM is installed. 
+ +OpenClaw is the simple agent option. nvHive installs it into a persistent +user-owned Node workspace and writes `nvhive-openclaw`. NemoClaw is the guarded +NVIDIA/OpenShell path. It remains visible in the wizard, but it is marked +blocked until Docker is installed, running, and reachable by the current user +without sudo. + +The council also includes a `product_resilience` preset with an Underdog Student +Advocate. Use it when you want a skeptical review of what could break for a +beginner on a no-root cloud GPU desktop. CLI equivalents: @@ -151,7 +192,9 @@ nvh studio --list # show rootless LLM/agent/ComfyUI/game packs nvh studio --models # show recommended local model downloads nvh studio --install-models recommended -y nvh studio --install starter -y +nvh studio --install claw -y nvh studio --install creative -y +nvh studio --install music -y nvh doctor --fix # repair local models/config where possible nvh webui # launch browser dashboard nvh safe "summarize this" # local-only prompt path diff --git a/docs/PRODUCTION_READINESS.md b/docs/PRODUCTION_READINESS.md new file mode 100644 index 0000000..6dc8cdc --- /dev/null +++ b/docs/PRODUCTION_READINESS.md @@ -0,0 +1,115 @@ +# Production Readiness + +nvHive can be CI-clean without being production-ready for the target cloud +desktop. The production bar is the real rootless NVIDIA Linux VM with persistent +block storage, because that is where drivers, CUDA, Python, storage, display, +and model downloads all meet. + +## Readiness States + +The setup API exposes a conservative report at: + +```bash +GET /v1/setup/production-readiness +``` + +The report returns: + +- `blocked`: one or more gates must be fixed before beta or production. +- `pilot-ready`: no hard blockers, but target VM validation or warnings remain. +- `production-ready`: all gates pass and the target VM acceptance flag is set. + +The report is intentionally conservative. 
It will not mark production-ready +until a real NVIDIA Linux VM test has been completed and +`NVH_TARGET_VM_VALIDATED=1` is present for the final check. + +Automatic storage detection is best-effort. nvHive can rank writable local +mount candidates and warn about obvious OS, network, read-only, or ephemeral +paths, but the target VM acceptance run is the final attestation that the chosen +`NVH_HOME` is the real persistent block-backed volume for that cloud desktop. + +## Gates + +The readiness report checks: + +- Persistent `NVH_HOME` is writable and explicitly configured. +- Mount autopilot can find or validate the persistent block-backed home. +- Python runtime can use normal venv/pip or the rootless micromamba fallback. +- The target Linux NVIDIA GPU session exposes driver, CUDA, and VRAM facts. +- App compatibility has no blocked items. +- Boot preflight has a stable baseline and no unexpected VM image drift. +- Smoke tests have no failures. +- Recommended local model queue fits persistent storage. +- Install receipts are healthy. +- All one-click Studio packs are marked no-root. +- The real target VM acceptance run has been completed. + +## Target VM Acceptance Checklist + +Run this on a fresh NVIDIA Linux cloud desktop without root access: + +1. Install from GitHub or PyPI into the user-owned persistent mount. +2. Confirm `NVH_HOME` lands on the 200 GB+ block-backed mount, not the OS disk + or a read-only share. +3. Run a real write probe under `$NVH_HOME` and confirm available capacity before + downloading large models or ComfyUI assets. +4. Launch the WebUI from the desktop launcher. +5. Install **AI Starter** and verify Ollama plus the recommended model queue. +6. Install **Graphics Creator Studio** and launch ComfyUI with starter examples. +7. Install **Game Dev Lab** and verify Blender/Godot helper launchers. +8. Install **Music Producer Studio** and verify helper workspaces without sudo. +9. 
Reboot or reconnect the VM and confirm boot preflight reports a stable image. +10. Run the readiness report again with: + +```bash +export NVH_TARGET_VM_VALIDATED=1 +nvh webui +``` + +Then open Advanced Details in the setup wizard and verify Release Readiness is +`production-ready`. + +## Logging and Error Reports + +The API attaches an `X-Request-ID` header to every response and includes that id +in structured logs. When `NVH_HOME` or `NVH_LOGS` is active, nvHive also writes +rootless logs under the persistent mount, usually: + +```bash +$NVH_HOME/logs/nvhive.log +``` + +The setup wizard has **Advanced Details -> Copy Error Report**. It calls: + +```bash +GET /v1/setup/diagnostics +``` + +That report includes storage status, release gates, recent setup jobs, install +receipts, safe environment facts, and recent warning/error log lines. API keys, +bearer tokens, GitHub tokens, and common secret-shaped values are redacted before +the report is shown or copied. + +If the WebUI does not open, use the CLI/headless path: + +```bash +nvh doctor --storage --home-dir "$NVH_HOME" +nvh doctor --fix +tail -n 120 "$NVH_HOME/logs/nvhive.log" +curl -s http://127.0.0.1:8000/v1/setup/diagnostics +``` + +Diagnostics redact API keys, bearer tokens, GitHub tokens, and common +secret-shaped values. They can still include usernames, mount names, project +paths, and recent warning/error log lines, so review reports before posting them +publicly. + +## Release Rule + +Use this language in releases: + +- Before target VM validation: "beta" or "pilot-ready". +- After the checklist passes on the NVIDIA Linux VM: "production-ready". + +Do not publish a PyPI release as production-ready if the report is still +`pilot-ready` or `blocked`. 
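The redaction pass described under Logging and Error Reports can be sketched as a filter. The patterns below are illustrative only; nvHive's real matcher covers more secret shapes:

```shell
# Illustrative redaction filter: mask secret-shaped substrings before a
# diagnostics report is copied or shared. The two patterns here (bearer
# tokens, GitHub-style tokens) are examples, not the full production set.
redact() {
  sed -E \
    -e 's/[Bb]earer[[:space:]]+[A-Za-z0-9._-]+/[REDACTED]/g' \
    -e 's/gh[pousr]_[A-Za-z0-9]{20,}/[REDACTED]/g'
}

echo 'Authorization: Bearer abc.DEF-123' | redact
# -> Authorization: [REDACTED]
```

Even with redaction, reports can still contain usernames and mount paths, so the "review before posting" advice above still applies.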
diff --git a/docs/RELEASING.md b/docs/RELEASING.md index c1ba842..5006a84 100644 --- a/docs/RELEASING.md +++ b/docs/RELEASING.md @@ -80,7 +80,9 @@ Before tagging, verify: git tag v0.7.0 git push origin v0.7.0 -# Publish the same tag to PyPI through trusted publishing. +# publish.yml normally starts automatically when release.yml publishes +# the GitHub Release. Only run this manually if the automatic publish +# did not start or you are recovering a failed publish. gh workflow run publish.yml --ref v0.7.0 -f target=pypi ``` @@ -89,10 +91,11 @@ The tag push triggers the release chain in the Actions tab: 1. **release.yml** builds sdist, wheel, and PyInstaller binaries for Linux/macOS/Windows, then creates a GitHub Release with auto-generated notes and every artifact attached. -2. **publish.yml** is dispatched against the same tag, runs - `twine check dist/*`, and uploads sdist + wheel to PyPI via OIDC +2. **publish.yml** normally starts from the `release: published` event, + runs `twine check dist/*`, and uploads sdist + wheel to PyPI via OIDC trusted publishing. This workflow file name is the one registered in - PyPI's trusted-publisher settings. + PyPI's trusted-publisher settings. Manual dispatch is a recovery path, + not a required second publish step. Typical full duration is 6-10 minutes for the GitHub Release (binary builds dominate) plus about a minute for PyPI publishing. On success, diff --git a/docs/WEBUI.md b/docs/WEBUI.md index 6366b29..29afe5d 100644 --- a/docs/WEBUI.md +++ b/docs/WEBUI.md @@ -7,6 +7,14 @@ nvh webui ``` The dashboard opens at `http://localhost:3000` and connects to the nvHive API automatically. +First launch installs dependencies and builds the WebUI under persistent `NVH_HOME`. +Later launches run the optimized production server. Use `nvh webui --dev` only +when developing the frontend. Pip and binary installs can fetch the WebUI with +`git` or, when `git` is absent, a GitHub source-archive fallback. 
The bootstrap
+tries the installed release tag first and falls back to `main` only if needed.
+The API allows local WebUI fallback ports automatically, so rootless launches on
+`localhost`, `127.0.0.1`, `nvhive`, or loopback IPv6 keep working when the
+preferred port is already occupied.
 
 ## Pages
diff --git a/docs/screenshots/rootless-runtime.svg b/docs/screenshots/rootless-runtime.svg
new file mode 100644
index 0000000..c5093e5
--- /dev/null
+++ b/docs/screenshots/rootless-runtime.svg
@@ -0,0 +1,71 @@
[SVG markup lost in extraction; recoverable text from the new diagram: title "Rootless NVIDIA Cloud Desktop Layout: nvHive assumes the base VM can change. The user's mounted block storage is the source of truth." Panels: Base VM Image (read-only or disposable: kernel, NVIDIA driver, CUDA runtime visibility, system Python and Node; can drift between sessions); Persistent Block Storage auto-selected as NVH_HOME (apps, models, runtimes, jobs, cache, config; write/fsync/delete probe required); Tools (ComfyUI, Ollama, Blender, Godot, Music AI, Agents). Footer: "Boot preflight compares the new VM fingerprint to the previous one and recommends rootless repairs before large downloads run."]
diff --git a/docs/screenshots/setup-flow.svg b/docs/screenshots/setup-flow.svg
index 3d8e9e3..a56812f 100644
--- a/docs/screenshots/setup-flow.svg
+++ b/docs/screenshots/setup-flow.svg
@@ -1,87 +1,76 @@
[SVG markup lost in extraction. Removed panel: "FIRST-RUN SETUP: 3 steps | No root | Fully automatic | Every step skippable" with steps 1 Hardware + Local AI (detect GPU, VRAM, CUDA; install Ollama to ~/.nvh/; pull text + vision models; rich progress bars; DESKTOP AGENT: READY), 2 Provider Status (Groq, OpenAI, Anthropic, Google not configured; OLLAMA RUNNING 2 MODELS), 3 API Keys, agent-assisted (opens signup page in browser, screenshots page to verify, watches clipboard for key, falls back to manual paste; DETECTED KEY: gsk_xR...CmLa SAVED; SETUP COMPLETE | 4 PROVIDERS | 48 GB VRAM | DESKTOP AGENT READY). Added panel: "nvHive Mission Wizard: Pick one goal. nvWizard finds persistent storage, checks the GPU stack, recommends models, and installs rootlessly." Steps: 1. Autopilot Boot (no admin password, no OS changes; detects persistent block storage, probes write/fsync/delete safety, reads GPU/VRAM/CUDA/driver, tracks OS/kernel/runtime drift; Fix My Setup is automatic); 2. Pick Your Mission ("the first screen is a choice, not a checklist": AI Starter, Creator Studio, Game Lab, Music Studio; advanced details stay one click away); 3. Rootless Install (downloads only what fits: LLMs and vision models, ComfyUI workflows and nodes, Blender, Godot, audio helpers, OpenClaw and guarded NemoClaw; everything lives in NVH_HOME).]
diff --git a/install.sh b/install.sh index aa0251a..33e5a2b 100755 --- a/install.sh +++ b/install.sh @@ -27,15 +27,114 @@ set -euo pipefail G='\033[0;32m'; Y='\033[1;33m'; B='\033[0;34m'; R='\033[0;31m'; D='\033[0;90m'; N='\033[0m' -if [ -z "${NVH_HOME:-}" ]; then +free_gb_for_path() { + df -Pk "$1" 2>/dev/null | awk 'NR==2 {printf "%d", $4 / 1048576}' +} + +score_nvh_home_candidate() { + local base="${1%/}" + [ -n "$base" ] || return 1 + [ -d "$base" ] || return 1 + [ -w "$base" ] || return 1 + + local name home free_gb score + name="$(basename "$base")" + home="$base/nvhive" + case "$name" in + nvh|nvhive|.nvh) home="$base" ;; + esac + + score=0 + case "$base" in + "$HOME"|"$HOME/"*) score=$((score - 15)) ;; + /mnt/*|/media/*|/workspace*|/data*|/persistent*|/storage*) score=$((score + 45)) ;; + esac + case "$base" in + *persist*|*Persist*|*workspace*|*Workspace*|*project*|*Project*|*data*|*Data*) + score=$((score + 20)) + ;; + *tmp*|*cache*|*Cache*) + score=$((score - 40)) + ;; + esac + free_gb="$(free_gb_for_path "$base")" + free_gb="${free_gb:-0}" + if [ "$free_gb" -ge 100 ]; then + score=$((score + 35)) + elif [ "$free_gb" -ge 50 ]; then + score=$((score + 30)) + elif [ "$free_gb" -ge 20 ]; then + score=$((score + 20)) + elif [ "$free_gb" -ge 10 ]; then + score=$((score + 8)) + else + score=$((score - 15)) + fi + if [ -f "$home/nvh-env.sh" ] || [ -d "$home/repo" ] || [ -d "$home/models" ]; then + score=$((score + 30)) + fi + printf '%s|%s\n' "$score" "$home" +} + +detect_nvh_home() { + local roots=() + local env_name env_value root child scored score home best_score best_home + if [ -d "$HOME/nvh/repo" ] && [ ! 
-d "$HOME/.nvh/repo" ]; then - NVH_HOME="$HOME/nvh" + printf '%s\n' "$HOME/nvh" + return 0 + fi + + for env_name in NVH_MOUNT PERSISTENT_HOME PERSISTENT_DIR PERSISTENT_STORAGE WORKSPACE PROJECTS PROJECT_HOME DATA_DIR; do + env_value="${!env_name:-}" + [ -n "$env_value" ] && roots+=("$env_value") + done + roots+=("/mnt" "/media/${USER:-}" "/workspace" "/data" "/persistent" "/storage") + + best_score=-999 + best_home="" + for root in "${roots[@]}"; do + [ -n "$root" ] || continue + if scored="$(score_nvh_home_candidate "$root")"; then + score="${scored%%|*}" + home="${scored#*|}" + if [ "$score" -gt "$best_score" ]; then + best_score="$score" + best_home="$home" + fi + fi + [ -d "$root" ] || continue + for child in "$root"/*; do + [ -d "$child" ] || continue + if scored="$(score_nvh_home_candidate "$child")"; then + score="${scored%%|*}" + home="${scored#*|}" + if [ "$score" -gt "$best_score" ]; then + best_score="$score" + best_home="$home" + fi + fi + done + done + + if [ -n "$best_home" ] && [ "$best_score" -ge 55 ]; then + printf '%s\n' "$best_home" + return 0 + fi + return 1 +} + +if [ -z "${NVH_HOME:-}" ]; then + if NVH_HOME="$(detect_nvh_home)"; then + NVH_HOME_AUTOPILOT=true else NVH_HOME="$HOME/.nvh" + NVH_HOME_AUTOPILOT=false fi NVH_HOME_CONFIGURED=false else NVH_HOME_CONFIGURED=true + NVH_HOME_AUTOPILOT=false fi NVH_VENV="$NVH_HOME/venv" NVH_REPO="$NVH_HOME/repo" @@ -94,9 +193,14 @@ echo "" # Find Python — check common locations since the VM may have it anywhere # --------------------------------------------------------------------------- if [ "$NVH_HOME_CONFIGURED" = "false" ]; then - echo -e "${Y}NVH_HOME was not set; using ${G}$NVH_HOME${N}" - echo -e "${D}For cloud desktops, set NVH_HOME to the mounted persistent file volume before install.${N}" - echo -e "${D}Example: export NVH_HOME=/mnt/persist/nvhive${N}" + if [ "$NVH_HOME_AUTOPILOT" = "true" ]; then + echo -e "${G}Mount autopilot selected ${NVH_HOME}${N}" + echo -e "${D}Override anytime with: 
export NVH_HOME=/path/on/persistent/mount${N}" + else + echo -e "${Y}NVH_HOME was not set; using ${G}$NVH_HOME${N}" + echo -e "${D}For cloud desktops, set NVH_HOME to the mounted persistent file volume before install.${N}" + echo -e "${D}Example: export NVH_HOME=/mnt/persist/nvhive${N}" + fi echo "" fi echo -e "${D}Persistent home: $NVH_HOME${N}" diff --git a/nvh/__init__.py b/nvh/__init__.py index 8dda29a..86ca04c 100644 --- a/nvh/__init__.py +++ b/nvh/__init__.py @@ -1,6 +1,6 @@ """NVHive — Multi-LLM Orchestration Platform.""" -__version__ = "0.34.0" +__version__ = "0.34.1" # SDK exports for Python usage from nvh.sdk import ( diff --git a/nvh/api/server.py b/nvh/api/server.py index aef0dd4..33f1cc3 100644 --- a/nvh/api/server.py +++ b/nvh/api/server.py @@ -14,6 +14,7 @@ import logging import os import time +import uuid from collections import defaultdict from collections.abc import AsyncGenerator from contextlib import asynccontextmanager @@ -33,7 +34,7 @@ status, ) from fastapi.middleware.cors import CORSMiddleware -from fastapi.responses import Response, StreamingResponse +from fastapi.responses import JSONResponse, Response, StreamingResponse from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer from pydantic import BaseModel, Field, field_validator @@ -45,10 +46,12 @@ ) from nvh.utils.gpu import ( check_oom_risk, + detect_gpu_status, detect_gpus, detect_system_memory, get_gpu_summary, get_ollama_optimizations, + gpu_architecture_info, recommend_models, ) @@ -273,6 +276,33 @@ def get_engine() -> Engine: return _engine +async def _run_boot_preflight_on_startup(app: FastAPI) -> None: + if os.environ.get("NVH_BOOT_PREFLIGHT", "1").lower() in {"0", "false", "off", "no"}: + logger.info("Hive API: boot preflight disabled by NVH_BOOT_PREFLIGHT.") + return + try: + from nvh.integrations.boot_preflight import run_boot_preflight + + result = await asyncio.to_thread(run_boot_preflight) + app.state.boot_preflight = result + logger.info("Hive API: boot 
preflight complete. %s", result.get("summary")) + except Exception as exc: + app.state.boot_preflight = { + "summary": "Boot preflight failed", + "error": str(exc), + "changes": [], + "agent_helper": { + "offline_helper_ready": True, + "local_agent_ready": False, + "mode": "offline-deterministic", + "recommended_action_id": "agent-lab", + "summary": "Offline setup helper is still available.", + "requirements": [], + }, + } + logger.warning("Hive API: boot preflight failed: %s", exc) + + # --------------------------------------------------------------------------- # Lifespan # --------------------------------------------------------------------------- @@ -280,6 +310,7 @@ def get_engine() -> Engine: @asynccontextmanager async def lifespan(app: FastAPI): global _engine + boot_task: asyncio.Task[None] | None = None from nvh.utils.logging import setup_logging json_mode = os.environ.get("HIVE_LOG_FORMAT", "text") == "json" setup_logging(level=os.environ.get("HIVE_LOG_LEVEL", "INFO"), json_format=json_mode) @@ -288,12 +319,16 @@ async def lifespan(app: FastAPI): _engine = Engine() enabled = await _engine.initialize() logger.info("Hive API: engine ready. Advisors: %s", ", ".join(enabled) or "none") + boot_task = asyncio.create_task(_run_boot_preflight_on_startup(app)) yield except Exception as exc: logger.error("Hive API: engine initialization error: %s", exc) + boot_task = asyncio.create_task(_run_boot_preflight_on_startup(app)) # Don't crash — partial init is fine; requests will fail gracefully. 
yield finally: + if boot_task and not boot_task.done(): + boot_task.cancel() logger.info("Hive API: shutting down.") if _engine: if hasattr(_engine, 'webhooks') and _engine.webhooks: @@ -331,16 +366,81 @@ async def lifespan(app: FastAPI): if _cors_env else _DEFAULT_CORS_ORIGINS ) +LOCAL_WEBUI_ORIGIN_REGEX = os.environ.get( + "HIVE_CORS_ORIGIN_REGEX", + r"^http://(localhost|127\.0\.0\.1|nvhive|\[::1\])(:\d+)?$", +) app.add_middleware( CORSMiddleware, allow_origins=ALLOWED_ORIGINS, + allow_origin_regex=LOCAL_WEBUI_ORIGIN_REGEX, allow_credentials=True, allow_methods=["*"], allow_headers=["*"], ) +@app.middleware("http") +async def request_logging_middleware(request: Request, call_next): + """Attach a request id, log latency, and return safe unexpected errors.""" + from nvh.utils.logging import reset_request_id, set_request_id + + request_id = request.headers.get("x-request-id") or uuid.uuid4().hex + token = set_request_id(request_id) + start = time.perf_counter() + try: + response = await call_next(request) + duration_ms = round((time.perf_counter() - start) * 1000, 2) + level = logging.WARNING if response.status_code >= 500 else logging.INFO + logger.log( + level, + "HTTP request complete", + extra={ + "request_id": request_id, + "method": request.method, + "path": request.url.path, + "status_code": response.status_code, + "duration_ms": duration_ms, + "client": request.client.host if request.client else "", + }, + ) + response.headers["X-Request-ID"] = request_id + return response + except Exception: + duration_ms = round((time.perf_counter() - start) * 1000, 2) + error_id = uuid.uuid4().hex[:12] + logger.exception( + "Unhandled API request error", + extra={ + "request_id": request_id, + "error_id": error_id, + "method": request.method, + "path": request.url.path, + "status_code": 500, + "duration_ms": duration_ms, + "client": request.client.host if request.client else "", + }, + ) + return JSONResponse( + status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, + 
headers={"X-Request-ID": request_id, "X-Error-ID": error_id}, + content={ + "status": "error", + "error": { + "type": "internal_server_error", + "message": "Internal server error", + "request_id": request_id, + "error_id": error_id, + }, + "detail": "Internal server error", + "request_id": request_id, + }, + ) + finally: + reset_request_id(token) + + # --------------------------------------------------------------------------- # Helpers # --------------------------------------------------------------------------- @@ -744,30 +844,45 @@ async def prometheus_metrics_v1() -> Response: def _serialize_gpu_data() -> dict[str, Any]: """Detect GPUs and return serialisable dict. Never raises — returns empty on error.""" try: - gpus = detect_gpus() + gpu_status = detect_gpu_status() + gpus = gpu_status["gpus"] sys_mem = detect_system_memory() summary = get_gpu_summary() total_vram_gb = round(sum(g.vram_mb for g in gpus) / 1024, 1) if gpus else 0.0 - gpu_list = [ - { - "name": g.name, - "vram_mb": g.vram_mb, - "vram_gb": g.vram_gb, - "memory_used_mb": g.memory_used_mb, - "memory_free_mb": g.memory_free_mb, - "utilization_pct": g.utilization_pct, - "driver_version": g.driver_version, - "cuda_version": g.cuda_version, - "index": g.index, - } - for g in gpus - ] + gpu_list = [] + for g in gpus: + arch = gpu_architecture_info(g) + gpu_list.append( + { + "name": g.name, + "vram_mb": g.vram_mb, + "vram_gb": g.vram_gb, + "memory_used_mb": g.memory_used_mb, + "memory_free_mb": g.memory_free_mb, + "memory_reserved_mb": max(g.vram_mb - g.memory_used_mb - g.memory_free_mb, 0), + "utilization_pct": g.utilization_pct, + "driver_version": g.driver_version, + "cuda_version": g.cuda_version, + "index": g.index, + "compute_capability": list(arch["compute_capability"]), + "compute_capability_source": arch["compute_capability_source"], + "architecture": arch["architecture"], + "architecture_heuristic": arch["heuristic"], + } + ) return { "gpus": gpu_list, "summary": summary, "total_vram_gb": 
total_vram_gb, + "detection": { + "status": gpu_status.get("status"), + "source": gpu_status.get("source"), + "issues": gpu_status.get("issues", []), + "device_files_present": gpu_status.get("device_files_present", False), + "nvidia_smi": gpu_status.get("nvidia_smi", ""), + }, "system_ram": { "total_gb": sys_mem.total_ram_gb, "available_gb": sys_mem.available_ram_gb, @@ -780,6 +895,13 @@ def _serialize_gpu_data() -> dict[str, Any]: "gpus": [], "summary": "GPU detection unavailable", "total_vram_gb": 0.0, + "detection": { + "status": "error", + "source": "exception", + "issues": [{"source": "api", "code": "exception", "message": str(exc), "severity": "warning", "detail": ""}], + "device_files_present": False, + "nvidia_smi": "", + }, "system_ram": {"total_gb": 0.0, "available_gb": 0.0, "effective_for_llm_gb": 0.0}, } @@ -871,6 +993,22 @@ def clean_home_dir(cls, value: str | None) -> str | None: return cleaned or None +class SetupAssistantRequest(BaseModel): + question: str = Field(..., min_length=1, max_length=2000) + home_dir: str | None = Field( + default=None, + description="Optional NVH_HOME candidate to use while answering setup questions", + ) + + +class SetupHomeRequest(BaseModel): + home_dir: str | None = Field( + default=None, + description="Optional NVH_HOME on the persistent mounted volume", + ) + min_free_gb: float = Field(default=20.0, ge=0) + + @app.get("/v1/system/storage", summary="Inspect rootless persistent storage") async def system_storage( home_dir: str | None = None, @@ -902,6 +1040,33 @@ async def configure_system_storage( ) +@app.get("/v1/system/mount-autopilot", summary="Detect likely persistent mounts") +async def system_mount_autopilot( + min_free_gb: float = 20.0, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Recommend a persistent NVH_HOME without requiring root access.""" + from nvh.integrations.mount_autopilot import mount_autopilot_report + + return 
_response_envelope(mount_autopilot_report(min_free_gb=min_free_gb)) + + +@app.post("/v1/system/mount-autopilot/activate", summary="Activate the best persistent mount") +async def system_mount_autopilot_activate( + request: SetupHomeRequest, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Create and activate the highest-scoring discovered NVH_HOME.""" + from nvh.integrations.mount_autopilot import activate_recommended_mount + + return _response_envelope( + activate_recommended_mount( + min_free_gb=request.min_free_gb, + extra_roots=[request.home_dir] if request.home_dir else None, + ) + ) + + @app.get("/v1/system/runtime", summary="Inspect rootless runtime fallback status") async def system_runtime(_auth: None = Depends(require_auth)) -> dict[str, Any]: """Return whether nvHive can use Python venv/pip or needs micromamba fallback.""" @@ -921,6 +1086,233 @@ async def setup_helper( return _response_envelope(setup_helper_report(home_dir=home_dir)) +@app.get("/v1/setup/mission-control", summary="Return nvWizard setup mission timeline") +async def setup_mission_control( + home_dir: str | None = None, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Return the combined boot, repair, model-fit, and smoke-test timeline.""" + from nvh.integrations.mission_control import mission_control_report + + return _response_envelope(mission_control_report(home_dir=home_dir)) + + +@app.get("/v1/setup/auto-repair", summary="Preview safe rootless auto-repairs") +async def setup_auto_repair( + home_dir: str | None = None, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Return the safe repair queue without running installers or downloads.""" + from nvh.integrations.auto_repair import auto_repair_plan + + return _response_envelope(auto_repair_plan(home_dir=home_dir)) + + +@app.post("/v1/setup/repair-workspace", summary="Run safe rootless workspace repairs") +async def setup_repair_workspace( + request: SetupHomeRequest, + _auth: None = 
Depends(require_auth), +) -> dict[str, Any]: + """Run idempotent repairs that leave user files, models, and app data intact.""" + from nvh.integrations.auto_repair import run_safe_repairs + + return _response_envelope(run_safe_repairs(home_dir=request.home_dir)) + + +@app.get("/v1/setup/smoke-tests", summary="Run lightweight app smoke checks") +async def setup_smoke_tests( + home_dir: str | None = None, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Check installed apps without running destructive actions.""" + from nvh.integrations.smoke_tests import smoke_test_report + + return _response_envelope(smoke_test_report(home_dir=home_dir)) + + +@app.get("/v1/setup/model-fit", summary="Recommend models by VRAM and disk fit") +async def setup_model_fit( + home_dir: str | None = None, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Return the simplified student model queue and fit scores.""" + from nvh.integrations.model_fit import model_fit_report + + return _response_envelope(model_fit_report(home_dir=home_dir)) + + +@app.get("/v1/setup/production-readiness", summary="Return conservative production readiness gates") +async def setup_production_readiness( + home_dir: str | None = None, + target_vm_validated: bool | None = None, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Aggregate rootless, storage, model, smoke, and target-VM release gates.""" + from nvh.integrations.production_readiness import production_readiness_report + + return _response_envelope( + production_readiness_report( + home_dir=home_dir, + target_vm_validated=target_vm_validated, + ) + ) + + +@app.get("/v1/setup/diagnostics", summary="Return a redacted setup diagnostics report") +async def setup_diagnostics( + request: Request, + home_dir: str | None = None, + include_logs: bool = True, + log_lines: int = 80, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Package rootless setup state, recent jobs, and redacted log warnings.""" 
+ from nvh.integrations.diagnostics import diagnostics_report + + report = diagnostics_report( + home_dir=home_dir, + request_id=request.headers.get("x-request-id"), + include_logs=include_logs, + log_lines=log_lines, + ) + logger.info( + "Setup diagnostics report generated", + extra={ + "request_id": report.get("request_id") or "", + "path": request.url.path, + }, + ) + return _response_envelope(report) + + +@app.post("/v1/setup/assistant", summary="Ask the local setup helper") +async def setup_assistant( + request: SetupAssistantRequest, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Answer setup questions with local state, receipts, and deterministic rules.""" + from nvh.integrations.setup_agent import setup_assistant_reply + + return _response_envelope( + setup_assistant_reply(request.question, home_dir=request.home_dir) + ) + + +@app.get("/v1/setup/catalog", summary="Load the setup catalog") +async def setup_catalog( + refresh: bool = False, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Return remote/cache/bundled setup catalog data for the wizard.""" + from nvh.integrations.catalog import load_setup_catalog + + return _response_envelope(load_setup_catalog(refresh=refresh)) + + +@app.get("/v1/setup/compatibility", summary="Inspect host/app compatibility") +async def setup_compatibility( + home_dir: str | None = None, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Return host facts and per-app compatibility checks for nvWizard.""" + from nvh.integrations.compatibility import compatibility_report + + return _response_envelope(compatibility_report(home_dir=home_dir)) + + +@app.get("/v1/setup/boot-preflight", summary="Return boot-time VM image preflight") +async def setup_boot_preflight( + home_dir: str | None = None, + recheck: bool = False, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Return the persisted boot preflight, running it if needed.""" + from nvh.integrations.boot_preflight 
import boot_preflight_status, run_boot_preflight + + if recheck: + result = await asyncio.to_thread(run_boot_preflight, home_dir=home_dir) + app.state.boot_preflight = result + return _response_envelope(result) + + cached = getattr(app.state, "boot_preflight", None) + if cached and not home_dir: + return _response_envelope(cached) + result = await asyncio.to_thread(boot_preflight_status, home_dir=home_dir, run_if_missing=True) + return _response_envelope(result) + + +@app.post("/v1/setup/boot-preflight/recheck", summary="Run boot-time VM image preflight now") +async def setup_boot_preflight_recheck( + home_dir: str | None = None, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Force a boot preflight refresh after the user repairs something.""" + from nvh.integrations.boot_preflight import run_boot_preflight + + result = await asyncio.to_thread(run_boot_preflight, home_dir=home_dir) + app.state.boot_preflight = result + return _response_envelope(result) + + +@app.get("/v1/setup/receipts", summary="List rootless install receipts") +async def setup_receipts( + kind: str | None = None, + status_filter: str | None = None, + limit: int = 100, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + """Return install receipts written under NVH_HOME.""" + from nvh.integrations.receipts import list_receipts, receipt_summary + + safe_limit = max(1, min(limit, 500)) + receipts = list_receipts(kind=kind, status=status_filter, limit=safe_limit) + summary = receipt_summary() + return _response_envelope({ + "receipts": receipts, + "count": len(receipts), + "summary": {key: value for key, value in summary.items() if key != "receipts"}, + }) + + +def _receipt_or_404(receipt_id: str) -> dict[str, Any]: + from nvh.integrations.receipts import load_receipt + + try: + return load_receipt(receipt_id) + except KeyError as exc: + raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(exc)) from exc + + +@app.get("/v1/setup/receipts/{receipt_id}", 
summary="Get one install receipt") +async def setup_receipt( + receipt_id: str, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + return _response_envelope(_receipt_or_404(receipt_id)) + + +@app.get("/v1/setup/receipts/{receipt_id}/repair-plan", summary="Preview receipt repair") +async def setup_receipt_repair_plan( + receipt_id: str, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + from nvh.integrations.receipts import repair_plan + + _receipt_or_404(receipt_id) + return _response_envelope(repair_plan(receipt_id)) + + +@app.get("/v1/setup/receipts/{receipt_id}/uninstall-plan", summary="Preview receipt uninstall") +async def setup_receipt_uninstall_plan( + receipt_id: str, + _auth: None = Depends(require_auth), +) -> dict[str, Any]: + from nvh.integrations.receipts import uninstall_plan + + _receipt_or_404(receipt_id) + return _response_envelope(uninstall_plan(receipt_id)) + + # -- /v1/system/info ---------------------------------------------------------- @app.get("/v1/system/info", summary="Combined system status — GPU + providers + budget in one call") @@ -2422,7 +2814,7 @@ async def comfyui_start( class StudioPackInstallRequest(BaseModel): pack_ids: list[str] = Field( default_factory=lambda: ["starter"], - description="Studio pack ids or bundle names: starter, all, llms, agents, comfy, game", + description="Studio pack ids or bundle names: starter, all, llms, agents, comfy, game, creative, music", ) force_update: bool = False @@ -2498,7 +2890,7 @@ async def studio_pack_install( request: StudioPackInstallRequest, _auth: None = Depends(require_auth), ) -> StreamingResponse: - """Install LLM, agent, ComfyUI, and game-dev packs without root access.""" + """Install LLM, agent, ComfyUI, game-dev, creative, and music packs without root access.""" return StreamingResponse( _studio_pack_install_stream(request), media_type="text/event-stream", diff --git a/nvh/catalog/__init__.py b/nvh/catalog/__init__.py new file mode 100644 index 
0000000..8c6a7a3 --- /dev/null +++ b/nvh/catalog/__init__.py @@ -0,0 +1 @@ +"""Bundled nvHive setup catalog.""" diff --git a/nvh/catalog/nvhive-catalog.json b/nvh/catalog/nvhive-catalog.json new file mode 100644 index 0000000..3673515 --- /dev/null +++ b/nvh/catalog/nvhive-catalog.json @@ -0,0 +1,162 @@ +{ + "schema_version": 1, + "updated_at": "2026-04-28T00:00:00Z", + "channel": "bundled", + "profiles": [ + { + "id": "student", + "title": "AI Starter", + "description": "First-time local AI lab for chat, homework, coding help, GitHub, and the helper agent.", + "pack_ids": ["starter"], + "model_ids": ["gemma3-4b", "qwen3-8b", "nomic-embed-text"] + }, + { + "id": "creator", + "title": "Graphics Creator Studio", + "description": "ComfyUI, Blender, and vision helpers for graphics, image, video, and 3D projects.", + "pack_ids": ["creative", "comfy"], + "model_ids": ["gemma3-4b", "llava-7b"] + }, + { + "id": "music", + "title": "Music Producer Studio", + "description": "AI music generation, stem separation, transcription, audio cleanup, and rootless DAW helpers.", + "pack_ids": ["music"], + "model_ids": ["gemma3-4b"] + }, + { + "id": "agent", + "title": "Agent Builder", + "description": "Local agent libraries, a coding model, and embeddings.", + "pack_ids": ["agents"], + "model_ids": ["qwen25-coder-7b", "nomic-embed-text"] + }, + { + "id": "game", + "title": "Game Dev Lab", + "description": "Game prototyping, Blender assets, and mod helper workspace.", + "pack_ids": ["game", "creative"], + "model_ids": ["qwen25-coder-7b", "llava-7b"] + }, + { + "id": "full", + "title": "Power User Workstation", + "description": "Everything nvHive can install without root access, guarded by host checks.", + "pack_ids": ["all"], + "model_ids": ["recommended"] + } + ], + "packs": [ + { + "id": "rootless-ollama", + "title": "Rootless Ollama Runtime", + "category": "runtime", + "install_command": "nvh studio --install rootless-ollama -y", + "recommended": true + }, + { + "id": "llm-starter", 
+ "title": "Top Local LLM Starter", + "category": "llm", + "install_command": "nvh studio --install llm-starter -y", + "recommended": true + }, + { + "id": "agent-lab", + "title": "Agent Lab", + "category": "agent", + "install_command": "nvh studio --install agent-lab -y", + "recommended": true + }, + { + "id": "comfyui-power-nodes", + "title": "ComfyUI Power Nodes", + "category": "comfyui", + "install_command": "nvh studio --install comfy -y", + "recommended": true + }, + { + "id": "blender-creative", + "title": "Blender Creative Studio", + "category": "creative", + "latest_version": "4.5.4", + "install_command": "nvh studio --install creative -y", + "recommended": true + }, + { + "id": "ace-step-music", + "title": "ACE-Step Music Generator", + "category": "music", + "install_command": "nvh studio --install music -y", + "recommended": true + }, + { + "id": "music-producer-lab", + "title": "Music Producer AI Lab", + "category": "music", + "install_command": "nvh studio --install music -y", + "recommended": true + }, + { + "id": "music-daw-helper", + "title": "Rootless DAW Helper", + "category": "music", + "install_command": "nvh studio --install music -y", + "recommended": true + } + ], + "models": [ + { + "id": "gemma3-4b", + "title": "Gemma 3 4B", + "provider": "ollama", + "install_target": "gemma3:4b", + "recommended_vram_gb": 6, + "category": "chat" + }, + { + "id": "qwen3-8b", + "title": "Qwen 3 8B", + "provider": "ollama", + "install_target": "qwen3:8b", + "recommended_vram_gb": 8, + "category": "chat" + }, + { + "id": "qwen25-coder-7b", + "title": "Qwen 2.5 Coder 7B", + "provider": "ollama", + "install_target": "qwen2.5-coder:7b", + "recommended_vram_gb": 8, + "category": "code" + }, + { + "id": "nomic-embed-text", + "title": "Nomic Embed Text", + "provider": "ollama", + "install_target": "nomic-embed-text", + "recommended_vram_gb": 0, + "category": "embedding" + } + ], + "comfyui_examples": [ + { + "id": "z-image-turbo-text-to-image", + "title": 
"Z-Image-Turbo Text to Image", + "category": "text-to-image", + "recommended_vram_gb": 8 + }, + { + "id": "wan22-5b-video-generation", + "title": "Wan 2.2 5B Video Generation", + "category": "text-to-video", + "recommended_vram_gb": 8 + }, + { + "id": "flux-controlnet-canny-depth", + "title": "FLUX.1 ControlNet Canny and Depth", + "category": "controlnet", + "recommended_vram_gb": 16 + } + ] +} diff --git a/nvh/cli/main.py b/nvh/cli/main.py index 80ec5d4..38e8114 100644 --- a/nvh/cli/main.py +++ b/nvh/cli/main.py @@ -19,6 +19,7 @@ import webbrowser from decimal import Decimal from pathlib import Path +from typing import Any # ---------------------------------------------------------------------- # Windows asyncio proactor GC crash workaround @@ -1693,7 +1694,7 @@ def convene_cmd( preset: str | None = typer.Option( None, "--cabinet", help="Agent cabinet: executive, engineering," - " security_review, code_review, product, data, full_board", + " security_review, code_review, product, product_resilience, data, full_board", ), num_agents: int | None = typer.Option( None, "--num-agents", "-n", @@ -7124,7 +7125,11 @@ def _is_package_installed(module_name: str) -> bool: return importlib.util.find_spec(module_name) is not None -def _try_install_node_no_root(console: Console) -> tuple[str | None, str | None]: +def _try_install_node_no_root( + console: Console, + *, + assume_yes: bool = False, +) -> tuple[str | None, str | None]: """Offer to auto-install Node.js into the user's home without root. Uses ``fnm`` (Fast Node Manager) - a single-binary, no-root Node manager: @@ -7134,7 +7139,8 @@ def _try_install_node_no_root(console: Console) -> tuple[str | None, str | None] calls in ``nvh webui`` find it. Returns ``(node_path, npm_path)`` on success, ``(None, None)`` otherwise. - Interactive: prompts the user before making any network call. + Interactive by default: prompts the user before making any network call. + Pass ``assume_yes=True`` from one-click setup paths. 
Windows is intentionally out of scope — users there should use winget (which requires admin for some installs but is the native path). @@ -7155,14 +7161,15 @@ def _try_install_node_no_root(console: Console) -> tuple[str | None, str | None] env.update(layout.env()) env["FNM_DIR"] = str(fnm_root) - try: - answer = console.input( - f" Auto-install Node.js into {fnm_root} via fnm (no root)? [Y/n] " - ).strip().lower() - except (EOFError, KeyboardInterrupt): - return None, None - if answer in ("n", "no"): - return None, None + if not assume_yes: + try: + answer = console.input( + f" Auto-install Node.js into {fnm_root} via fnm (no root)? [Y/n] " + ).strip().lower() + except (EOFError, KeyboardInterrupt): + return None, None + if answer in ("n", "no"): + return None, None # Step 1: install fnm if missing. The official installer writes to # FNM_DIR and optionally edits shell rc; we bypass the rc @@ -7581,7 +7588,7 @@ def studio( None, "--install", "-i", - help="Install a pack id or bundle: starter, all, llms, agents, comfy, game", + help="Install a pack id or bundle: starter, all, llms, agents, claw, comfy, game, creative, music", ), install_models: str | None = typer.Option( None, @@ -7596,7 +7603,7 @@ def studio( ), yes: bool = typer.Option(False, "-y", "--yes", help="Skip confirmation prompts"), ): - """Install rootless AI Studio packs for LLMs, agents, ComfyUI, and games.""" + """Install rootless AI Studio packs for LLMs, agents, ComfyUI, games, and music.""" from nvh.integrations.storage import ensure_storage from nvh.integrations.studio_packs import ( catalog_with_status, @@ -7753,7 +7760,7 @@ def workstation( with_studio_packs: bool = typer.Option( False, "--with-studio-packs", - help="Install rootless LLM, agent, ComfyUI-node, and game-dev packs", + help="Install rootless LLM, agent, ComfyUI-node, game-dev, creative, and music packs", ), port: int = typer.Option(3000, "--port", help="WebUI port"), api_port: int = typer.Option(8000, "--api-port", help="API server 
port"), @@ -7800,6 +7807,18 @@ def workstation( storage = ensure_storage(home_dir, min_free_gb=min_free_gb) profile = detect_workstation_profile(home_dir=storage.layout.home) + boot_report: dict[str, Any] | None = None + recommended_torch_profile = "nvidia-cu121" + try: + from nvh.integrations.boot_preflight import run_boot_preflight + + boot_report = run_boot_preflight(home_dir=storage.layout.home) + recommended_torch_profile = ( + boot_report.get("compatibility", {}).get("recommended_torch_profile") + or recommended_torch_profile + ) + except Exception as exc: + console.print(f" [yellow]![/yellow] Boot preflight skipped: {exc}") console.print("\n[bold green]NVHive Student Workstation[/bold green]") console.print(" [dim]Target: Linux GPU desktop or forwarded cloud session[/dim]\n") console.print(f" NVH_HOME: [bold]{storage.layout.home}[/bold]") @@ -7820,6 +7839,17 @@ def workstation( if profile.recommended_chat_models: console.print(f" Chat models: {', '.join(profile.recommended_chat_models)}") console.print(f" ComfyUI: {', '.join(profile.recommended_comfy_profiles)} profiles\n") + if boot_report: + agent_helper = boot_report.get("agent_helper", {}) + console.print(f" Boot check: [bold]{boot_report.get('summary')}[/bold]") + console.print(f" AI helper: {agent_helper.get('summary', 'Offline setup helper is available.')}") + if boot_report.get("changes"): + for change in boot_report.get("changes", [])[:5]: + console.print( + f" [yellow]![/yellow] {change.get('label')}: " + f"{change.get('before')} -> {change.get('after')}" + ) + console.print() for note in profile.notes: console.print(f" [yellow]![/yellow] {note}") @@ -7900,7 +7930,7 @@ async def _install_comfy() -> None: from nvh.integrations.comfyui import install_comfyui last_log = 0.0 - async for event in install_comfyui(torch_profile="nvidia-cu130"): + async for event in install_comfyui(torch_profile=recommended_torch_profile): kind = event.get("event", "") message = event.get("message", "") now = 
_time.monotonic() @@ -7959,6 +7989,7 @@ async def _install_packs() -> None: yes=True, no_api=False, api_port=api_port, + dev=False, ) @@ -7986,11 +8017,17 @@ def webui( 8000, "--api-port", help="Port the API server is expected to listen on", ), + dev: bool = typer.Option( + False, "--dev", + help="Run the Next.js development server instead of the production server", + ), ): """Install and launch the nvHive web UI. The web UI is optional — nvHive works fully from the CLI. - This command installs Node.js dependencies and starts the Next.js dev server. + This command installs Node.js dependencies, builds the WebUI when needed, + and starts the optimized Next.js production server. Use --dev only when + editing the WebUI source. First run installs dependencies (~30 seconds). Subsequent runs start instantly. @@ -7999,6 +8036,7 @@ def webui( nvh webui # install (if needed) and launch on port 3000 nvh webui --install # install dependencies only nvh webui --port 8080 # launch on a different port + nvh webui --dev # run Next.js dev mode for WebUI development nvh webui --clean # wipe node_modules/.next, keep source nvh webui --uninstall # remove the downloaded Web UI entirely """ @@ -8134,47 +8172,133 @@ def _safety_check(target: str) -> None: web_dir = candidate break + web_ref = os.environ.get("NVH_WEB_REF") or f"v{__version__}" + + def _download_webui_zip(destination: str, ref: str) -> bool: + """Download web/ from GitHub without requiring git.""" + import zipfile + from urllib.request import Request, urlopen + + ref_kind = "heads" if ref in {"main", "master"} or "/" in ref else "tags" + zip_url = f"https://github.com/thatcooperguy/nvHive/archive/refs/{ref_kind}/{ref}.zip" + zip_path = destination + ".zip" + extract_dir = destination + ".ziptmp" + try: + for path in (zip_path, extract_dir): + if os.path.isdir(path): + shutil.rmtree(path, ignore_errors=True) + elif os.path.exists(path): + os.remove(path) + + req = Request(zip_url, headers={"User-Agent": 
"nvhive-webui-bootstrap"}) + with urlopen(req, timeout=120) as response, open(zip_path, "wb") as fh: + while True: + chunk = response.read(1024 * 1024) + if not chunk: + break + fh.write(chunk) + + os.makedirs(extract_dir, exist_ok=True) + extract_root = os.path.abspath(extract_dir) + with zipfile.ZipFile(zip_path) as zf: + for member in zf.infolist():  # guard against zip-slip path traversal + target = os.path.abspath(os.path.join(extract_dir, member.filename)) + if not target.startswith(extract_root + os.sep) and target != extract_root: + raise ValueError(f"unsafe zip member: {member.filename}") + zf.extractall(extract_dir) + + src_web = "" + for name in os.listdir(extract_dir): + candidate_web = os.path.join(extract_dir, name, "web") + if os.path.isfile(os.path.join(candidate_web, "package.json")): + src_web = candidate_web + break + if not src_web: + console.print("[red]Downloaded archive has no web/ directory.[/red]") + return False + + if os.path.isdir(destination): + shutil.rmtree(destination, ignore_errors=True) + shutil.move(src_web, destination) + return True + except Exception as exc: + console.print(f"[red]Web UI archive download failed:[/red] {exc}") + return False + finally: + if os.path.exists(zip_path): + try: + os.remove(zip_path) + except OSError: + pass + if os.path.isdir(extract_dir): + shutil.rmtree(extract_dir, ignore_errors=True) + if not web_dir: # Attempt to download the web/ directory from the upstream repo # so pip-installed users get a working `nvh webui` out of the box. git = shutil.which("git") - if not git: - console.print("[red]Web UI not found.[/red]") - console.print( - "Install git so nvHive can fetch the Web UI, " - "or install nvHive from source (git clone)." 
- ) - raise typer.Exit(1) console.print("[bold]Downloading Web UI (first run)...[/bold]") os.makedirs(os.path.dirname(cache_web_dir), exist_ok=True) tmp_clone = cache_web_dir + ".tmp" if os.path.isdir(tmp_clone): shutil.rmtree(tmp_clone, ignore_errors=True) - result = subprocess.run( - [ - "git", "clone", "--depth", "1", - "https://github.com/thatcooperguy/nvHive.git", - tmp_clone, - ], - capture_output=True, - text=True, - env=webui_env, - ) - if result.returncode != 0: - console.print("[red]Failed to download Web UI.[/red]") - console.print(f"[dim]{result.stderr.strip()}[/dim]") - raise typer.Exit(1) + downloaded = False + if git: + result = subprocess.run( + [ + git, "clone", "--depth", "1", "--branch", web_ref, + "https://github.com/thatcooperguy/nvHive.git", + tmp_clone, + ], + capture_output=True, + text=True, + env=webui_env, + ) + if result.returncode != 0 and web_ref != "main": + console.print( + f"[yellow]Could not fetch WebUI ref {web_ref}; trying main.[/yellow]" + ) + if os.path.isdir(tmp_clone): + shutil.rmtree(tmp_clone, ignore_errors=True) + result = subprocess.run( + [ + git, "clone", "--depth", "1", "--branch", "main", + "https://github.com/thatcooperguy/nvHive.git", + tmp_clone, + ], + capture_output=True, + text=True, + env=webui_env, + ) + if result.returncode == 0: + src_web = os.path.join(tmp_clone, "web") + if os.path.isdir(src_web): + if os.path.isdir(cache_web_dir): + shutil.rmtree(cache_web_dir, ignore_errors=True) + shutil.move(src_web, cache_web_dir) + downloaded = True + else: + console.print("[yellow]Downloaded repo has no web/ directory.[/yellow]") + else: + console.print("[yellow]git clone failed; trying GitHub archive fallback.[/yellow]") + stderr = result.stderr.strip() + if stderr: + console.print(f"[dim]{stderr}[/dim]") - src_web = os.path.join(tmp_clone, "web") - if not os.path.isdir(src_web): - console.print("[red]Upstream repo has no web/ directory.[/red]") - raise typer.Exit(1) + if not downloaded: + downloaded = 
_download_webui_zip(cache_web_dir, web_ref) + if not downloaded and web_ref != "main": + console.print("[yellow]Trying WebUI archive from main.[/yellow]") + downloaded = _download_webui_zip(cache_web_dir, "main") - if os.path.isdir(cache_web_dir): - shutil.rmtree(cache_web_dir, ignore_errors=True) - shutil.move(src_web, cache_web_dir) shutil.rmtree(tmp_clone, ignore_errors=True) + if not downloaded: + console.print("[red]Failed to download Web UI.[/red]") + console.print( + "Check network access, or install from a GitHub release/source checkout." + ) + raise typer.Exit(1) web_dir = cache_web_dir console.print(f"[green]Web UI downloaded to {cache_web_dir}[/green]") @@ -8198,7 +8322,7 @@ def _safety_check(target: str) -> None: # use fnm (Fast Node Manager): single-binary installer, drops Node # under ~/.local/share/fnm and adds to PATH for this process. # Windows stays with winget guidance (requires user action). - node, npm = _try_install_node_no_root(console) + node, npm = _try_install_node_no_root(console, assume_yes=yes) if not node or not npm: console.print("[red]Auto-install failed or declined.[/red]") console.print("Install Node.js 18+:") @@ -8239,6 +8363,25 @@ def _safety_check(target: str) -> None: raise typer.Exit(1) console.print("[green]Dependencies installed.[/green]") + def _web_build_ready(path: str) -> bool: + return os.path.isfile(os.path.join(path, ".next", "BUILD_ID")) + + if not dev and not _web_build_ready(web_dir): + console.print("[bold]Building optimized Web UI (first run)...[/bold]") + result = subprocess.run( + [npm, "run", "build"], + cwd=web_dir, + env=webui_env, + ) + if result.returncode != 0: + console.print("[red]Web UI build failed.[/red]") + console.print( + "Run [bold]nvh webui --clean[/bold] and try again, " + "or use [bold]nvh webui --dev[/bold] while developing." + ) + raise typer.Exit(1) + console.print("[green]Optimized Web UI build ready.[/green]") + if install_only: console.print("[green]Web UI ready. 
Run 'nvh webui' to launch.[/green]") return @@ -8386,11 +8529,8 @@ def _api_reachable(p: int, timeout: float = 0.5) -> bool: console.print() try: - subprocess.run( - [npm, "run", "dev", "--", "-p", str(chosen_port)], - cwd=web_dir, - env=webui_env, - ) + command = [npm, "run", "dev" if dev else "start", "--", "-p", str(chosen_port)] + subprocess.run(command, cwd=web_dir, env=webui_env) except KeyboardInterrupt: console.print("\n[dim]Web UI stopped.[/dim]") finally: @@ -9085,6 +9225,40 @@ def _fail(check: str, detail: str = "", fix: str = "") -> None: "Run `nvh doctor --storage --home-dir /path/on/mounted/volume/nvhive`", ) + try: + from nvh.integrations.receipts import receipt_summary + + receipts = receipt_summary() + detail = ( + f"{receipts['count']} receipt(s), " + f"{receipts['unhealthy']} need attention, root {receipts['root']}" + ) + if receipts["unhealthy"]: + _warn( + "Install receipts", + detail, + "Open the setup wizard or rerun the matching `nvh studio` / `nvh workstation` command.", + ) + else: + _pass("Install receipts", detail) + except Exception as e: + _warn("Install receipts", str(e)) + + try: + from nvh.integrations.catalog import catalog_status + + catalog = catalog_status(refresh=False) + detail = ( + f"{catalog.get('source')} catalog, {catalog.get('profile_count', 0)} profiles, " + f"{catalog.get('model_count', 0)} models" + ) + if catalog.get("error"): + _warn("Setup catalog", f"{detail}; {catalog['error']}") + else: + _pass("Setup catalog", detail) + except Exception as e: + _warn("Setup catalog", str(e)) + # 2. 
Config file exists and is valid YAML from nvh.config.settings import DEFAULT_CONFIG_PATH if not DEFAULT_CONFIG_PATH.exists(): diff --git a/nvh/core/agent_matching.py b/nvh/core/agent_matching.py index 565fcb1..dc28c2d 100644 --- a/nvh/core/agent_matching.py +++ b/nvh/core/agent_matching.py @@ -67,6 +67,7 @@ class AgentAssignment: "Open Source Maintainer": ["code_review", "analysis", "creative"], "Blockchain/Web3 Engineer": ["code_generation", "security", "reasoning"], "Game Developer": ["code_generation", "optimization", "creative"], + "Underdog Student Advocate": ["analysis", "debugging", "testing", "devops"], } # --------------------------------------------------------------------------- diff --git a/nvh/core/agents.py b/nvh/core/agents.py index 540561d..6a73632 100644 --- a/nvh/core/agents.py +++ b/nvh/core/agents.py @@ -140,6 +140,23 @@ class PersonaTemplate: triggers=["ux", "user experience", "design", "wireframe", "prototype", "usability", "accessibility", "information architecture", "user flow", "persona"], ), + PersonaTemplate( + role="Underdog Student Advocate", + expertise="beginner onboarding, rootless Linux cloud desktops, self-healing setup, failure-mode discovery", + perspective="skeptical review from a smart student with no sudo, limited time, shifting VM images, and one persistent file mount", + triggers=["student", "beginner", "wizard", "self-healing", "self healing", "rootless", + "install", "setup", "comfyui", "gpu", "driver", "cuda", "mount", + "persistent", "cloud desktop", "geforce now", "easy to use", "broken"], + weight_boost=0.18, + system_prompt=( + "You are the **Underdog Student Advocate**. Your job is to be the useful skeptic " + "in the room: assume the user is smart but busy, has no root access, may be on a " + "fresh Linux GPU VM, and may lose everything outside the mounted file volume. 
" + "Look for confusing copy, hidden manual steps, fragile installs, old CUDA or Python " + "versions, missing disk checks, slow downloads, and places where nvHive should heal " + "itself or clearly explain the next safe action." + ), + ), PersonaTemplate( role="Engineering Manager", expertise="team leadership, project management, hiring, engineering culture", @@ -278,7 +295,7 @@ def generate_agents( role=template.role, expertise=template.expertise, perspective=template.perspective, - system_prompt=_build_system_prompt(template, query), + system_prompt=template.system_prompt or _build_system_prompt(template, query), weight_boost=template.weight_boost, )) @@ -348,6 +365,11 @@ def generate_agents_with_llm( if t.role in ("Product Manager", "UX Designer", "Engineering Manager", "CEO / Business Strategist") ], + "product_resilience": [ + t for t in _PERSONA_POOL + if t.role in ("Underdog Student Advocate", "Product Manager", "UX Designer", + "DevOps/SRE Engineer", "QA/Test Engineer", "ML/AI Engineer") + ], "data": [ t for t in _PERSONA_POOL if t.role in ("Data Engineer", "Database Administrator", "ML/AI Engineer", diff --git a/nvh/integrations/auto_repair.py b/nvh/integrations/auto_repair.py new file mode 100644 index 0000000..ea3183c --- /dev/null +++ b/nvh/integrations/auto_repair.py @@ -0,0 +1,136 @@ +"""Safe rootless setup repair planning and execution.""" + +from __future__ import annotations + +from pathlib import Path +from typing import Any + +from nvh.integrations.catalog import load_setup_catalog +from nvh.integrations.comfyui import detect_comfyui, write_example_pack +from nvh.integrations.receipts import receipt_summary +from nvh.integrations.storage import ensure_storage, storage_layout, storage_status, write_env_file +from nvh.integrations.studio_packs import catalog_with_status + + +def _action( + action_id: str, + title: str, + *, + status: str, + summary: str, + safe_to_auto_run: bool, + action_type: str = "repair", 
button_action_id: str | None = None, +) -> dict[str, Any]: + return { + "id": action_id, + "title": title, + "status": status, + "summary": summary, + "safe_to_auto_run": safe_to_auto_run, + "action_type": action_type, + "button_action_id": button_action_id or action_id, + } + + +def auto_repair_plan(home_dir: str | Path | None = None) -> dict[str, Any]: + """Return a queue of safe repairs plus explicit user actions.""" + storage = storage_status(home_dir=home_dir) + layout = storage.layout + comfy = detect_comfyui() + receipts = receipt_summary() + packs = catalog_with_status().get("packs", []) + pack_by_id = {pack.get("id"): pack for pack in packs} + agent_installed = bool(pack_by_id.get("agent-lab", {}).get("status", {}).get("installed")) + actions: list[dict[str, Any]] = [] + storage_auto_safe = storage.ok and storage.configured_by != "default" + + actions.append(_action( + "storage-env-file", + "Rebuild shell environment file", + status=("queued" if storage_auto_safe else "needs-user" if storage.configured_by == "default" else "blocked"), + summary=f"Write activation exports to {layout.home / 'nvh-env.sh'}.", + safe_to_auto_run=storage_auto_safe, + button_action_id="repair-workspace", + )) + actions.append(_action( + "catalog-cache", + "Verify setup catalog fallback", + status="queued", + summary="Ensure the setup catalog can load from cache or the bundled fallback.", + safe_to_auto_run=True, + button_action_id="repair-workspace", + )) + if comfy.get("installed") and not comfy.get("examples_installed"): + actions.append(_action( + "comfyui-examples", + "Repair ComfyUI starter examples", + status="queued", + summary="Rewrite the nvHive example manifest and README into the ComfyUI install.", + safe_to_auto_run=True, + button_action_id="repair-workspace", + )) + if receipts.get("unhealthy"): + actions.append(_action( + "receipts", + "Review unhealthy receipts", + status="needs-user", + summary=f"{receipts['unhealthy']} install receipt(s) point at missing files or 
launchers.", + safe_to_auto_run=False, + button_action_id="repair-receipts", + )) + if not agent_installed: + actions.append(_action( + "agent-lab", + "Install Local Agent Lab", + status="needs-user", + summary="Install the fuller local AI helper. This may download Python packages, so it asks first.", + safe_to_auto_run=False, + action_type="install", + button_action_id="agent-lab", + )) + + auto_count = sum(1 for action in actions if action["safe_to_auto_run"] and action["status"] == "queued") + return { + "summary": f"{auto_count} safe repair(s) can run automatically; downloads stay explicit.", + "auto_count": auto_count, + "needs_user_count": sum(1 for action in actions if action["status"] == "needs-user"), + "actions": actions, + } + + +def run_safe_repairs(home_dir: str | Path | None = None) -> dict[str, Any]: + """Run only idempotent repairs that do not install large packages or models.""" + plan = auto_repair_plan(home_dir=home_dir) + completed: list[dict[str, Any]] = [] + skipped: list[dict[str, Any]] = [] + errors: list[dict[str, Any]] = [] + storage = storage_status(home_dir=home_dir) + + for action in plan["actions"]: + if not action["safe_to_auto_run"] or action["status"] != "queued": + skipped.append({**action, "reason": "Requires user approval or is blocked."}) + continue + try: + if action["id"] == "storage-env-file": + ensure_storage(storage.layout.home, activate=False) + env_file = write_env_file(storage_layout(storage.layout.home)) + completed.append({**action, "result": str(env_file)}) + elif action["id"] == "catalog-cache": + loaded = load_setup_catalog(refresh=False) + completed.append({**action, "result": loaded.get("source")}) + elif action["id"] == "comfyui-examples": + examples_dir = write_example_pack() + completed.append({**action, "result": str(examples_dir)}) + else: + skipped.append({**action, "reason": "No safe repair handler."}) + except Exception as exc: + errors.append({**action, "error": str(exc)}) + + return { + "summary": 
f"{len(completed)} safe repair(s) completed, {len(skipped)} skipped, {len(errors)} error(s).", + "completed": completed, + "skipped": skipped, + "errors": errors, + "plan": plan, + } diff --git a/nvh/integrations/boot_preflight.py b/nvh/integrations/boot_preflight.py new file mode 100644 index 0000000..c35016b --- /dev/null +++ b/nvh/integrations/boot_preflight.py @@ -0,0 +1,330 @@ +"""Boot-time VM image preflight for rootless cloud GPU sessions. + +Cloud desktop images can change underneath a persistent user mount. This +module records a small host fingerprint at nvHive startup, compares it to the +previous boot stored under ``NVH_HOME``, and keeps the wizard focused on what +changed. +""" + +from __future__ import annotations + +import hashlib +import json +import os +from datetime import UTC, datetime +from pathlib import Path +from typing import Any + +from nvh.integrations.auto_repair import auto_repair_plan, run_safe_repairs +from nvh.integrations.compatibility import compatibility_report +from nvh.integrations.model_fit import model_fit_report +from nvh.integrations.mount_autopilot import mount_autopilot_report +from nvh.integrations.smoke_tests import smoke_test_report +from nvh.integrations.storage import storage_layout + +STATE_FILENAME = "boot-preflight.json" +STATE_SCHEMA_VERSION = 1 + +_FACT_LABELS = { + "distro": "Base OS", + "kernel": "Kernel", + "machine": "Architecture", + "libc": "glibc", + "python_version": "Python", + "python_strategy": "Python runtime", + "gpu_name": "GPU", + "gpu_memory_total_mb": "GPU memory", + "gpu_memory_free_mb": "GPU free framebuffer", + "gpu_compute_capability": "GPU compute capability", + "gpu_architecture": "GPU architecture", + "gpu_detection_status": "GPU detection", + "driver_version": "NVIDIA driver", + "cuda_version": "CUDA driver API", + "git_available": "Git", + "curl_available": "curl", + "tar_available": "tar", + "node_available": "Node.js", + "node_version": "Node.js version", + "npm_available": "npm", + 
"npm_version": "npm version", + "display_available": "Desktop display", + "storage_home": "NVH_HOME", + "storage_configured_by": "Storage source", + "storage_total_gb": "Storage capacity", + "storage_write_probe_ok": "Storage write probe", + "torch_profile": "PyTorch CUDA profile", +} + +_CRITICAL_FACTS = { + "distro", + "kernel", + "machine", + "libc", + "python_version", + "python_strategy", + "driver_version", + "cuda_version", + "gpu_memory_total_mb", + "gpu_compute_capability", + "gpu_detection_status", + "git_available", + "curl_available", + "tar_available", + "storage_home", + "storage_configured_by", + "storage_total_gb", + "storage_write_probe_ok", + "node_version", + "npm_version", + "torch_profile", +} + + +def _state_path(home_dir: str | Path | None = None) -> Path: + return storage_layout(home_dir).config_dir / STATE_FILENAME + + +def _read_state(home_dir: str | Path | None = None) -> dict[str, Any] | None: + path = _state_path(home_dir) + if not path.exists(): + return None + try: + data = json.loads(path.read_text(encoding="utf-8")) + except Exception: + return None + return data if isinstance(data, dict) else None + + +def _write_state(state: dict[str, Any], home_dir: str | Path | None = None) -> None: + path = _state_path(home_dir) + path.parent.mkdir(parents=True, exist_ok=True) + path.write_text(json.dumps(state, indent=2, sort_keys=True) + "\n", encoding="utf-8") + try: + path.chmod(0o600) + except Exception: + pass + + +def _available(value: str | None) -> bool: + return bool(value and str(value).strip()) + + +def _gb_value(value: Any) -> str: + if value in (None, ""): + return "" + try: + return str(round(float(value), 1)) + except (TypeError, ValueError): + return str(value) + + +def host_fingerprint(report: dict[str, Any]) -> dict[str, Any]: + """Extract stable boot facts from a compatibility report.""" + host = report.get("host", {}) + commands = host.get("commands", {}) + command_versions = host.get("command_versions", {}) + gpu = 
host.get("gpu", {}) + python = host.get("python", {}) + libc = host.get("libc", {}) + display = host.get("display", {}) + storage = host.get("storage", {}) + storage_layout_data = storage.get("layout", {}) + return { + "distro": host.get("distro", ""), + "kernel": host.get("kernel", ""), + "machine": host.get("machine", ""), + "libc": f"{libc.get('name', '')} {libc.get('version', '')}".strip(), + "python_version": python.get("version", ""), + "python_strategy": python.get("strategy", ""), + "gpu_name": gpu.get("name", ""), + "gpu_memory_total_mb": str(gpu.get("memory_total_mb", "")), + "gpu_memory_free_mb": str(gpu.get("memory_free_mb", "")), + "gpu_compute_capability": str(gpu.get("compute_capability", "")), + "gpu_architecture": str(gpu.get("architecture", "")), + "gpu_detection_status": str(gpu.get("detection_status", "")), + "driver_version": gpu.get("driver_version", ""), + "cuda_version": gpu.get("cuda_version", ""), + "git_available": _available(commands.get("git")), + "curl_available": _available(commands.get("curl")), + "tar_available": _available(commands.get("tar")), + "node_available": _available(commands.get("node")), + "node_version": command_versions.get("node", ""), + "npm_available": _available(commands.get("npm")), + "npm_version": command_versions.get("npm", ""), + "display_available": _available(display.get("DISPLAY")) or _available(display.get("WAYLAND_DISPLAY")), + "storage_home": storage_layout_data.get("home", ""), + "storage_configured_by": storage.get("configured_by", ""), + "storage_total_gb": _gb_value(storage.get("total_gb")), + "storage_write_probe_ok": str(storage.get("write_probe_ok", "")), + "torch_profile": str(report.get("recommended_torch_profile", "")), + } + + +def fingerprint_id(fingerprint: dict[str, Any]) -> str: + payload = json.dumps(fingerprint, sort_keys=True, separators=(",", ":")) + return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16] + + +def diff_fingerprints(previous: dict[str, Any] | None, current: 
dict[str, Any]) -> list[dict[str, Any]]: + """Return user-facing boot changes between two host fingerprints.""" + if not previous: + return [] + changes: list[dict[str, Any]] = [] + for key in sorted(set(previous) | set(current)): + before = previous.get(key) + after = current.get(key) + if before == after: + continue + severity = "required" if key in _CRITICAL_FACTS else "recommended" + changes.append( + { + "id": key, + "label": _FACT_LABELS.get(key, key.replace("_", " ").title()), + "before": str(before) if before not in (None, "") else "missing", + "after": str(after) if after not in (None, "") else "missing", + "severity": severity, + "detail": "Re-run compatibility checks before launching installed apps.", + } + ) + return changes + + +def _result_summary( + *, + first_run: bool, + changes: list[dict[str, Any]], + compatibility: dict[str, Any], +) -> str: + issue_count = int(compatibility.get("issue_count", 0) or 0) + blocked_count = int(compatibility.get("blocked_count", 0) or 0) + if first_run: + return f"Boot baseline captured with {issue_count} compatibility item(s)." + if changes: + return f"VM image changed in {len(changes)} place(s); {blocked_count} blocked compatibility item(s)." + if issue_count: + return f"VM image unchanged; {issue_count} compatibility item(s) still need attention." + return "VM image unchanged and app compatibility is ready." + + +def _agent_helper_status(compatibility: dict[str, Any]) -> dict[str, Any]: + agent_app = next( + (app for app in compatibility.get("apps", []) if app.get("id") == "agent-lab"), + {}, + ) + local_ready = agent_app.get("status") == "ready" + action_id = agent_app.get("recommended_action_id") or "agent-lab" + return { + "offline_helper_ready": True, + "local_agent_ready": local_ready, + "mode": "local-agent-ready" if local_ready else "offline-deterministic", + "recommended_action_id": None if local_ready else action_id, + "summary": ( + "Local agent helper is ready for guided setup." 
+ if local_ready + else "Offline setup helper is active; install Local Agent Lab for the fuller AI assistant." + ), + "requirements": agent_app.get("requirements", []), + } + + +def run_boot_preflight(home_dir: str | Path | None = None) -> dict[str, Any]: + """Run and persist the boot preflight under the selected NVH_HOME.""" + previous_state = _read_state(home_dir) + previous_result = previous_state.get("last_result") if previous_state else None + previous_fingerprint = ( + previous_state.get("last_fingerprint") + if previous_state + else None + ) + + compatibility = compatibility_report(home_dir=home_dir) + current_fingerprint = host_fingerprint(compatibility) + current_id = fingerprint_id(current_fingerprint) + previous_id = fingerprint_id(previous_fingerprint) if previous_fingerprint else None + changes = diff_fingerprints(previous_fingerprint, current_fingerprint) + first_run = previous_fingerprint is None + checked_at = datetime.now(UTC).isoformat() + agent_helper = _agent_helper_status(compatibility) + mount_autopilot = mount_autopilot_report() + repair_plan = auto_repair_plan(home_dir=home_dir) + repair_result = None + if os.environ.get("NVH_BOOT_AUTO_REPAIR", "1").lower() not in {"0", "false", "off", "no"}: + repair_result = run_safe_repairs(home_dir=home_dir) + smoke_tests = smoke_test_report(home_dir=home_dir) + model_fit = model_fit_report(home_dir=home_dir) + + result = { + "schema_version": STATE_SCHEMA_VERSION, + "checked_at": checked_at, + "state_file": str(_state_path(home_dir)), + "first_run": first_run, + "changed": bool(changes), + "needs_attention": bool(changes) or not compatibility.get("ready", False), + "fingerprint_id": current_id, + "previous_fingerprint_id": previous_id, + "previous_checked_at": previous_result.get("checked_at") if isinstance(previous_result, dict) else None, + "summary": _result_summary(first_run=first_run, changes=changes, compatibility=compatibility), + "changes": changes, + "agent_helper": agent_helper, + 
"mount_autopilot": mount_autopilot, + "auto_repair": repair_result or repair_plan, + "smoke_tests": smoke_tests, + "model_fit": { + "summary": model_fit.get("summary"), + "detected_vram_gb": model_fit.get("detected_vram_gb"), + "recommended_queue_disk_gb": model_fit.get("recommended_queue_disk_gb"), + "storage_fits_queue": model_fit.get("storage_fits_queue"), + "recommended_ids": model_fit.get("recommended_ids", []), + }, + "compatibility": compatibility, + } + _write_state( + { + "schema_version": STATE_SCHEMA_VERSION, + "last_checked_at": checked_at, + "last_fingerprint": current_fingerprint, + "last_result": result, + }, + home_dir=home_dir, + ) + return result + + +def boot_preflight_status( + home_dir: str | Path | None = None, + *, + run_if_missing: bool = True, +) -> dict[str, Any]: + """Return the latest boot preflight, optionally running it once if absent.""" + state = _read_state(home_dir) + result = state.get("last_result") if state else None + if isinstance(result, dict): + return result + if run_if_missing: + return run_boot_preflight(home_dir=home_dir) + return { + "schema_version": STATE_SCHEMA_VERSION, + "checked_at": None, + "state_file": str(_state_path(home_dir)), + "first_run": True, + "changed": False, + "needs_attention": False, + "fingerprint_id": None, + "previous_fingerprint_id": None, + "previous_checked_at": None, + "summary": "Boot preflight has not run yet.", + "changes": [], + "agent_helper": { + "offline_helper_ready": True, + "local_agent_ready": False, + "mode": "offline-deterministic", + "recommended_action_id": "agent-lab", + "summary": "Offline setup helper is active; boot preflight has not checked Local Agent Lab yet.", + "requirements": [], + }, + "mount_autopilot": None, + "auto_repair": None, + "smoke_tests": None, + "model_fit": None, + "compatibility": None, + } diff --git a/nvh/integrations/catalog.py b/nvh/integrations/catalog.py new file mode 100644 index 0000000..c63c8a9 --- /dev/null +++ b/nvh/integrations/catalog.py 
@@ -0,0 +1,202 @@
+"""Remote setup catalog with bundled fallback.
+
+The catalog lets nvHive update recommended profiles, model picks, and ComfyUI
+starter workflows between package releases. Network access is optional: the
+WebUI and CLI always fall back to a bundled catalog.
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import time
+from importlib import resources
+from pathlib import Path
+from typing import Any
+
+from nvh.integrations.storage import storage_layout
+
+DEFAULT_CATALOG_URL = (
+    "https://raw.githubusercontent.com/thatcooperguy/nvHive/main/"
+    "nvh/catalog/nvhive-catalog.json"
+)
+CATALOG_ENV = "NVH_CATALOG_URL"
+SCHEMA_VERSION = 1
+
+
+def _now() -> str:
+    return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
+
+
+def catalog_cache_dir(*, create: bool = True) -> Path:
+    root = storage_layout().cache_dir / "catalog"
+    if create:
+        root.mkdir(parents=True, exist_ok=True)
+    return root
+
+
+def catalog_cache_path(*, create: bool = True) -> Path:
+    return catalog_cache_dir(create=create) / "nvhive-catalog.json"
+
+
+def _validate_catalog(catalog: dict[str, Any]) -> dict[str, Any]:
+    if int(catalog.get("schema_version", 0)) != SCHEMA_VERSION:
+        raise ValueError("Unsupported catalog schema_version")
+    for key in ("profiles", "models", "packs", "comfyui_examples"):
+        if not isinstance(catalog.get(key), list):
+            raise ValueError(f"Catalog is missing list field: {key}")
+    return catalog
+
+
+def _generated_fallback_catalog() -> dict[str, Any]:
+    from nvh.integrations.comfyui import examples_as_dicts
+    from nvh.integrations.studio_packs import (
+        BLENDER_VERSION,
+        catalog_as_dicts,
+        model_catalog_as_dicts,
+    )
+
+    packs = catalog_as_dicts()
+    for pack in packs:
+        if pack.get("id") == "blender-creative":
+            pack["latest_version"] = BLENDER_VERSION
+
+    return {
+        "schema_version": SCHEMA_VERSION,
+        "updated_at": _now(),
+        "channel": "generated-fallback",
+        "profiles": [
+            {
+                "id": "student",
+                "title": "AI Starter",
+                "pack_ids": ["starter"],
+                "model_ids": ["gemma3-4b", "qwen3-8b", "nomic-embed-text"],
+                "description": "First-time local AI lab for chat, homework, coding help, GitHub, and the helper agent.",
+            },
+            {
+                "id": "creator",
+                "title": "Graphics Creator Studio",
+                "pack_ids": ["creative", "comfy"],
+                "model_ids": ["gemma3-4b", "llava-7b"],
+                "description": "ComfyUI, Blender, image/video workflows, and vision helpers for graphics projects.",
+            },
+            {
+                "id": "music",
+                "title": "Music Producer Studio",
+                "pack_ids": ["music"],
+                "model_ids": ["gemma3-4b"],
+                "description": "AI music generation, stems, transcription, and rootless DAW helpers.",
+            },
+            {
+                "id": "agent",
+                "title": "Agent Builder",
+                "pack_ids": ["agents"],
+                "model_ids": ["qwen25-coder-7b", "nomic-embed-text"],
+                "description": "Local agent libraries, coding model, and embeddings.",
+            },
+            {
+                "id": "full",
+                "title": "Power User Workstation",
+                "pack_ids": ["all"],
+                "model_ids": ["recommended"],
+                "description": "Everything nvHive can install without root access, guarded by host checks.",
+            },
+        ],
+        "packs": packs,
+        "models": model_catalog_as_dicts(),
+        "comfyui_examples": examples_as_dicts(),
+    }
+
+
+def bundled_catalog() -> dict[str, Any]:
+    try:
+        payload = (
+            resources.files("nvh.catalog")
+            .joinpath("nvhive-catalog.json")
+            .read_text(encoding="utf-8")
+        )
+        return _validate_catalog(json.loads(payload))
+    except Exception:
+        return _generated_fallback_catalog()
+
+
+def _read_cached_catalog() -> dict[str, Any] | None:
+    path = catalog_cache_path(create=False)
+    if not path.exists():
+        return None
+    try:
+        return _validate_catalog(json.loads(path.read_text(encoding="utf-8")))
+    except Exception:
+        return None
+
+
+def _write_cached_catalog(catalog: dict[str, Any]) -> None:
+    path = catalog_cache_path(create=True)
+    tmp = path.with_suffix(".json.tmp")
+    tmp.write_text(json.dumps(catalog, indent=2, sort_keys=True) + "\n", encoding="utf-8")
+    tmp.replace(path)
+
+
+def _fetch_remote_catalog(url: str, timeout: float) -> dict[str, Any]:
+    import httpx
+
+    response = httpx.get(url, follow_redirects=True, timeout=timeout)
+    response.raise_for_status()
+    return _validate_catalog(response.json())
+
+
+def load_setup_catalog(
+    *,
+    refresh: bool = False,
+    url: str | None = None,
+    timeout: float = 5.0,
+) -> dict[str, Any]:
+    """Load setup catalog from remote, cache, or bundled fallback."""
+    catalog_url = url or os.environ.get(CATALOG_ENV) or DEFAULT_CATALOG_URL
+    errors: list[str] = []
+
+    if refresh:
+        try:
+            catalog = _fetch_remote_catalog(catalog_url, timeout)
+            catalog["_cached_at"] = _now()
+            _write_cached_catalog(catalog)
+            return {
+                "source": "remote",
+                "url": catalog_url,
+                "catalog": catalog,
+                "error": None,
+            }
+        except Exception as exc:
+            errors.append(str(exc))
+
+    cached = _read_cached_catalog()
+    if cached is not None:
+        return {
+            "source": "cache",
+            "url": catalog_url,
+            "catalog": cached,
+            "error": "; ".join(errors) if errors else None,
+        }
+
+    return {
+        "source": "bundled",
+        "url": catalog_url,
+        "catalog": bundled_catalog(),
+        "error": "; ".join(errors) if errors else None,
+    }
+
+
+def catalog_status(*, refresh: bool = False) -> dict[str, Any]:
+    loaded = load_setup_catalog(refresh=refresh)
+    catalog = loaded["catalog"]
+    return {
+        "source": loaded["source"],
+        "url": loaded["url"],
+        "error": loaded["error"],
+        "schema_version": catalog.get("schema_version"),
+        "updated_at": catalog.get("updated_at"),
+        "profile_count": len(catalog.get("profiles", [])),
+        "pack_count": len(catalog.get("packs", [])),
+        "model_count": len(catalog.get("models", [])),
+        "comfyui_example_count": len(catalog.get("comfyui_examples", [])),
+    }
diff --git a/nvh/integrations/comfyui.py b/nvh/integrations/comfyui.py
index 92c0fb9..30a64b2 100644
--- a/nvh/integrations/comfyui.py
+++ b/nvh/integrations/comfyui.py
@@ -610,6 +610,28 @@ async def install_comfyui(
         "message": "Installed nvHive ComfyUI example pack",
         "examples_dir": str(examples_dir),
     }
+    try:
+        from nvh.integrations.receipts import write_receipt
+
+        write_receipt(
+            kind="comfyui",
+            item_id="workspace",
+            title="ComfyUI Workspace",
+            install_path=app_dir,
+            source_urls=[COMFYUI_REPO_URL],
+            files=[
+                str(examples_dir / "examples.json"),
+                str(examples_dir / "README.md"),
+            ],
+            metadata={
+                "torch_profile": torch_profile,
+                "venv_python": str(venv_python),
+                "examples_dir": str(examples_dir),
+                "status": detect_comfyui(),
+            },
+        )
+    except Exception:
+        pass
 
     yield {
         "event": "complete",
diff --git a/nvh/integrations/compatibility.py b/nvh/integrations/compatibility.py
new file mode 100644
index 0000000..a9ba6da
--- /dev/null
+++ b/nvh/integrations/compatibility.py
@@ -0,0 +1,518 @@
+"""Host and application compatibility checks for nvWizard.
+
+The compatibility layer separates problems nvHive can fix rootlessly from
+problems that require a different base image, GPU session, driver, or OS.
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import platform
+import shutil
+import socket
+import subprocess
+import sys
+from dataclasses import asdict, dataclass, field
+from pathlib import Path
+from typing import Any
+
+from nvh.integrations.runtime import runtime_status
+from nvh.integrations.storage import storage_status
+from nvh.integrations.studio_packs import (
+    BLENDER_VERSION,
+    _docker_status,
+    _node_runtime_status,
+    catalog_with_status,
+    model_catalog_with_status,
+)
+from nvh.utils.gpu import detect_gpu_status, gpu_architecture_info
+
+
+@dataclass(frozen=True)
+class HostFact:
+    """One detected host capability or dependency."""
+
+    id: str
+    label: str
+    value: str
+    status: str
+    severity: str = "info"
+    detail: str = ""
+
+    def as_dict(self) -> dict[str, Any]:
+        return asdict(self)
+
+
+@dataclass(frozen=True)
+class CompatibilityRequirement:
+    """One app requirement and how to satisfy it."""
+
+    id: str
+    label: str
+    status: str
+    detail: str
+    fix_action_id: str | None = None
+    rootless_fix_available: bool = False
+
+    def as_dict(self) -> dict[str, Any]:
+        return asdict(self)
+
+
+@dataclass(frozen=True)
+class AppCompatibility:
+    """Compatibility summary for one nvHive-managed app or pack."""
+
+    id: str
+    title: str
+    category: str
+    status: str
+    severity: str
+    summary: str
+    recommended_action_id: str | None = None
+    rootless_fix_available: bool = False
+    requirements: list[CompatibilityRequirement] = field(default_factory=list)
+    notes: list[str] = field(default_factory=list)
+
+    def as_dict(self) -> dict[str, Any]:
+        data = asdict(self)
+        data["requirements"] = [req.as_dict() for req in self.requirements]
+        return data
+
+
+def _parse_version(value: str | None) -> tuple[int, ...]:
+    if not value:
+        return ()
+    parts: list[int] = []
+    for chunk in str(value).split("."):
+        digits = "".join(ch for ch in chunk if ch.isdigit())
+        if digits == "":
+            break
+        parts.append(int(digits))
+    return tuple(parts)
+
+
+def _version_at_least(value: str | None, minimum: str) -> bool:
+    current = _parse_version(value)
+    target = _parse_version(minimum)
+    if not current:
+        return False
+    width = max(len(current), len(target))
+    return current + (0,) * (width - len(current)) >= target + (0,) * (width - len(target))
+
+
+def _which(command: str) -> str | None:
+    return shutil.which(command)
+
+
+def _command_version(command: str, *args: str, timeout: float = 4.0) -> str:
+    executable = _which(command)
+    if not executable:
+        return ""
+    try:
+        result = subprocess.run(
+            [executable, *args],
+            capture_output=True,
+            text=True,
+            timeout=timeout,
+        )
+    except Exception:
+        return ""
+    return ((result.stdout or result.stderr or "").strip().splitlines() or [""])[0]
+
+
+def _read_os_release() -> dict[str, str]:
+    path = Path("/etc/os-release")
+    if not path.exists():
+        return {}
+    data: dict[str, str] = {}
+    try:
+        for line in path.read_text(encoding="utf-8", errors="ignore").splitlines():
+            if "=" not in line:
+                continue
+            key, value = line.split("=", 1)
+            data[key] = value.strip().strip('"')
+    except Exception:
+        return {}
+    return data
+
+
+def _nvidia_smi_query() -> dict[str, str]:
+    status = detect_gpu_status()
+    gpus = status.get("gpus", [])
+    if not gpus:
+        return {
+            "detection_status": str(status.get("status") or "not-detected"),
+            "detection_source": str(status.get("source") or "none"),
+            "detection_issues": json.dumps(status.get("issues", [])),
+            "device_files_present": str(bool(status.get("device_files_present", False))).lower(),
+        }
+    gpu = gpus[0]
+    arch = gpu_architecture_info(gpu)
+    return {
+        "name": gpu.name,
+        "memory_total_mb": str(gpu.vram_mb),
+        "memory_free_mb": str(gpu.memory_free_mb),
+        "memory_used_mb": str(gpu.memory_used_mb),
+        "driver_version": gpu.driver_version,
+        "cuda_version": gpu.cuda_version,
+        "compute_capability": ".".join(str(part) for part in arch["compute_capability"]),
+        "compute_capability_source": str(arch["compute_capability_source"]),
+        "architecture": str(arch["architecture"]),
+        "architecture_heuristic": str(bool(arch["heuristic"])).lower(),
+        "detection_status": str(status.get("status") or "ready"),
+        "detection_source": str(status.get("source") or ""),
+        "detection_issues": json.dumps(status.get("issues", [])),
+        "device_files_present": str(bool(status.get("device_files_present", False))).lower(),
+    }
+
+
+def _nvidia_cuda_version() -> str:
+    if not _which("nvidia-smi"):
+        return ""
+    try:
+        result = subprocess.run(
+            ["nvidia-smi"],
+            capture_output=True,
+            text=True,
+            timeout=8,
+        )
+    except Exception:
+        return ""
+    text = result.stdout or ""
+    marker = "CUDA Version:"
+    if marker not in text:
+        return ""
+    return text.split(marker, 1)[1].split("|", 1)[0].strip()
+
+
+def _port_open(port: int, host: str = "127.0.0.1") -> bool:
+    try:
+        with socket.create_connection((host, port), timeout=0.4):
+            return True
+    except OSError:
+        return False
+
+
+def _host_facts() -> dict[str, Any]:
+    os_release = _read_os_release()
+    libc_name, libc_version = platform.libc_ver()
+    nvidia = _nvidia_smi_query()
+    runtime = runtime_status()
+    storage = storage_status(min_free_gb=20)
+    py_version = f"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}"
+    return {
+        "platform": sys.platform,
+        "system": platform.system(),
+        "machine": platform.machine(),
+        "kernel": platform.release(),
+        "distro": os_release.get("PRETTY_NAME") or os_release.get("NAME") or platform.platform(),
+        "libc": {"name": libc_name, "version": libc_version},
+        "python": {
+            "executable": sys.executable,
+            "version": py_version,
+            "venv_available": runtime.venv_available,
+            "pip_available": runtime.pip_available,
+            "strategy": runtime.strategy,
+        },
+        "commands": {
+            "git": _which("git"),
+            "curl": _which("curl"),
+            "tar": _which("tar"),
+            "node": _which("node"),
+            "npm": _which("npm"),
+            "docker": _which("docker"),
+            "nvidia-smi": _which("nvidia-smi"),
+        },
+        "command_versions": {
+            "git": _command_version("git", "--version"),
+            "node": _command_version("node", "--version"),
+            "npm": _command_version("npm", "--version"),
+            "docker": _command_version("docker", "--version"),
+        },
+        "gpu": nvidia,
+        "display": {
+            "DISPLAY": os.environ.get("DISPLAY", ""),
+            "WAYLAND_DISPLAY": os.environ.get("WAYLAND_DISPLAY", ""),
+        },
+        "ports": {
+            "ollama_11434": _port_open(11434),
+            "comfyui_8188": _port_open(8188),
+        },
+        "storage": storage.as_dict(),
+    }
+
+
+def _fact_list(host: dict[str, Any]) -> list[HostFact]:
+    commands = host["commands"]
+    gpu = host["gpu"]
+    gpu_ready = bool(gpu.get("name"))
+    display_ready = bool(host["display"].get("DISPLAY") or host["display"].get("WAYLAND_DISPLAY"))
+    return [
+        HostFact("os", "Base OS", f"{host['distro']} / {host['kernel']}", "detected"),
+        HostFact("arch", "Architecture", str(host["machine"]), "ok"),
+        HostFact("python", "Python", host["python"]["version"], "ok" if _version_at_least(host["python"]["version"], "3.11") else "blocked", "required"),
+        HostFact("pip", "pip", "available" if host["python"]["pip_available"] else "missing", "ok" if host["python"]["pip_available"] else "fixable", "recommended"),
+        HostFact("venv", "Python venv", "available" if host["python"]["venv_available"] else "missing", "ok" if host["python"]["venv_available"] else "fixable", "recommended"),
+        HostFact("git", "Git", commands.get("git") or "missing", "ok" if commands.get("git") else "blocked", "required"),
+        HostFact("curl", "curl", commands.get("curl") or "missing", "ok" if commands.get("curl") else "blocked", "required"),
+        HostFact("node", "Node.js", host["command_versions"].get("node") or "rootless install available", "ok" if _node_runtime_status().get("ready") else "fixable", "recommended"),
+        HostFact("docker", "Docker runtime", host["command_versions"].get("docker") or "missing", "ok" if _docker_status().get("ready") else "degraded", "optional", "Required only for NemoClaw/OpenShell sandboxes."),
+        HostFact("nvidia-smi", "NVIDIA driver", gpu.get("driver_version") or gpu.get("detection_status", "not detected"), "ok" if gpu_ready else "degraded", "recommended"),
+        HostFact("cuda", "CUDA driver API", gpu.get("cuda_version", "unknown"), "ok" if gpu.get("cuda_version") else "degraded", "recommended"),
+        HostFact("display", "Linux desktop display", "available" if display_ready else "not detected", "ok" if display_ready else "degraded", "optional"),
+        HostFact("storage", "Persistent NVH_HOME", host["storage"]["layout"]["home"], "ok" if host["storage"]["ok"] and host["storage"]["configured_by"] != "default" else "fixable", "required"),
+    ]
+
+
+def _req(
+    req_id: str,
+    label: str,
+    ok: bool,
+    detail: str,
+    *,
+    fix_action_id: str | None = None,
+    rootless_fix_available: bool = False,
+    blocked: bool = False,
+) -> CompatibilityRequirement:
+    if ok:
+        status = "ok"
+    elif blocked:
+        status = "blocked"
+    elif rootless_fix_available:
+        status = "fixable"
+    else:
+        status = "warning"
+    return CompatibilityRequirement(
+        id=req_id,
+        label=label,
+        status=status,
+        detail=detail,
+        fix_action_id=fix_action_id,
+        rootless_fix_available=rootless_fix_available,
+    )
+
+
+def _overall(
+    app_id: str,
+    title: str,
+    category: str,
+    requirements: list[CompatibilityRequirement],
+    *,
+    recommended_action_id: str | None = None,
+    notes: list[str] | None = None,
+) -> AppCompatibility:
+    if any(req.status == "blocked" for req in requirements):
+        status = "blocked"
+        severity = "required"
+        summary = "Needs a different base image, driver, OS package, or admin-provided dependency."
+    elif any(req.status == "fixable" for req in requirements):
+        status = "fixable"
+        severity = "recommended"
+        summary = "nvHive can repair or install the missing pieces without root."
+    elif any(req.status == "warning" for req in requirements):
+        status = "degraded"
+        severity = "optional"
+        summary = "Can run, but some capabilities may be slower or limited."
+    else:
+        status = "ready"
+        severity = "info"
+        summary = "Ready on this host."
+    return AppCompatibility(
+        id=app_id,
+        title=title,
+        category=category,
+        status=status,
+        severity=severity,
+        summary=summary,
+        recommended_action_id=recommended_action_id if status != "ready" else None,
+        rootless_fix_available=any(req.rootless_fix_available for req in requirements),
+        requirements=requirements,
+        notes=notes or [],
+    )
+
+
+def recommended_torch_profile(cuda_version: str | None) -> str:
+    """Pick the safest ComfyUI torch profile from the driver-reported CUDA API."""
+    if _version_at_least(cuda_version, "13.0"):
+        return "nvidia-cu130"
+    if _version_at_least(cuda_version, "12.1"):
+        return "nvidia-cu121"
+    return "cpu"
+
+
+def compatibility_report(home_dir: str | Path | None = None) -> dict[str, Any]:
+    """Return host facts and app compatibility recommendations."""
+    host = _host_facts()
+    if home_dir:
+        host["storage"] = storage_status(home_dir=home_dir, min_free_gb=20).as_dict()
+    gpu = host["gpu"]
+    gpu_ready = bool(gpu.get("name"))
+    commands = host["commands"]
+    py = host["python"]
+    storage = host["storage"]
+    is_linux = host["system"] == "Linux"
+    arch = str(host["machine"]).lower()
+    display_ready = bool(host["display"].get("DISPLAY") or host["display"].get("WAYLAND_DISPLAY"))
+    cuda_profile = recommended_torch_profile(gpu.get("cuda_version"))
+    model_status = model_catalog_with_status()
+    pack_status = catalog_with_status()
+    pack_by_id = {pack.get("id"): pack for pack in pack_status.get("packs", [])}
+    node_status = _node_runtime_status()
+    docker_status = _docker_status()
+    recommended_models = model_status.get("recommended_ids", [])
+    missing_recommended_models = [
+        model["id"] for model in model_status.get("models", [])
+        if model.get("recommended") and not model.get("installed")
+    ]
+
+    def _pack_installed(pack_id: str) -> bool:
+        status = pack_by_id.get(pack_id, {}).get("status", {})
+        return bool(status.get("installed"))
+
+    apps = [
+        _overall(
+            "persistent-storage",
+            "Persistent NVH_HOME",
+            "foundation",
+            [
+                _req(
+                    "storage",
+                    "Mounted storage",
+                    bool(storage["ok"] and storage["configured_by"] != "default"),
+                    "Use a mounted volume that survives cloud desktop reconnects.",
+                    fix_action_id="storage",
+                    rootless_fix_available=True,
+                )
+            ],
+            recommended_action_id="storage",
+        ),
+        _overall(
+            "rootless-ollama",
+            "Rootless Ollama Runtime",
+            "runtime",
+            [
+                _req("linux", "Linux host", is_linux, "The rootless Ollama pack targets Linux cloud desktops.", blocked=not is_linux),
+                _req("arch", "CPU architecture", arch in {"x86_64", "amd64", "aarch64", "arm64"}, f"Detected {host['machine']}.", blocked=arch not in {"x86_64", "amd64", "aarch64", "arm64"}),
+                _req("curl", "curl", bool(commands.get("curl")), "Needed to download the Ollama bundle.", blocked=not commands.get("curl")),
+                _req("tar", "tar", bool(commands.get("tar")), "Needed to extract the Ollama bundle.", blocked=not commands.get("tar")),
+                _req("port", "Port 11434", not host["ports"]["ollama_11434"] or model_status.get("ollama_running"), "Ollama uses localhost:11434."),
+            ],
+            recommended_action_id="rootless-ollama",
+        ),
+        _overall(
+            "local-models",
+            "Recommended Local Models",
+            "model",
+            [
+                _req("ollama", "Ollama runtime", bool(model_status.get("ollama_available")), "Required for local LLM downloads.", fix_action_id="rootless-ollama", rootless_fix_available=True),
+                _req("gpu", "NVIDIA GPU", gpu_ready, "GPU acceleration is strongly recommended; CPU fallback is slower."),
+                _req("models", "Recommended models", not missing_recommended_models, f"{len(missing_recommended_models)} recommended model(s) missing.", fix_action_id="starter-models", rootless_fix_available=True),
+            ],
+            recommended_action_id="starter-models",
+            notes=[f"Recommended model ids: {', '.join(recommended_models)}"] if recommended_models else [],
+        ),
+        _overall(
+            "comfyui",
+            "ComfyUI Visual Workspace",
+            "creative",
+            [
+                _req("git", "Git", bool(commands.get("git")), "Required to clone/update ComfyUI.", blocked=not commands.get("git")),
+                _req("python", "Python 3.11+", _version_at_least(py["version"], "3.11"), f"Detected Python {py['version']}.", blocked=not _version_at_least(py["version"], "3.11")),
+                _req("venv", "Python venv/pip", bool(py["venv_available"] and py["pip_available"]), f"Runtime strategy: {py['strategy']}.", fix_action_id="runtime-fallback", rootless_fix_available=True),
+                _req("torch", "PyTorch CUDA profile", cuda_profile != "cpu", f"Recommended profile: {cuda_profile}. CPU fallback is available."),
+                _req("storage", "Persistent storage", bool(storage["ok"]), "ComfyUI and model caches are large.", fix_action_id="storage", rootless_fix_available=True),
+            ],
+            recommended_action_id="comfyui",
+            notes=[f"Recommended torch profile for this host: {cuda_profile}."],
+        ),
+        _overall(
+            "blender-creative",
+            "Blender Creative Studio",
+            "creative",
+            [
+                _req("linux-x64", "Linux x64 desktop", is_linux and arch in {"x86_64", "amd64"}, "The bundled Blender pack currently targets Linux x64.", blocked=not (is_linux and arch in {"x86_64", "amd64"})),
+                _req("display", "Desktop display", display_ready, "Blender needs X11 or Wayland for interactive launch."),
+                _req("glibc", "glibc", not is_linux or _version_at_least(host["libc"]["version"], "2.31"), f"Detected {host['libc']['name']} {host['libc']['version'] or 'unknown'}.", blocked=is_linux and bool(host["libc"]["version"]) and not _version_at_least(host["libc"]["version"], "2.31")),
+                _req("storage", "Persistent app storage", bool(storage["ok"]), "Blender installs under NVH_HOME/apps/blender.", fix_action_id="storage", rootless_fix_available=True),
+            ],
+            recommended_action_id="creative-tools",
+            notes=[f"Bundled Blender version: {BLENDER_VERSION}."],
+        ),
+        _overall(
+            "agent-lab",
+            "Local Agent Lab",
+            "agent",
+            [
+                _req("pack", "Agent lab pack", _pack_installed("agent-lab"), "Installs the local agent helper environment under NVH_HOME.", fix_action_id="agent-lab", rootless_fix_available=True),
+                _req("python", "Python 3.11+", _version_at_least(py["version"], "3.11"), f"Detected Python {py['version']}.", blocked=not _version_at_least(py["version"], "3.11")),
+                _req("venv", "Python venv/pip", bool(py["venv_available"] and py["pip_available"]), f"Runtime strategy: {py['strategy']}.", fix_action_id="runtime-fallback", rootless_fix_available=True),
+                _req("storage", "Persistent workspace", bool(storage["ok"]), "Agent packages install under NVH_HOME/studio.", fix_action_id="storage", rootless_fix_available=True),
+            ],
+            recommended_action_id="agent-lab",
+        ),
+        _overall(
+            "claw-agents",
+            "OpenClaw and NVIDIA NemoClaw",
+            "agent",
+            [
+                _req("linux", "Linux desktop session", is_linux, "OpenClaw/NemoClaw packs are optimized for Linux cloud desktops.", blocked=not is_linux),
+                _req("node", "Node.js 22.16+ and npm 10+", bool(node_status.get("ready")), f"Node={node_status.get('node_version') or 'missing'}, npm={node_status.get('npm_version') or 'missing'}.", fix_action_id="claw-agents", rootless_fix_available=bool(node_status.get("can_auto_install"))),
+                _req("openclaw", "OpenClaw pack", _pack_installed("openclaw-agent"), "Simple self-hosted agent platform install.", fix_action_id="claw-agents", rootless_fix_available=True),
+                _req("docker", "Docker for NemoClaw", bool(docker_status.get("ready")), docker_status.get("detail", "Docker must work without sudo.")),
+                _req("storage", "Persistent workspace", bool(storage["ok"]), "Claw workspaces install under NVH_HOME/studio.", fix_action_id="storage", rootless_fix_available=True),
+            ],
+            recommended_action_id="claw-agents",
+            notes=[
+                "OpenClaw is the default rootless agent path.",
+                "NemoClaw remains optional until Docker/OpenShell is available without sudo.",
+            ],
+        ),
+        _overall(
+            "game-dev-lab",
+            "Game Dev Lab",
+            "game",
+            [
+                _req("python", "Python 3.11+", _version_at_least(py["version"], "3.11"), f"Detected Python {py['version']}.", blocked=not _version_at_least(py["version"], "3.11")),
+                _req("display", "Desktop display", display_ready, "Interactive samples need a display; headless asset generation can still work."),
+                _req("storage", "Persistent workspace", bool(storage["ok"]), "Game projects install under NVH_HOME/studio.", fix_action_id="storage", rootless_fix_available=True),
+            ],
+            recommended_action_id="creative-tools",
+        ),
+        _overall(
+            "music-producer-lab",
+            "Music Producer Studio",
+            "music",
+            [
+                _req("linux", "Linux desktop session", is_linux, "ACE-Step and AppImage helpers are optimized for Linux cloud desktops.", blocked=not is_linux),
+                _req("python", "Python 3.11+", _version_at_least(py["version"], "3.11"), f"Detected Python {py['version']}.", blocked=not _version_at_least(py["version"], "3.11")),
+                _req("venv", "Python venv/pip", bool(py["venv_available"] and py["pip_available"]), f"Runtime strategy: {py['strategy']}.", fix_action_id="runtime-fallback", rootless_fix_available=True),
+                _req("git", "Git for ACE-Step", shutil.which("git") is not None, "ACE-Step is cloned from the official GitHub repository.", blocked=shutil.which("git") is None),
+                _req("storage", "Persistent music workspace", bool(storage["ok"]), "Music tools and model caches install under NVH_HOME/studio.", fix_action_id="storage", rootless_fix_available=True),
+            ],
+            recommended_action_id="music-tools",
+            notes=[
+                "ACE-Step handles AI music generation; the audio lab handles stems, transcription, and cleanup.",
+                "NVIDIA GPU acceleration depends on the driver/CUDA exposed by the cloud image.",
+            ],
+        ),
+    ]
+
+    issue_count = sum(1 for app in apps if app.status != "ready")
+    blocked_count = sum(1 for app in apps if app.status == "blocked")
+    fixable_count = sum(1 for app in apps if app.rootless_fix_available and app.status != "ready")
+    return {
+        "summary": (
+            "Host is ready"
+            if issue_count == 0
+            else f"{issue_count} app/profile compatibility item(s) need attention"
+        ),
+        "ready": issue_count == 0,
+        "issue_count": issue_count,
+        "blocked_count": blocked_count,
+        "rootless_fixable_count": fixable_count,
+        "recommended_torch_profile": cuda_profile,
+        "host": host,
+        "facts": [fact.as_dict() for fact in _fact_list(host)],
+        "apps": [app.as_dict() for app in apps],
+    }
diff --git a/nvh/integrations/diagnostics.py b/nvh/integrations/diagnostics.py
new file mode 100644
index 0000000..5eaf8e1
--- /dev/null
+++ b/nvh/integrations/diagnostics.py
@@ -0,0 +1,212 @@
+"""Redacted diagnostics reports for no-root setup support."""
+
+from __future__ import annotations
+
+import json
+import os
+import platform
+import re
+import sys
+from collections.abc import Callable
+from datetime import UTC, datetime
+from pathlib import Path
+from typing import Any
+
+from nvh.integrations.storage import storage_layout, storage_status
+from nvh.utils.logging import get_request_id
+
+ENV_ALLOWLIST = [
+    "NVH_HOME",
+    "NVH_LOGS",
+    "NVH_RUNTIME_HOME",
+    "NVH_APPS_HOME",
+    "NVH_WEB_HOME",
+    "NVH_STUDIO_HOME",
+    "COMFYUI_HOME",
+    "OLLAMA_MODELS",
+    "HIVE_CONFIG_HOME",
+    "HIVE_LOG_LEVEL",
+    "HIVE_LOG_FORMAT",
+    "NVH_BOOT_PREFLIGHT",
+    "NVH_USE_BINARY",
+]
+
+SECRET_KEY_RE = re.compile(r"(api[_-]?key|token|secret|password|authorization|bearer)", re.I)
+SECRET_VALUE_RE = re.compile(
+    r"(sk-[A-Za-z0-9_\-]{8,}|Bearer\s+[A-Za-z0-9._\-]{8,}|gh[pousr]_[A-Za-z0-9_]{8,})",
+    re.I,
+)
+
+
+def _now() -> str:
+    return datetime.now(UTC).isoformat()
+
+
+def _redact_text(value: str) -> str:
+    return SECRET_VALUE_RE.sub("[redacted]", value)
+
+
+def _redact(key: str, value: Any) -> Any:
+    if value is None:
+        return None
+    if SECRET_KEY_RE.search(key):
+        return "[redacted]"
+    if isinstance(value, str):
+        return _redact_text(value)
+    if isinstance(value, list):
+        return [_redact(key, item) for item in value]
+    if isinstance(value, dict):
+        return {str(k): _redact(str(k), v) for k, v in value.items()}
+    return value
+
+
+def _safe_call(label: str, fn: Callable[[], Any]) -> dict[str, Any]:
+    try:
+        return {"ok": True, "data": _redact(label, fn())}
+    except Exception as exc:
+        return {
+            "ok": False,
+            "error": {
+                "type": type(exc).__name__,
+                "message": _redact_text(str(exc)),
+            },
+        }
+
+
+def _candidate_log_files(logs_dir: Path) -> list[Path]:
+    paths: list[Path] = []
+    explicit = os.environ.get("HIVE_LOG_FILE")
+    if explicit:
+        paths.append(Path(explicit).expanduser())
+    try:
+        paths.extend(sorted(logs_dir.glob("*.log"), key=lambda p: p.stat().st_mtime, reverse=True))
+    except Exception:
+        pass
+
+    seen: set[str] = set()
+    unique: list[Path] = []
+    for path in paths:
+        key = str(path)
+        if key in seen:
+            continue
+        seen.add(key)
+        unique.append(path)
+        if len(unique) >= 5:
+            break
+    return unique
+
+
+def _tail_log_file(path: Path, *, max_lines: int) -> list[str]:
+    try:
+        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
+    except Exception:
+        return []
+    interesting = [
+        _redact_text(line)
+        for line in lines[-500:]
+        if any(marker in line.lower() for marker in ("error", "warning", "failed", "traceback"))
+    ]
+    return interesting[-max_lines:]
+
+
+def _recent_log_entries(logs_dir: Path, *, max_lines: int) -> list[dict[str, Any]]:
+    entries: list[dict[str, Any]] = []
+    remaining = max(0, max_lines)
+    for path in _candidate_log_files(logs_dir):
+        if remaining <= 0:
+            break
+        lines = _tail_log_file(path, max_lines=remaining)
+        if not lines:
+            continue
+        entries.append(
+            {
+                "path": str(path),
+                "lines": lines,
+            }
+        )
+        remaining -= len(lines)
+    return entries
+
+
+def diagnostics_report(
+    home_dir: str | Path | None = None,
+    *,
+    request_id: str | None = None,
+    include_logs: bool = True,
+    log_lines: int = 80,
+) -> dict[str, Any]:
+    """Build a redacted support report that is safe to copy from the wizard."""
+    checked_at = _now()
+    layout = storage_layout(home_dir)
+    request_id = request_id or get_request_id()
+    report_id = f"diag-{checked_at.replace(':', '').replace('.', '-')[:19]}"
+    if request_id:
+        report_id = f"{report_id}-{request_id[-8:]}"
+
+    storage = _safe_call("storage", lambda: storage_status(home_dir=home_dir).as_dict())
+
+    def _readiness() -> dict[str, Any]:
+        from nvh.integrations.production_readiness import production_readiness_report
+
+        return production_readiness_report(home_dir=home_dir)
+
+    def _jobs() -> dict[str, Any]:
+        from nvh.integrations.jobs import list_jobs
+
+        jobs = list_jobs(limit=8)
+        failed = [job for job in jobs if job.get("status") in {"failed", "interrupted"}]
+        return {
+            "count": len(jobs),
+            "failed_or_interrupted": len(failed),
+            "jobs": jobs,
+        }
+
+    def _receipts() -> dict[str, Any]:
+        from nvh.integrations.receipts import receipt_summary
+
+        return receipt_summary()
+
+    diagnostics = {
+        "report_id": report_id,
+        "checked_at": checked_at,
+        "request_id": request_id,
+        "summary": "Redacted rootless setup diagnostics for nvHive support.",
+        "environment": {
+            "platform": platform.platform(),
+            "system": platform.system(),
+            "release": platform.release(),
+            "machine": platform.machine(),
+            "python": platform.python_version(),
+            "executable": sys.executable,
+            "cwd": str(Path.cwd()),
+            "env": {
+                key: _redact(key, os.environ.get(key))
+                for key in ENV_ALLOWLIST
+                if os.environ.get(key) is not None
+            },
+        },
+        "paths": {
+            "home": str(layout.home),
+            "logs": str(layout.logs_dir),
+            "jobs": str(layout.home / "jobs"),
+            "config": str(layout.config_dir),
+            "models": str(layout.models_dir),
+            "apps": str(layout.apps_dir),
+        },
+        "checks": {
+            "storage": storage,
+            "production_readiness": _safe_call("production_readiness", _readiness),
+            "jobs": _safe_call("jobs", _jobs),
+            "receipts": _safe_call("receipts", _receipts),
+        },
+        "logs": {
+            "included": include_logs,
+            "files": [str(path) for path in _candidate_log_files(layout.logs_dir)],
+            "recent": _recent_log_entries(layout.logs_dir, max_lines=max(0, min(log_lines, 200)))
+            if include_logs
+            else [],
+        },
+    }
+
+    # Last-pass redaction catches nested messages from third-party tools.
+ return json.loads(json.dumps(_redact("diagnostics", diagnostics), default=str)) diff --git a/nvh/integrations/jobs.py b/nvh/integrations/jobs.py index 8150315..afd451f 100644 --- a/nvh/integrations/jobs.py +++ b/nvh/integrations/jobs.py @@ -4,6 +4,7 @@ import asyncio import json +import logging import uuid from collections.abc import AsyncIterator, Callable from datetime import UTC, datetime @@ -15,6 +16,7 @@ TERMINAL_STATUSES = {"complete", "failed", "canceled", "interrupted"} RUNNING_STATUSES = {"queued", "running"} _TASKS: dict[str, asyncio.Task[None]] = {} +logger = logging.getLogger(__name__) def _now() -> str: @@ -81,6 +83,10 @@ def create_job( "events_path": str(_events_path(job_id)), } _events_path(job_id).write_text("", encoding="utf-8") + logger.info( + "Setup job queued", + extra={"job_id": job_id, "kind": kind}, + ) return _write_job(job) @@ -189,12 +195,25 @@ def append_event(job_id: str, payload: dict[str, Any]) -> dict[str, Any]: job["status"] = "complete" job["progress"] = 100 job["completed_at"] = now + logger.info( + "Setup job complete", + extra={"job_id": job_id, "kind": job.get("kind", "")}, + ) elif payload.get("event") == "error" or payload.get("status") == "failed": job["status"] = "failed" job["completed_at"] = now + logger.error( + "Setup job failed: %s", + event["message"] or "unknown error", + extra={"job_id": job_id, "kind": job.get("kind", "")}, + ) elif payload.get("event") == "canceled" or payload.get("status") == "canceled": job["status"] = "canceled" job["completed_at"] = now + logger.info( + "Setup job canceled", + extra={"job_id": job_id, "kind": job.get("kind", "")}, + ) else: job["status"] = "running" _write_job(job) @@ -214,6 +233,10 @@ def reconcile_job(job: dict[str, Any]) -> dict[str, Any]: ) job["updated_at"] = _now() job["completed_at"] = job["updated_at"] + logger.warning( + "Setup job marked interrupted after server restart", + extra={"job_id": job.get("id", ""), "kind": job.get("kind", "")}, + ) return 
_write_job(job) @@ -268,6 +291,10 @@ async def _consume_job_events( ) raise except Exception as exc: + logger.exception( + "Setup job crashed", + extra={"job_id": job_id}, + ) append_event( job_id, { diff --git a/nvh/integrations/mission_control.py b/nvh/integrations/mission_control.py new file mode 100644 index 0000000..09d15f6 --- /dev/null +++ b/nvh/integrations/mission_control.py @@ -0,0 +1,99 @@ +"""Mission timeline for the nvWizard setup journey.""" + +from __future__ import annotations + +from typing import Any + +from nvh.integrations.auto_repair import auto_repair_plan +from nvh.integrations.boot_preflight import boot_preflight_status +from nvh.integrations.model_fit import model_fit_report +from nvh.integrations.mount_autopilot import mount_autopilot_report +from nvh.integrations.receipts import receipt_summary +from nvh.integrations.smoke_tests import smoke_test_report + + +def _stage(stage_id: str, title: str, status: str, summary: str, action_id: str | None = None) -> dict[str, Any]: + return { + "id": stage_id, + "title": title, + "status": status, + "summary": summary, + "action_id": action_id, + } + + +def mission_control_report(home_dir: str | None = None) -> dict[str, Any]: + """Return one student-friendly setup timeline.""" + boot = boot_preflight_status(home_dir=home_dir, run_if_missing=True) + mount = mount_autopilot_report() + repairs = auto_repair_plan(home_dir=home_dir) + smoke = smoke_test_report(home_dir=home_dir) + models = model_fit_report(home_dir=home_dir) + receipts = receipt_summary() + compatibility = boot.get("compatibility") or {} + agent = boot.get("agent_helper") or {} + + stages = [ + _stage( + "boot", + "Boot Watch", + "warn" if boot.get("changed") else "pass", + boot.get("summary", "Boot preflight ready."), + ), + _stage( + "storage", + "Persistent Mount", + "pass" if mount.get("confidence") in {"high", "medium"} else "warn", + mount.get("summary", "Choose a persistent mount."), + "storage", + ), + _stage( + "repair", + 
"Auto-Repair Queue", + "warn" if repairs.get("auto_count") or repairs.get("needs_user_count") else "pass", + repairs.get("summary", "No repairs queued."), + "repair-workspace", + ), + _stage( + "agent", + "Local Agent Helper", + "pass" if agent.get("local_agent_ready") else "warn", + agent.get("summary", "Offline helper ready."), + agent.get("recommended_action_id"), + ), + _stage( + "models", + "Model Fit Advisor", + "pass" if models.get("recommended_ids") else "warn", + models.get("summary", "Model fit advisor ready."), + "starter-models", + ), + _stage( + "apps", + "App Smoke Tests", + "pass" if smoke.get("warnings", 0) == 0 and smoke.get("failed", 0) == 0 else "warn", + smoke.get("summary", "Smoke tests ready."), + "smoke-tests", + ), + _stage( + "receipts", + "Install Receipts", + "pass" if receipts.get("unhealthy", 0) == 0 else "warn", + f"{receipts.get('count', 0)} receipt(s), {receipts.get('unhealthy', 0)} need repair.", + "repair-receipts", + ), + ] + blocked = int(compatibility.get("blocked_count", 0) or 0) + return { + "summary": ( + "Ready to build" if blocked == 0 and all(stage["status"] == "pass" for stage in stages[:2]) + else f"{blocked} blocked compatibility item(s); nvWizard has next steps." 
+ ), + "ready": blocked == 0, + "stages": stages, + "boot_preflight": boot, + "mount_autopilot": mount, + "auto_repair": repairs, + "smoke_tests": smoke, + "model_fit": models, + } diff --git a/nvh/integrations/model_fit.py b/nvh/integrations/model_fit.py new file mode 100644 index 0000000..d93ef9f --- /dev/null +++ b/nvh/integrations/model_fit.py @@ -0,0 +1,119 @@ +"""Student-friendly local model fit recommendations.""" + +from __future__ import annotations + +from typing import Any + +from nvh.integrations.storage import storage_status +from nvh.integrations.studio_packs import model_catalog_with_status + +USE_CASE_LABELS = { + "starter": "Best first install", + "chat": "Chat and homework help", + "coding": "Coding helper", + "reasoning": "Math and reasoning", + "embedding": "Search and notes", + "vision": "Vision/image understanding", +} + + +def _score_model(model: dict[str, Any], *, free_gb: float | None, vram_gb: int) -> tuple[int, list[str]]: + score = int(100 - int(model.get("priority", 100))) + reasons: list[str] = [] + if model.get("installed"): + score += 40 + reasons.append("already installed") + if model.get("recommended"): + score += 35 + reasons.append("recommended default") + if model.get("fits_vram"): + score += 20 + reasons.append("fits detected VRAM") + elif vram_gb: + score -= 35 + reasons.append("may exceed detected VRAM") + estimated_disk = float(model.get("estimated_disk_gb", 0) or 0) + if free_gb is not None and estimated_disk > free_gb: + score -= 50 + reasons.append("not enough free storage") + elif free_gb is not None and estimated_disk * 2 < free_gb: + score += 10 + reasons.append("comfortable disk fit") + capabilities = {str(cap).lower() for cap in model.get("capabilities", [])} + if "coding" in capabilities: + score += 8 + if "fast" in capabilities: + score += 5 + return max(0, score), reasons + + +def model_fit_report(home_dir: str | None = None) -> dict[str, Any]: + """Return a simplified model queue by student use case.""" + catalog 
= model_catalog_with_status() + storage = storage_status(home_dir=home_dir).as_dict() + free_gb = storage.get("free_gb") + vram_gb = int(catalog.get("detected_vram_gb") or 0) + ranked: list[dict[str, Any]] = [] + + for model in catalog.get("models", []): + score, reasons = _score_model(model, free_gb=free_gb, vram_gb=vram_gb) + capabilities = [str(cap).lower() for cap in model.get("capabilities", [])] + if model.get("recommended"): + use_case = "starter" + elif "coding" in capabilities: + use_case = "coding" + elif "reasoning" in capabilities: + use_case = "reasoning" + elif "embedding" in capabilities: + use_case = "embedding" + elif "vision" in capabilities or "vision-capable family" in capabilities: + use_case = "vision" + else: + use_case = str(model.get("category") or "chat") + ranked.append({ + **model, + "fit_score": score, + "fit_reasons": reasons, + "use_case": use_case, + "use_case_label": USE_CASE_LABELS.get(use_case, use_case.title()), + }) + + ranked.sort(key=lambda item: item["fit_score"], reverse=True) + best_by_use_case: dict[str, dict[str, Any]] = {} + for model in ranked: + best_by_use_case.setdefault(model["use_case"], model) + + recommended_queue = [ + model for model in ranked + if model.get("recommended") and not model.get("installed") + ] + if not recommended_queue: + recommended_queue = [ + model for model in ranked + if model.get("fits_vram") and not model.get("installed") + ][:3] + queue_disk_gb = round( + sum(float(model.get("estimated_disk_gb", 0) or 0) for model in recommended_queue), + 1, + ) + storage_fits_queue = free_gb is None or queue_disk_gb <= float(free_gb) + if storage_fits_queue: + summary = f"{len(recommended_queue)} model(s) queued for the detected {vram_gb or 'unknown'} GB VRAM profile." + else: + summary = ( + f"{len(recommended_queue)} model(s) fit the GPU profile, but the queue needs " + f"about {queue_disk_gb} GB and storage reports {free_gb} GB free." 
+ ) + + return { + "summary": summary, + "detected_vram_gb": vram_gb, + "free_gb": free_gb, + "recommended_queue_disk_gb": queue_disk_gb, + "storage_fits_queue": storage_fits_queue, + "recommended_ids": [model["id"] for model in recommended_queue], + "best_by_use_case": best_by_use_case, + "models": ranked, + "ollama_available": catalog.get("ollama_available", False), + "ollama_running": catalog.get("ollama_running", False), + } diff --git a/nvh/integrations/mount_autopilot.py b/nvh/integrations/mount_autopilot.py new file mode 100644 index 0000000..8c7326c --- /dev/null +++ b/nvh/integrations/mount_autopilot.py @@ -0,0 +1,491 @@ +"""Persistent mount discovery for rootless cloud desktop sessions.""" + +from __future__ import annotations + +import os +import shutil +from dataclasses import asdict, dataclass, field +from pathlib import Path +from typing import Any + +from nvh.integrations.storage import ensure_storage, storage_status + +DEFAULT_MIN_FREE_GB = 20.0 + +NETWORK_FS_TYPES = { + "cifs", + "smb3", + "nfs", + "nfs4", + "sshfs", + "fuse.sshfs", + "9p", + "davfs", + "ceph", + "glusterfs", +} +EPHEMERAL_FS_TYPES = { + "autofs", + "cgroup", + "cgroup2", + "configfs", + "debugfs", + "devpts", + "devtmpfs", + "fuse.portal", + "mqueue", + "overlay", + "proc", + "ramfs", + "securityfs", + "squashfs", + "sysfs", + "tmpfs", + "tracefs", +} +LOCAL_BLOCK_FS_TYPES = { + "bcachefs", + "btrfs", + "ext2", + "ext3", + "ext4", + "f2fs", + "xfs", + "zfs", +} +PREFERRED_MOUNT_PREFIXES = ( + "/mnt", + "/media", + "/workspace", + "/data", + "/persistent", + "/storage", +) +OS_MOUNT_PREFIXES = ( + "/boot", + "/etc", + "/nix", + "/opt", + "/root", + "/run", + "/snap", + "/tmp", + "/usr", + "/var", +) + + +@dataclass(frozen=True) +class MountInfo: + """Linux mount metadata for a path.""" + + mount_point: Path + fs_type: str + source: str + options: set[str] + + +@dataclass(frozen=True) +class MountCandidate: + """One possible persistent storage location.""" + + path: str + 
recommended_home: str + label: str + source: str + exists: bool + writable: bool + free_gb: float | None + total_gb: float | None + fs_type: str | None + device: str | None + mount_point: str | None + read_only: bool + network_mount: bool + os_mount: bool + large_block_mount: bool + score: int + warnings: list[str] = field(default_factory=list) + evidence: list[str] = field(default_factory=list) + + def as_dict(self) -> dict[str, Any]: + return asdict(self) + + +def _expand(path: str | Path) -> Path: + return Path(os.path.expandvars(str(path))).expanduser() + + +def _path_exists(path: Path) -> bool: + try: + return path.exists() + except Exception: + return False + + +def _path_is_dir(path: Path) -> bool: + try: + return path.is_dir() + except Exception: + return False + + +def _candidate_key(path: Path) -> str: + try: + return str(path.resolve() if _path_exists(path) else path) + except Exception: + return str(path) + + +def _decode_mount_path(value: str) -> str: + return ( + value.replace("\\040", " ") + .replace("\\011", "\t") + .replace("\\012", "\n") + .replace("\\134", "\\") + ) + + +def _parse_mountinfo_line(line: str) -> MountInfo | None: + parts = line.strip().split() + if "-" not in parts or len(parts) < 10: + return None + separator = parts.index("-") + if separator + 3 >= len(parts): + return None + mount_point = Path(_decode_mount_path(parts[4])) + options = set(parts[5].split(",")) + fs_type = parts[separator + 1].lower() + source = _decode_mount_path(parts[separator + 2]) + if separator + 3 < len(parts): + options.update(parts[separator + 3].split(",")) + return MountInfo(mount_point=mount_point, fs_type=fs_type, source=source, options=options) + + +def _parse_mounts_line(line: str) -> MountInfo | None: + parts = line.strip().split() + if len(parts) < 4: + return None + return MountInfo( + mount_point=Path(_decode_mount_path(parts[1])), + fs_type=parts[2].lower(), + source=_decode_mount_path(parts[0]), + options=set(parts[3].split(",")), + ) + + +def 
_mount_table() -> list[MountInfo]: + for mount_file, parser in ( + (Path("/proc/self/mountinfo"), _parse_mountinfo_line), + (Path("/proc/mounts"), _parse_mounts_line), + ): + try: + lines = mount_file.read_text(encoding="utf-8").splitlines() + except Exception: + continue + mounts = [mount for line in lines if (mount := parser(line))] + if mounts: + return mounts + return [] + + +def _is_relative_to(path: Path, parent: Path) -> bool: + try: + path.relative_to(parent) + return True + except ValueError: + return False + + +def _mount_info_for_path(path: Path) -> MountInfo | None: + probe = _nearest_existing(path) or path + mounts = _mount_table() + matches = [mount for mount in mounts if _is_relative_to(probe, mount.mount_point)] + if not matches: + return None + return max(matches, key=lambda mount: len(mount.mount_point.parts)) + + +def _is_network_mount(info: MountInfo | None) -> bool: + return bool(info and info.fs_type in NETWORK_FS_TYPES) + + +def _is_ephemeral_mount(info: MountInfo | None) -> bool: + return bool(info and info.fs_type in EPHEMERAL_FS_TYPES) + + +def _is_read_only_mount(info: MountInfo | None) -> bool: + return bool(info and "ro" in info.options) + + +def _is_os_mount(path: Path, info: MountInfo | None) -> bool: + if info is None: + return False + mount_point = info.mount_point.as_posix() + if mount_point == "/": + return True + return any(mount_point == prefix or mount_point.startswith(f"{prefix}/") for prefix in OS_MOUNT_PREFIXES) + + +def _is_large_block_mount(info: MountInfo | None, total_gb: float | None) -> bool: + if info is None or total_gb is None: + return False + if _is_network_mount(info) or _is_ephemeral_mount(info): + return False + local_signal = info.fs_type in LOCAL_BLOCK_FS_TYPES or info.source.startswith("/dev/") + return local_signal and total_gb >= 180 + + +def _nearest_existing(path: Path) -> Path | None: + current = path + while not _path_exists(current) and current.parent != current: + current = current.parent + return 
current if _path_exists(current) else None + + +def _disk_usage(path: Path) -> tuple[float | None, float | None]: + probe = _nearest_existing(path) + if probe is None: + return None, None + try: + usage = shutil.disk_usage(probe) + except Exception: + return None, None + gb = 1024 ** 3 + return round(usage.free / gb, 1), round(usage.total / gb, 1) + + +def _is_writable(path: Path) -> bool: + probe = _nearest_existing(path) + if probe is None: + return False + try: + return os.access(probe, os.W_OK) + except Exception: + return False + + +def _looks_ephemeral(path: Path) -> bool: + parts = {part.lower() for part in path.parts} + return bool(parts.intersection({"tmp", "temp", "run", "var", "cache", "pytest-tmp"})) + + +def _evidence(path: Path) -> list[str]: + evidence: list[str] = [] + for marker in ("nvh-env.sh", "receipts", "models", "comfyui", "studio", "apps"): + if _path_exists(path / marker): + evidence.append(marker) + return evidence + + +def _recommended_home(path: Path) -> Path: + if path.name.lower() in {"nvh", "nvhive", ".nvh"}: + return path + return path / "nvhive" + + +def _score_candidate( + path: Path, + *, + source: str, + min_free_gb: float, +) -> MountCandidate: + exists = _path_exists(path) + writable = _is_writable(path) + free_gb, total_gb = _disk_usage(path) + mount_info = _mount_info_for_path(path) + fs_type = mount_info.fs_type if mount_info else None + device = mount_info.source if mount_info else None + mount_point = str(mount_info.mount_point) if mount_info else None + read_only = _is_read_only_mount(mount_info) + network_mount = _is_network_mount(mount_info) + ephemeral_mount = _is_ephemeral_mount(mount_info) + os_mount = _is_os_mount(path, mount_info) + large_block_mount = _is_large_block_mount(mount_info, total_gb) + evidence = _evidence(path) + _evidence(_recommended_home(path)) + warnings: list[str] = [] + score = 0 + + if exists: + score += 10 + else: + warnings.append("Path does not exist yet; nvHive can create the final NVH_HOME 
inside it if parent is writable.") + if writable: + score += 30 + else: + score -= 70 + warnings.append("Path is not writable by this user.") + if free_gb is not None: + if free_gb >= min_free_gb: + score += 25 + else: + warnings.append(f"Only {free_gb} GB free; recommended minimum is {min_free_gb:.0f} GB.") + if total_gb is not None: + if total_gb >= 900: + score += 55 + elif total_gb >= 450: + score += 45 + elif total_gb >= 180: + score += 35 + elif total_gb >= 50: + score += 10 + if evidence: + score += 35 + if source.startswith("env:") or source == "current": + score += 15 + if large_block_mount: + score += 45 + evidence.append("large-writable-block-mount") + if mount_info: + evidence.append(f"mount:{mount_info.mount_point}") + evidence.append(f"fs:{mount_info.fs_type}") + if path.as_posix().startswith(PREFERRED_MOUNT_PREFIXES): + score += 15 + home_path = Path.home() + if _is_relative_to(path, home_path) and large_block_mount and not os_mount: + score += 20 + evidence.append("home-on-persistent-block-mount") + if read_only: + score -= 80 + warnings.append("Mount is read-only; nvHive needs a user-writable persistent block volume.") + if network_mount: + score -= 30 + warnings.append("Network/share filesystem detected; prefer writable persistent block storage for models.") + if ephemeral_mount: + score -= 60 + warnings.append("Filesystem looks ephemeral for a cloud desktop.") + if os_mount: + score -= 45 + warnings.append("Path appears to live on the OS/root disk; use the persistent block-backed home or data mount.") + if _looks_ephemeral(path): + score -= 40 + warnings.append("Path looks ephemeral for a cloud desktop.") + if total_gb is not None and total_gb < 180 and not source.startswith("env:"): + warnings.append("Disk is smaller than the expected 200 GB+ persistent model/workspace volume.") + label = path.name or str(path) + if fs_type and total_gb is not None: + label = f"{label} ({fs_type}, {total_gb:g} GB)" + elif fs_type: + label = f"{label} 
({fs_type})" + return MountCandidate( + path=str(path), + recommended_home=str(_recommended_home(path)), + label=label, + source=source, + exists=exists, + writable=writable, + free_gb=free_gb, + total_gb=total_gb, + fs_type=fs_type, + device=device, + mount_point=mount_point, + read_only=read_only, + network_mount=network_mount, + os_mount=os_mount, + large_block_mount=large_block_mount, + score=max(0, score), + warnings=warnings, + evidence=evidence, + ) + + +def _common_roots() -> list[tuple[str, Path]]: + roots: list[tuple[str, Path]] = [] + for env_name in ( + "NVH_HOME", + "NVH_MOUNT", + "PERSISTENT_HOME", + "PERSISTENT_DIR", + "WORKSPACE", + "PROJECTS", + ): + value = os.environ.get(env_name) + if value: + roots.append((f"env:{env_name}", _expand(value))) + roots.extend([ + ("common", Path("/mnt")), + ("common", Path("/media") / os.environ.get("USER", "")), + ("common", Path("/workspace")), + ("common", Path("/data")), + ("common", Path("/persistent")), + ("common", Path("/storage")), + ("home", Path.home()), + ]) + for mount in _mount_table(): + mount_path = mount.mount_point.as_posix() + if mount_path == "/" or _is_ephemeral_mount(mount): + continue + roots.append(("mount", mount.mount_point)) + return roots + + +def _candidate_paths(extra_roots: list[str | Path] | None = None) -> list[tuple[str, Path]]: + candidates: list[tuple[str, Path]] = [] + seen: set[str] = set() + for source, root in [*_common_roots(), *[("candidate", _expand(path)) for path in extra_roots or []]]: + root = root.expanduser() + options = [root] + if _path_exists(root) and _path_is_dir(root): + try: + options.extend(path for path in root.iterdir() if _path_is_dir(path)) + except Exception: + pass + for option in options: + key = _candidate_key(option) + if key in seen: + continue + seen.add(key) + candidates.append((source, option)) + return candidates + + +def mount_autopilot_report( + *, + min_free_gb: float = DEFAULT_MIN_FREE_GB, + extra_roots: list[str | Path] | None = None, +) 
-> dict[str, Any]: + """Rank likely persistent mounts and recommend an NVH_HOME.""" + current = storage_status(min_free_gb=min_free_gb).as_dict() + candidates = [ + _score_candidate(path, source=source, min_free_gb=min_free_gb) + for source, path in _candidate_paths(extra_roots) + ] + candidates.sort(key=lambda item: item.score, reverse=True) + recommended = candidates[0] if candidates else None + confidence = "none" + if recommended: + if recommended.score >= 75: + confidence = "high" + elif recommended.score >= 45: + confidence = "medium" + else: + confidence = "low" + return { + "summary": ( + f"Recommended NVH_HOME: {recommended.recommended_home}" + if recommended + else "No persistent mount candidates found." + ), + "confidence": confidence, + "current": current, + "recommended": recommended.as_dict() if recommended else None, + "candidates": [candidate.as_dict() for candidate in candidates[:8]], + } + + +def activate_recommended_mount( + *, + min_free_gb: float = DEFAULT_MIN_FREE_GB, + extra_roots: list[str | Path] | None = None, +) -> dict[str, Any]: + """Create and activate the best discovered NVH_HOME.""" + report = mount_autopilot_report(min_free_gb=min_free_gb, extra_roots=extra_roots) + recommended = report.get("recommended") + if not recommended: + raise RuntimeError("No persistent mount candidate found.") + status = ensure_storage(recommended["recommended_home"], min_free_gb=min_free_gb) + return { + "summary": f"Activated NVH_HOME at {status.layout.home}", + "storage": status.as_dict(), + "mount_autopilot": report, + } diff --git a/nvh/integrations/production_readiness.py b/nvh/integrations/production_readiness.py new file mode 100644 index 0000000..3e350e1 --- /dev/null +++ b/nvh/integrations/production_readiness.py @@ -0,0 +1,473 @@ +"""Production readiness gates for rootless NVIDIA cloud desktop installs. + +This report is intentionally conservative. 
It can prove CI-style and local +rootless invariants anywhere, but it will not claim production readiness until +the real Linux GPU VM image has passed acceptance. +""" + +from __future__ import annotations + +import os +import platform +from dataclasses import asdict, dataclass +from datetime import UTC, datetime +from pathlib import Path +from typing import Any + +from nvh.integrations.boot_preflight import boot_preflight_status +from nvh.integrations.compatibility import compatibility_report +from nvh.integrations.model_fit import model_fit_report +from nvh.integrations.mount_autopilot import mount_autopilot_report +from nvh.integrations.receipts import receipt_summary +from nvh.integrations.runtime import runtime_status +from nvh.integrations.smoke_tests import smoke_test_report +from nvh.integrations.storage import storage_status +from nvh.integrations.studio_packs import catalog_with_status + +TARGET_VM_CHECKLIST = [ + "Fresh no-root install from GitHub or PyPI on the NVIDIA Linux VM.", + "Confirm NVH_HOME lands on the persistent 200 GB+ block-backed mount.", + "Install AI Starter from the wizard and verify Ollama plus recommended model queue.", + "Install Graphics Creator Studio and launch ComfyUI with starter examples.", + "Install Game Dev Lab and verify Blender/Godot helper launchers.", + "Install Music Producer Studio and verify helper workspaces without sudo.", + "Reboot or reconnect the VM and confirm boot preflight detects no unexpected drift.", +] + + +@dataclass(frozen=True) +class ReadinessGate: + """One productization gate in the rootless setup path.""" + + id: str + title: str + status: str + summary: str + detail: str = "" + recommendation: str = "" + source: str = "local" + + def as_dict(self) -> dict[str, Any]: + return asdict(self) + + +def _gate( + gate_id: str, + title: str, + status: str, + summary: str, + *, + detail: str = "", + recommendation: str = "", + source: str = "local", +) -> ReadinessGate: + return ReadinessGate( + id=gate_id, 
+ title=title, + status=status, + summary=summary, + detail=detail, + recommendation=recommendation, + source=source, + ) + + +def _flag_enabled(value: str | None) -> bool: + return str(value or "").strip().lower() in {"1", "true", "yes", "on", "passed", "validated"} + + +def _target_vm_validated(explicit: bool | None) -> bool: + if explicit is not None: + return bool(explicit) + return _flag_enabled(os.environ.get("NVH_TARGET_VM_VALIDATED")) + + +def _path_detail(path: Any) -> str: + if path in (None, ""): + return "" + try: + return str(Path(path)) + except Exception: + return str(path) + + +def _storage_gate(storage: dict[str, Any]) -> ReadinessGate: + configured = storage.get("configured_by") != "default" + ok = bool(storage.get("ok")) + home = _path_detail(storage.get("layout", {}).get("home")) + free_gb = storage.get("free_gb") + detail = f"{home} ({free_gb if free_gb is not None else '?'} GB free)" + if ok and configured: + return _gate( + "persistent-storage", + "Persistent storage", + "pass", + "NVH_HOME is writable and explicitly configured.", + detail=detail, + ) + if ok: + return _gate( + "persistent-storage", + "Persistent storage", + "warn", + "Storage is writable, but NVH_HOME is still using the default path.", + detail=detail, + recommendation="Let mount autopilot activate the persistent block volume before large downloads.", + ) + return _gate( + "persistent-storage", + "Persistent storage", + "blocked", + "Selected storage is not ready for large rootless installs.", + detail="; ".join(storage.get("warnings", [])) or detail, + recommendation="Use a writable persistent block-backed mount and rerun storage detection.", + ) + + +def _mount_gate(mount: dict[str, Any], storage: dict[str, Any]) -> ReadinessGate: + current_ok = bool(storage.get("ok") and storage.get("configured_by") != "default") + recommended = mount.get("recommended") or {} + if current_ok: + return _gate( + "mount-autopilot", + "Mount autopilot", + "pass", + "Active NVH_HOME is already 
explicit and writable.", + detail=_path_detail(storage.get("layout", {}).get("home")), + ) + if recommended: + confidence = mount.get("confidence", "none") + safe_candidate = bool( + recommended.get("writable") + and not recommended.get("read_only") + and not recommended.get("network_mount") + and not recommended.get("os_mount") + ) + status = "warn" if safe_candidate else "blocked" + return _gate( + "mount-autopilot", + "Mount autopilot", + status, + f"Recommended persistent home: {recommended.get('recommended_home')}", + detail=f"confidence={confidence}, score={recommended.get('score')}", + recommendation=( + "Activate this mount before installing models." + if safe_candidate + else "Pick a writable local block mount; read-only shares and OS disks are unsafe." + ), + ) + return _gate( + "mount-autopilot", + "Mount autopilot", + "warn", + "No persistent mount candidate was found automatically.", + recommendation="Set NVH_HOME to the mounted block volume before running the wizard.", + ) + + +def _runtime_gate(runtime: dict[str, Any]) -> ReadinessGate: + strategy = str(runtime.get("strategy") or "") + if strategy in {"python-venv", "micromamba-fallback"}: + return _gate( + "runtime-toolchain", + "Rootless runtime", + "pass", + f"Runtime strategy is {strategy}.", + detail=runtime.get("python_version", ""), + ) + return _gate( + "runtime-toolchain", + "Rootless runtime", + "warn", + "Python venv/pip is incomplete and the micromamba fallback is not installed yet.", + detail="; ".join(runtime.get("notes", [])), + recommendation="Install the rootless runtime fallback before ComfyUI, agents, or music tools.", + ) + + +def _gpu_gate(compatibility: dict[str, Any]) -> ReadinessGate: + host = compatibility.get("host", {}) + system = host.get("system") or platform.system() + gpu = host.get("gpu", {}) + gpu_name = gpu.get("name") + if system != "Linux": + return _gate( + "linux-gpu-session", + "Linux NVIDIA GPU session", + "warn", + "Current machine is not the target Linux GPU 
VM.",
+            detail=str(system),
+            recommendation="Run the same readiness report tomorrow on the NVIDIA Linux VM.",
+            source="target-vm",
+        )
+    if gpu_name:
+        return _gate(
+            "linux-gpu-session",
+            "Linux NVIDIA GPU session",
+            "pass",
+            f"Detected {gpu_name}.",
+            detail=f"driver={gpu.get('driver_version') or 'unknown'}, cuda={gpu.get('cuda_version') or 'unknown'}",
+            source="target-vm",
+        )
+    return _gate(
+        "linux-gpu-session",
+        "Linux NVIDIA GPU session",
+        "blocked",
+        "Linux is detected, but no NVIDIA GPU is visible to nvHive.",
+        detail=str(gpu.get("detection_status") or "not-detected"),
+        recommendation="Start a GPU-backed session or ask the provider to expose NVIDIA devices.",
+        source="target-vm",
+    )
+
+
+def _compatibility_gate(compatibility: dict[str, Any]) -> ReadinessGate:
+    blocked = int(compatibility.get("blocked_count", 0) or 0)
+    issues = int(compatibility.get("issue_count", 0) or 0)
+    fixable = int(compatibility.get("rootless_fixable_count", 0) or 0)
+    if blocked:
+        return _gate(
+            "app-compatibility",
+            "App compatibility",
+            "blocked",
+            f"{blocked} app/profile item(s) require base image, driver, or OS changes.",
+            detail=compatibility.get("summary", ""),
+            recommendation="Resolve blocked compatibility items before production release.",
+        )
+    if issues:
+        return _gate(
+            "app-compatibility",
+            "App compatibility",
+            "warn",
+            f"{issues} compatibility item(s) need attention; {fixable} are rootless-fixable.",
+            detail=compatibility.get("summary", ""),
+            recommendation="Run safe repairs and install requested mission dependencies.",
+        )
+    return _gate(
+        "app-compatibility",
+        "App compatibility",
+        "pass",
+        "No app compatibility blockers detected.",
+        detail=compatibility.get("summary", ""),
+    )
+
+
+def _boot_gate(boot: dict[str, Any]) -> ReadinessGate:
+    if not boot.get("checked_at"):
+        return _gate(
+            "boot-preflight",
+            "Boot preflight",
+            "warn",
+            "Boot preflight has not captured a baseline yet.",
+            recommendation="Run boot preflight once on app startup or with the setup recheck button.",
+        )
+    if boot.get("changed"):
+        return _gate(
+            "boot-preflight",
+            "Boot preflight",
+            "warn",
+            "The base VM image changed since the last check.",
+            detail=boot.get("summary", ""),
+            recommendation="Review driver, CUDA, Python, storage, and model recommendations before launch.",
+        )
+    if boot.get("needs_attention"):
+        return _gate(
+            "boot-preflight",
+            "Boot preflight",
+            "warn",
+            "Boot preflight still has items needing attention.",
+            detail=boot.get("summary", ""),
+            recommendation="Run safe repairs or follow the recommended action.",
+        )
+    return _gate(
+        "boot-preflight",
+        "Boot preflight",
+        "pass",
+        "Boot baseline is captured and unchanged.",
+        detail=boot.get("summary", ""),
+    )
+
+
+def _smoke_gate(smoke: dict[str, Any]) -> ReadinessGate:
+    failed = int(smoke.get("failed", 0) or 0)
+    warnings = int(smoke.get("warnings", 0) or 0)
+    if failed:
+        return _gate(
+            "smoke-tests",
+            "Smoke tests",
+            "blocked",
+            f"{failed} smoke test(s) failed.",
+            detail=smoke.get("summary", ""),
+            recommendation="Fix failed app health checks before production release.",
+        )
+    if warnings:
+        return _gate(
+            "smoke-tests",
+            "Smoke tests",
+            "warn",
+            f"{warnings} smoke test warning(s) remain.",
+            detail=smoke.get("summary", ""),
+            recommendation="Warnings are acceptable for beta, but should be explained in release notes.",
+        )
+    return _gate("smoke-tests", "Smoke tests", "pass", "All lightweight smoke checks passed.", detail=smoke.get("summary", ""))
+
+
+def _model_gate(model_fit: dict[str, Any]) -> ReadinessGate:
+    if model_fit.get("storage_fits_queue") is False:
+        return _gate(
+            "model-fit",
+            "Model fit",
+            "blocked",
+            "Recommended model queue does not fit the detected persistent storage.",
+            detail=model_fit.get("summary", ""),
+            recommendation="Reduce the default model queue or use a larger persistent volume.",
+        )
+    return _gate(
+        "model-fit",
+        "Model fit",
+        "pass",
+        "Recommended model queue fits current storage assumptions.",
+        detail=model_fit.get("summary", ""),
+    )
+
+
+def _receipts_gate(receipts: dict[str, Any]) -> ReadinessGate:
+    unhealthy = int(receipts.get("unhealthy", 0) or 0)
+    count = int(receipts.get("count", 0) or 0)
+    if unhealthy:
+        return _gate(
+            "install-receipts",
+            "Install receipts",
+            "warn",
+            f"{unhealthy} install receipt(s) need repair.",
+            detail=f"{count} total receipt(s)",
+            recommendation="Repair or reinstall unhealthy app packs before calling the VM production-ready.",
+        )
+    return _gate(
+        "install-receipts",
+        "Install receipts",
+        "pass",
+        "No unhealthy install receipts detected.",
+        detail=f"{count} total receipt(s)",
+    )
+
+
+def _pack_safety_gate(catalog: dict[str, Any]) -> ReadinessGate:
+    packs = catalog.get("packs", [])
+    non_rootless = [pack.get("id") for pack in packs if not pack.get("no_root")]
+    if non_rootless:
+        return _gate(
+            "rootless-pack-safety",
+            "Rootless pack safety",
+            "blocked",
+            "Some setup packs are not marked no-root.",
+            detail=", ".join(str(item) for item in non_rootless),
+            recommendation="Every one-click wizard pack must install into NVH_HOME or explain its external requirement.",
+        )
+    if not packs:
+        return _gate(
+            "rootless-pack-safety",
+            "Rootless pack safety",
+            "warn",
+            "Studio pack catalog is empty.",
+            recommendation="Restore the bundled studio pack catalog before release.",
+        )
+    return _gate(
+        "rootless-pack-safety",
+        "Rootless pack safety",
+        "pass",
+        f"{len(packs)} studio pack(s) are marked no-root.",
+    )
+
+
+def _target_acceptance_gate(validated: bool) -> ReadinessGate:
+    if validated:
+        return _gate(
+            "target-vm-acceptance",
+            "Target VM acceptance",
+            "pass",
+            "Real NVIDIA Linux VM acceptance has been marked complete.",
+            source="target-vm",
+        )
+    return _gate(
+        "target-vm-acceptance",
+        "Target VM acceptance",
+        "warn",
+        "Real NVIDIA Linux VM acceptance is still required.",
+        recommendation="Run the checklist on the target VM and set NVH_TARGET_VM_VALIDATED=1 for the final report.",
+        source="target-vm",
+    )
+
+
+def production_readiness_report(
+    home_dir: str | Path | None = None,
+    *,
+    target_vm_validated: bool | None = None,
+) -> dict[str, Any]:
+    """Return a conservative production readiness report for nvWizard."""
+    storage = storage_status(home_dir=home_dir, min_free_gb=20).as_dict()
+    mount = mount_autopilot_report(min_free_gb=20, extra_roots=[home_dir] if home_dir else None)
+    runtime = runtime_status().as_dict()
+    compatibility = compatibility_report(home_dir=home_dir)
+    boot = boot_preflight_status(home_dir=home_dir, run_if_missing=False)
+    smoke = smoke_test_report(home_dir=home_dir)
+    model_fit = model_fit_report(home_dir=home_dir)
+    receipts = receipt_summary()
+    catalog = catalog_with_status()
+    validated = _target_vm_validated(target_vm_validated)
+
+    gates = [
+        _storage_gate(storage),
+        _mount_gate(mount, storage),
+        _runtime_gate(runtime),
+        _gpu_gate(compatibility),
+        _compatibility_gate(compatibility),
+        _boot_gate(boot),
+        _smoke_gate(smoke),
+        _model_gate(model_fit),
+        _receipts_gate(receipts),
+        _pack_safety_gate(catalog),
+        _target_acceptance_gate(validated),
+    ]
+    blocked = [gate for gate in gates if gate.status == "blocked"]
+    warnings = [gate for gate in gates if gate.status == "warn"]
+    passed = [gate for gate in gates if gate.status == "pass"]
+    pilot_ready = not blocked
+    production_ready = pilot_ready and not warnings
+    status = "production-ready" if production_ready else "pilot-ready" if pilot_ready else "blocked"
+
+    if production_ready:
+        summary = "Ready for production release on the validated target VM."
+    elif pilot_ready:
+        summary = "Ready for a controlled beta or target-VM acceptance run."
+    else:
+        summary = f"{len(blocked)} blocker(s) must be resolved before beta or production."
+
+    next_actions = [
+        gate.recommendation or gate.summary
+        for gate in [*blocked, *warnings]
+        if gate.recommendation or gate.summary
+    ][:6]
+    return {
+        "checked_at": datetime.now(UTC).isoformat(),
+        "status": status,
+        "summary": summary,
+        "pilot_ready": pilot_ready,
+        "production_ready": production_ready,
+        "target_vm_validated": validated,
+        "counts": {
+            "passed": len(passed),
+            "warnings": len(warnings),
+            "blocked": len(blocked),
+            "total": len(gates),
+        },
+        "gates": [gate.as_dict() for gate in gates],
+        "next_actions": next_actions,
+        "target_vm_checklist": TARGET_VM_CHECKLIST,
+        "inputs": {
+            "home_dir": str(home_dir) if home_dir else None,
+            "storage_home": storage.get("layout", {}).get("home"),
+            "storage_configured_by": storage.get("configured_by"),
+            "runtime_strategy": runtime.get("strategy"),
+            "boot_checked_at": boot.get("checked_at"),
+        },
+    }
diff --git a/nvh/integrations/receipts.py b/nvh/integrations/receipts.py
new file mode 100644
index 0000000..0ffc1f0
--- /dev/null
+++ b/nvh/integrations/receipts.py
@@ -0,0 +1,224 @@
+"""Install receipts for rootless workstation tools.
+
+Receipts are small JSON manifests written under ``NVH_HOME`` after nvHive
+installs a tool, model, or workspace. They make setup resumable and auditable
+without requiring root access or a system package database.
+"""
+
+from __future__ import annotations
+
+import json
+import time
+from dataclasses import asdict, dataclass, field
+from pathlib import Path
+from typing import Any
+
+from nvh.integrations.storage import storage_layout
+
+SCHEMA_VERSION = 1
+
+
+@dataclass(frozen=True)
+class InstallReceipt:
+    """Manifest for one rootless install artifact."""
+
+    id: str
+    kind: str
+    item_id: str
+    title: str
+    status: str
+    installed_at: str
+    updated_at: str
+    install_path: str
+    version: str | None = None
+    source_urls: list[str] = field(default_factory=list)
+    launchers: list[str] = field(default_factory=list)
+    models: list[str] = field(default_factory=list)
+    files: list[str] = field(default_factory=list)
+    no_root: bool = True
+    metadata: dict[str, Any] = field(default_factory=dict)
+    schema_version: int = SCHEMA_VERSION
+
+    def as_dict(self) -> dict[str, Any]:
+        return asdict(self)
+
+
+def _now() -> str:
+    return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
+
+
+def receipts_root(*, create: bool = True) -> Path:
+    root = storage_layout().home / "receipts"
+    if create:
+        root.mkdir(parents=True, exist_ok=True)
+    return root
+
+
+def receipt_id(kind: str, item_id: str) -> str:
+    return f"{kind}:{item_id}"
+
+
+def _safe_slug(value: str) -> str:
+    slug = "".join(ch if ch.isalnum() or ch in {"-", "_", "."} else "_" for ch in value)
+    slug = slug.strip("._")
+    if not slug:
+        raise KeyError("Invalid receipt id")
+    return slug
+
+
+def _receipt_path(identifier: str) -> Path:
+    return receipts_root() / f"{_safe_slug(identifier)}.json"
+
+
+def write_receipt(
+    *,
+    kind: str,
+    item_id: str,
+    title: str,
+    install_path: str | Path,
+    status: str = "installed",
+    version: str | None = None,
+    source_urls: list[str] | tuple[str, ...] | None = None,
+    launchers: list[str] | tuple[str, ...] | None = None,
+    models: list[str] | tuple[str, ...] | None = None,
+    files: list[str] | tuple[str, ...] | None = None,
+    metadata: dict[str, Any] | None = None,
+    no_root: bool = True,
+) -> dict[str, Any]:
+    """Create or update one install receipt."""
+    identifier = receipt_id(kind, item_id)
+    path = _receipt_path(identifier)
+    previous: dict[str, Any] = {}
+    if path.exists():
+        try:
+            previous = json.loads(path.read_text(encoding="utf-8"))
+        except Exception:
+            previous = {}
+
+    now = _now()
+    receipt = InstallReceipt(
+        id=identifier,
+        kind=kind,
+        item_id=item_id,
+        title=title,
+        status=status,
+        installed_at=str(previous.get("installed_at") or now),
+        updated_at=now,
+        install_path=str(install_path),
+        version=version,
+        source_urls=list(source_urls or []),
+        launchers=list(launchers or []),
+        models=list(models or []),
+        files=list(files or []),
+        no_root=no_root,
+        metadata=metadata or {},
+    )
+    data = receipt.as_dict()
+    tmp = path.with_suffix(".json.tmp")
+    tmp.write_text(json.dumps(data, indent=2, sort_keys=True) + "\n", encoding="utf-8")
+    tmp.replace(path)
+    return enrich_receipt(data)
+
+
+def load_receipt(identifier: str) -> dict[str, Any]:
+    path = _receipt_path(identifier)
+    if not path.exists():
+        raise KeyError(f"Unknown receipt: {identifier}")
+    return enrich_receipt(json.loads(path.read_text(encoding="utf-8")))
+
+
+def list_receipts(
+    *,
+    kind: str | None = None,
+    status: str | None = None,
+    limit: int = 100,
+) -> list[dict[str, Any]]:
+    """Return recent receipts sorted by updated time descending."""
+    receipts: list[dict[str, Any]] = []
+    root = receipts_root(create=False)
+    if not root.exists():
+        return []
+    for path in root.glob("*.json"):
+        try:
+            data = enrich_receipt(json.loads(path.read_text(encoding="utf-8")))
+        except Exception:
+            continue
+        if kind and data.get("kind") != kind:
+            continue
+        if status and data.get("status") != status:
+            continue
+        receipts.append(data)
+    receipts.sort(key=lambda item: str(item.get("updated_at", "")), reverse=True)
+    return receipts[: max(1, min(limit, 500))]
+
+
+def enrich_receipt(receipt: dict[str, Any]) -> dict[str, Any]:
+    """Add lightweight health data without mutating the on-disk manifest."""
+    install_path = Path(str(receipt.get("install_path", ""))).expanduser()
+    launchers = [Path(str(path)).expanduser() for path in receipt.get("launchers", [])]
+    files = [Path(str(path)).expanduser() for path in receipt.get("files", [])]
+    missing_launchers = [str(path) for path in launchers if not path.exists()]
+    missing_files = [str(path) for path in files if not path.exists()]
+    health = {
+        "install_path_exists": install_path.exists(),
+        "missing_launchers": missing_launchers,
+        "missing_files": missing_files,
+        "healthy": install_path.exists() and not missing_launchers and not missing_files,
+    }
+    return {**receipt, "health": health}
+
+
+def receipt_summary() -> dict[str, Any]:
+    receipts = list_receipts()
+    by_kind: dict[str, int] = {}
+    unhealthy = 0
+    for receipt in receipts:
+        by_kind[receipt["kind"]] = by_kind.get(receipt["kind"], 0) + 1
+        if not receipt.get("health", {}).get("healthy", False):
+            unhealthy += 1
+    return {
+        "count": len(receipts),
+        "by_kind": by_kind,
+        "unhealthy": unhealthy,
+        "root": str(receipts_root(create=False)),
+        "receipts": receipts[:10],
+    }
+
+
+def repair_plan(identifier: str) -> dict[str, Any]:
+    receipt = load_receipt(identifier)
+    kind = receipt["kind"]
+    item_id = receipt["item_id"]
+    commands: list[str]
+    if kind == "studio-pack":
+        commands = [f"nvh studio --install {item_id} -y"]
+    elif kind == "studio-model":
+        target = receipt.get("metadata", {}).get("install_target") or item_id
+        commands = [f"nvh studio --install-models {target} -y"]
+    elif kind == "comfyui":
+        commands = ["nvh workstation --with-comfyui -y"]
+    else:
+        commands = [f"nvh setup repair {identifier}"]
+    return {
+        "receipt": receipt,
+        "safe_to_run_without_root": True,
+        "commands": commands,
+        "reason": "Re-run the rootless installer for this item and refresh the receipt.",
+    }
+
+
+def uninstall_plan(identifier: str) -> dict[str, Any]:
+    receipt = load_receipt(identifier)
+    paths = [receipt["install_path"], *receipt.get("launchers", []), *receipt.get("files", [])]
+    unique_paths = []
+    for path in paths:
+        if path and path not in unique_paths:
+            unique_paths.append(path)
+    return {
+        "receipt": receipt,
+        "safe_to_run_without_root": True,
+        "destructive": True,
+        "target_paths": unique_paths,
+        "commands": [f"rm -rf {path!r}" for path in unique_paths],
+        "reason": "Preview only. nvHive records the paths so a user can remove rootless files deliberately.",
+    }
diff --git a/nvh/integrations/setup_agent.py b/nvh/integrations/setup_agent.py
index c004fa6..5d932c2 100644
--- a/nvh/integrations/setup_agent.py
+++ b/nvh/integrations/setup_agent.py
@@ -11,7 +11,10 @@
 from pathlib import Path
 from typing import Any
 
+from nvh.integrations.catalog import catalog_status
 from nvh.integrations.comfyui import detect_comfyui
+from nvh.integrations.jobs import list_jobs
+from nvh.integrations.receipts import receipt_summary, repair_plan
 from nvh.integrations.runtime import runtime_status
 from nvh.integrations.storage import storage_status
 from nvh.integrations.studio_packs import catalog_with_status, model_catalog_with_status
@@ -33,10 +36,118 @@ def as_dict(self) -> dict[str, Any]:
         return asdict(self)
 
 
+@dataclass(frozen=True)
+class SetupIssue:
+    """One proactive setup finding the wizard should surface."""
+
+    id: str
+    title: str
+    severity: str
+    reason: str
+    fix_action_id: str | None = None
+    affected_item: str | None = None
+    current_version: str | None = None
+    available_version: str | None = None
+
+    def as_dict(self) -> dict[str, Any]:
+        return asdict(self)
+
+
 def _pack_by_id(catalog: dict[str, Any]) -> dict[str, dict[str, Any]]:
     return {pack["id"]: pack for pack in catalog.get("packs", [])}
 
 
+def _safe_receipt_summary() -> dict[str, Any]:
+    try:
+        return receipt_summary()
+    except Exception as exc:
+        return {"count": 0, "by_kind": {}, "unhealthy": 0, "root": None, "receipts": [], "error": str(exc)}
+
+
+def _safe_catalog_status() -> dict[str, Any]:
+    try:
+        return catalog_status(refresh=False)
+    except Exception as exc:
+        return {"source": "unavailable", "error": str(exc)}
+
+
+def _safe_catalog_data() -> dict[str, Any]:
+    try:
+        from nvh.integrations.catalog import load_setup_catalog
+
+        return load_setup_catalog(refresh=False).get("catalog", {})
+    except Exception:
+        return {}
+
+
+def _safe_compatibility_report(home_dir: str | Path | None = None) -> dict[str, Any]:
+    try:
+        from nvh.integrations.compatibility import compatibility_report
+
+        return compatibility_report(home_dir=home_dir)
+    except Exception as exc:
+        return {"summary": "Compatibility unavailable", "issue_count": 0, "apps": [], "error": str(exc)}
+
+
+def _safe_boot_preflight(home_dir: str | Path | None = None) -> dict[str, Any]:
+    try:
+        from nvh.integrations.boot_preflight import boot_preflight_status
+
+        return boot_preflight_status(home_dir=home_dir, run_if_missing=False)
+    except Exception as exc:
+        return {"summary": "Boot preflight unavailable", "changes": [], "agent_helper": {}, "error": str(exc)}
+
+
+def _recent_failed_job() -> dict[str, Any] | None:
+    try:
+        jobs = list_jobs(limit=10)
+    except Exception:
+        return None
+    for job in jobs:
+        if job.get("status") in {"failed", "interrupted", "canceled"}:
+            return job
+    return None
+
+
+def _action_for_job(job: dict[str, Any]) -> str:
+    kind = job.get("kind")
+    if kind == "comfyui-install":
+        return "comfyui"
+    if kind == "studio-model-install":
+        return "starter-models"
+    if kind == "studio-pack-install":
+        return "studio-packs"
+    return "storage"
+
+
+def _catalog_entry_by_id(catalog: dict[str, Any]) -> dict[str, dict[str, Any]]:
+    entries: dict[str, dict[str, Any]] = {}
+    for key in ("packs", "models", "comfyui_examples"):
+        for item in catalog.get(key, []):
+            item_id = item.get("id")
+            if item_id:
+                entries[str(item_id)] = item
+    return entries
+
+
+def _version_from_catalog(entry: dict[str, Any] | None) -> str | None:
+    if not entry:
+        return None
+    value = entry.get("latest_version") or entry.get("version")
+    return str(value) if value else None
+
+
+def _looks_older(current: str | None, latest: str | None) -> bool:
+    if not current or not latest or current == latest:
+        return False
+    try:
+        from packaging.version import Version
+
+        return Version(current) < Version(latest)
+    except Exception:
+        return current != latest
+
+
 def setup_helper_report(home_dir: str | Path | None = None) -> dict[str, Any]:
     """Return a local setup diagnosis and ranked action list."""
     storage = storage_status(home_dir=home_dir, min_free_gb=20)
@@ -46,8 +157,17 @@
     comfy = detect_comfyui()
     by_pack = _pack_by_id(packs)
     actions: list[SetupAction] = []
+    issues: list[SetupIssue] = []
 
     if not storage.ok or storage.configured_by == "default":
+        issues.append(SetupIssue(
+            id="storage",
+            title="Persistent storage is not ready",
+            severity="required",
+            reason="Large installs may be lost if NVH_HOME is still using the default or an unwritable path.",
+            fix_action_id="storage",
+            affected_item="NVH_HOME",
+        ))
         actions.append(SetupAction(
             id="storage",
             title="Choose persistent NVH_HOME",
@@ -58,6 +178,14 @@
         ))
 
     if runtime.strategy == "needs-runtime":
+        issues.append(SetupIssue(
+            id="runtime-fallback",
+            title="Python runtime needs a fallback",
+            severity="recommended",
+            reason="This image does not appear to have a complete Python venv/pip path.",
+            fix_action_id="runtime-fallback",
+            affected_item="python-runtime-fallback",
+        ))
         actions.append(SetupAction(
             id="runtime-fallback",
             title="Install optional runtime fallback",
@@ -69,6 +197,14 @@
 
     ollama_pack = by_pack.get("rootless-ollama", {})
     if not ollama_pack.get("status", {}).get("installed"):
+        issues.append(SetupIssue(
+            id="rootless-ollama",
+            title="Local model runtime is missing",
+            severity="recommended",
+            reason="Local LLM downloads need a rootless Ollama runtime.",
+            fix_action_id="rootless-ollama",
+            affected_item="rootless-ollama",
+        ))
         actions.append(SetupAction(
             id="rootless-ollama",
             title="Install local model runtime",
@@ -83,6 +219,14 @@
         if model.get("recommended") and not model.get("installed")
     ]
     if missing_models:
+        issues.append(SetupIssue(
+            id="starter-models",
+            title="Recommended local models are missing",
+            severity="recommended",
+            reason=f"{len(missing_models)} recommended model(s) are not installed yet.",
+            fix_action_id="starter-models",
+            affected_item="local-models",
+        ))
         actions.append(SetupAction(
             id="starter-models",
             title="Download recommended local models",
@@ -93,6 +237,14 @@
         ))
 
     if not comfy.get("installed"):
+        issues.append(SetupIssue(
+            id="comfyui",
+            title="ComfyUI is not installed",
+            severity="optional",
+            reason="Visual image/video workflows are unavailable until ComfyUI is installed.",
+            fix_action_id="comfyui",
+            affected_item="comfyui",
+        ))
         actions.append(SetupAction(
             id="comfyui",
             title="Install ComfyUI visual workspace",
@@ -102,6 +254,14 @@
             reason="ComfyUI enables local image/video workflows and nvHive starter examples.",
         ))
     elif not comfy.get("examples_installed"):
+        issues.append(SetupIssue(
+            id="comfyui-examples",
+            title="ComfyUI starter examples need repair",
+            severity="recommended",
+            reason="ComfyUI is present, but the nvHive example manifest is missing.",
+            fix_action_id="comfyui-examples",
+            affected_item="comfyui",
+        ))
         actions.append(SetupAction(
             id="comfyui-examples",
             title="Refresh ComfyUI examples",
@@ -111,6 +271,38 @@
             reason="ComfyUI exists, but the nvHive examples manifest is missing.",
         ))
 
+    openclaw_pack = by_pack.get("openclaw-agent", {})
+    nemoclaw_pack = by_pack.get("nemoclaw-sandbox", {})
+    openclaw_status = openclaw_pack.get("status", {})
+    nemoclaw_status = nemoclaw_pack.get("status", {})
+    nemoclaw_details = nemoclaw_status.get("details", {})
+    if not openclaw_status.get("installed"):
+        issues.append(SetupIssue(
+            id="claw-agents",
+            title="OpenClaw agent option is not installed",
+            severity="optional",
+            reason="OpenClaw gives students a self-hosted agent platform that can use local or cloud models.",
+            fix_action_id="claw-agents",
+            affected_item="openclaw-agent",
+        ))
+        actions.append(SetupAction(
+            id="claw-agents",
+            title="Install Claw agent options",
+            priority=65,
+            status="optional",
+            command="nvh studio --install claw -y",
+            reason="Adds OpenClaw, and adds NemoClaw too when Docker/OpenShell is usable without sudo.",
+        ))
+    elif nemoclaw_details.get("installable") and not nemoclaw_status.get("installed"):
+        actions.append(SetupAction(
+            id="claw-agents",
+            title="Add NemoClaw sandbox option",
+            priority=66,
+            status="optional",
+            command="nvh studio --install claw -y",
+            reason="Docker is reachable, so nvHive can add the NVIDIA NemoClaw/OpenShell path.",
+        ))
+
     creative_pack = by_pack.get("blender-creative", {})
     if not creative_pack.get("status", {}).get("installed"):
         actions.append(SetupAction(
@@ -122,14 +314,277 @@
             reason="Adds Blender LTS and asset workspaces for creative students.",
         ))
 
+    music_pack_ids = ("ace-step-music", "music-producer-lab", "music-daw-helper")
+    music_missing = [
+        pack_id
+        for pack_id in music_pack_ids
+        if not by_pack.get(pack_id, {}).get("status", {}).get("installed")
+    ]
+    if music_missing:
+        actions.append(SetupAction(
+            id="music-tools",
+            title="Install music producer tools",
+            priority=72,
+            status="optional",
+            command="nvh studio --install music -y",
+            reason=f"Adds ACE-Step music generation, audio AI tools, and a rootless DAW workspace. Missing: {', '.join(music_missing)}.",
+        ))
+
+    receipts = _safe_receipt_summary()
+    for receipt in receipts.get("receipts", []):
+        health = receipt.get("health", {})
+        if not health.get("healthy", True):
+            action_id = f"repair-receipt:{receipt['id']}"
+            missing = len(health.get("missing_launchers", [])) + len(health.get("missing_files", []))
+            issues.append(SetupIssue(
+                id=f"receipt:{receipt['id']}",
+                title=f"{receipt.get('title', receipt['id'])} needs repair",
+                severity="recommended",
+                reason=f"{missing or 1} expected file or launcher path(s) are missing.",
+                fix_action_id=action_id,
+                affected_item=receipt["id"],
+            ))
+            try:
+                command = repair_plan(receipt["id"])["commands"][0]
+            except Exception:
+                command = f"nvh setup repair {receipt['id']}"
+            actions.append(SetupAction(
+                id=action_id,
+                title=f"Repair {receipt.get('title', receipt['id'])}",
+                priority=25,
+                status="recommended",
+                command=command,
+                reason="A previous rootless install receipt has missing files or launchers.",
+            ))
+
+    catalog_data = _safe_catalog_data()
+    catalog_entries = _catalog_entry_by_id(catalog_data)
+    for receipt in receipts.get("receipts", []):
+        current_version = receipt.get("version")
+        latest_version = _version_from_catalog(catalog_entries.get(receipt.get("item_id")))
+        if _looks_older(current_version, latest_version):
+            action_id = f"repair-receipt:{receipt['id']}"
+            issues.append(SetupIssue(
+                id=f"outdated:{receipt['id']}",
+                title=f"{receipt.get('title', receipt['id'])} has an update",
+                severity="recommended",
+                reason="A newer version is available in the setup catalog.",
+                fix_action_id=action_id,
+                affected_item=receipt["id"],
+                current_version=str(current_version),
+                available_version=str(latest_version),
+            ))
+
+    failed_job = _recent_failed_job()
+    if failed_job:
+        action_id = _action_for_job(failed_job)
+        issues.append(SetupIssue(
+            id=f"job:{failed_job['id']}",
+            title=f"{failed_job.get('title', 'Install job')} needs attention",
+            severity="recommended",
+            reason=str(failed_job.get("message") or "A recent setup job did not finish."),
+            fix_action_id=action_id,
+            affected_item=failed_job.get("kind"),
+        ))
+
+    compatibility = _safe_compatibility_report(home_dir=home_dir)
+    boot_preflight = _safe_boot_preflight(home_dir=home_dir)
+    for app in compatibility.get("apps", []):
+        if app.get("status") == "ready":
+            continue
+        action_id = app.get("recommended_action_id")
+        severity = "required" if app.get("status") == "blocked" else app.get("severity", "recommended")
+        issues.append(SetupIssue(
+            id=f"compat:{app['id']}",
+            title=f"{app.get('title', app['id'])} compatibility needs attention",
+            severity=severity,
+            reason=app.get("summary", "Compatibility check needs attention."),
+            fix_action_id=action_id,
+            affected_item=app["id"],
+        ))
+
+    boot_changes = boot_preflight.get("changes") or []
+    if boot_changes:
+        issues.append(SetupIssue(
+            id="boot:vm-image-changed",
+            title="Base VM image changed since the last nvHive boot",
+            severity="recommended",
+            reason=boot_preflight.get("summary", "Re-run the setup preflight before launching installed apps."),
+            fix_action_id=None,
+            affected_item="boot-preflight",
+        ))
+
     actions.sort(key=lambda action: action.priority)
+    issues.sort(key=lambda issue: {"required": 0, "recommended": 1, "optional": 2}.get(issue.severity, 3))
     ready = not any(action.status == "required" for action in actions)
+    agent_helper = boot_preflight.get("agent_helper") or {}
     return {
         "ready": ready,
-        "summary": "Ready for downloads" if ready else "Persistent storage needs attention",
+        "summary": (
+            "Ready for downloads"
+            if ready and not issues
+            else f"{len(issues)} setup item(s) need attention"
+        ),
         "storage": storage.as_dict(),
         "runtime": runtime.as_dict(),
         "comfyui": comfy,
         "model_recommendation_count": len(missing_models),
         "actions": [action.as_dict() for action in actions],
+        "issues": [issue.as_dict() for issue in issues],
+        "issue_count": len(issues),
+        "receipts": receipts,
+        "catalog": _safe_catalog_status(),
+        "compatibility": {
+            "summary": compatibility.get("summary"),
+            "issue_count": compatibility.get("issue_count", 0),
+            "blocked_count": compatibility.get("blocked_count", 0),
+            "rootless_fixable_count": compatibility.get("rootless_fixable_count", 0),
+            "recommended_torch_profile": compatibility.get("recommended_torch_profile"),
+        },
+        "boot_preflight": {
+            "summary": boot_preflight.get("summary"),
+            "checked_at": boot_preflight.get("checked_at"),
+            "changed": bool(boot_preflight.get("changed")),
+            "change_count": len(boot_changes),
+            "agent_helper": agent_helper,
+        },
+        "assistant": {
+            "mode": agent_helper.get("mode", "offline-deterministic"),
+            "can_read_jobs": True,
+            "can_read_receipts": True,
+            "can_refresh_catalog": True,
+            "description": (
+                "nvWizard is the rootless setup questmaster: it checks the GPU forge, "
+                "watches VM image drift, and suggests repairs without requiring a cloud model."
+            ),
+        },
+    }
+
+
+def _commands_for_actions(actions: list[dict[str, Any]], *action_ids: str) -> list[str]:
+    wanted = set(action_ids)
+    commands = [
+        action["command"] for action in actions
+        if action.get("id") in wanted and action.get("command")
+    ]
+    return commands
+
+
+def _persona_wrap(answer: str) -> str:
+    return f"nvWizard says: {answer}"
+
+
+def setup_assistant_reply(
+    question: str,
+    home_dir: str | Path | None = None,
+) -> dict[str, Any]:
+    """Answer a setup question using local state and deterministic rules."""
+    report = setup_helper_report(home_dir=home_dir)
+    actions = report["actions"]
+    q = question.strip().lower()
+    receipts = report.get("receipts", {})
+    failed_job = _recent_failed_job()
+    commands: list[str] = []
+    focus = "next-step"
+
+    if not q:
+        answer = "Ask about storage, ComfyUI, models, Blender, repair, or the next setup step."
+    elif any(word in q for word in ["storage", "mount", "persistent", "home", "nvh_home"]):
+        focus = "storage"
+        commands = _commands_for_actions(actions, "storage") or [
+            'nvh doctor --storage --home-dir "/path/on/mounted/volume/nvhive"',
+        ]
+        answer = (
+            "Use the mounted file volume for NVH_HOME before large downloads. "
+            f"Current storage source is {report['storage']['configured_by']} at "
+            f"{report['storage']['layout']['home']}. The wizard should guide this with a folder picker; "
+            "the CLI command is only an advanced override."
+        )
+    elif any(word in q for word in ["comfy", "image", "video", "workflow"]):
+        focus = "comfyui"
+        commands = _commands_for_actions(actions, "comfyui", "comfyui-examples") or [
+            "nvh workstation --with-comfyui -y",
+        ]
+        answer = (
+            "ComfyUI is managed as a rootless workspace under NVH_HOME. "
+            "Use the install button from the wizard; model weights stay explicit "
+            "because many upstream downloads require license acceptance."
+        )
+    elif any(word in q for word in ["model", "llm", "ollama", "local ai"]):
+        focus = "models"
+        commands = _commands_for_actions(actions, "rootless-ollama", "starter-models") or [
+            "nvh studio --install rootless-ollama -y",
+            "nvh studio --install-models recommended -y",
+        ]
+        answer = (
+            "Start with the rootless Ollama runtime, then download the recommended models "
+            "that fit the detected GPU. The wizard can run both steps and keeps files under NVH_HOME/models."
+        )
+    elif any(word in q for word in ["claw", "openclaw", "nemo", "nemoclaw", "desktop agent", "sandbox agent"]):
+        focus = "claw-agents"
+        commands = _commands_for_actions(actions, "claw-agents") or [
+            "nvh studio --install claw -y",
+        ]
+        answer = (
+            "OpenClaw is the simple rootless agent install. NemoClaw is the guarded NVIDIA/OpenShell "
+            "path and only lights up when Docker works without sudo. In the wizard, use the Claw Agents "
+            "pack; manual commands are just the advanced override."
+        )
+    elif any(word in q for word in ["blender", "creative", "game", "asset"]):
+        focus = "creative"
+        commands = _commands_for_actions(actions, "creative-tools") or [
+            "nvh studio --install creative -y",
+        ]
+        answer = (
+            "Creative tools are installed without sudo under NVH_HOME/apps and NVH_HOME/studio. "
+            "Use the creative profile or repair button; manual commands are just overrides."
+        )
+    elif any(word in q for word in ["repair", "fix", "failed", "error", "broken"]):
+        focus = "repair"
+        if failed_job:
+            answer = (
+                f"The most recent problem I found is {failed_job['title']} with status "
+                f"{failed_job['status']}: {failed_job.get('message', 'no message')}. "
+                "Use the matching repair/install button after checking storage and network access."
+            )
+        elif receipts.get("unhealthy"):
+            first = receipts.get("receipts", [{}])[0]
+            try:
+                commands = repair_plan(first["id"])["commands"]
+            except Exception:
+                commands = []
+            answer = (
+                f"I found {receipts['unhealthy']} receipt(s) with missing files or launchers. "
+                "Use the repair command to rerun the rootless installer for that item."
+            )
+        else:
+            answer = (
+                "I do not see a failed recent install or unhealthy receipt. "
+                "Run the wizard step again if you want to refresh an installed component."
+            )
+    else:
+        commands = [action["command"] for action in actions[:3]]
+        next_title = actions[0]["title"] if actions else "Open the setup wizard"
+        answer = (
+            f"Best next step: {next_title}. "
+            f"{report['summary']}. Receipts tracked: {receipts.get('count', 0)}."
+ ) + + if not commands and actions: + commands = [actions[0]["command"]] + + return { + "question": question, + "answer": _persona_wrap(answer), + "focus": focus, + "commands": commands, + "observations": { + "ready": report["ready"], + "issue_count": report.get("issue_count", 0), + "receipt_count": receipts.get("count", 0), + "unhealthy_receipts": receipts.get("unhealthy", 0), + "catalog_source": report.get("catalog", {}).get("source"), + "recent_problem": failed_job, + }, + "actions": actions[:5], } diff --git a/nvh/integrations/smoke_tests.py b/nvh/integrations/smoke_tests.py new file mode 100644 index 0000000..237595d --- /dev/null +++ b/nvh/integrations/smoke_tests.py @@ -0,0 +1,133 @@ +"""Lightweight smoke tests for rootless nvHive apps.""" + +from __future__ import annotations + +import socket +from dataclasses import asdict, dataclass +from pathlib import Path +from typing import Any + +from nvh.integrations.comfyui import detect_comfyui +from nvh.integrations.storage import storage_status +from nvh.integrations.studio_packs import catalog_with_status + + +@dataclass(frozen=True) +class SmokeTest: + id: str + title: str + status: str + summary: str + detail: str = "" + action_id: str | None = None + + def as_dict(self) -> dict[str, Any]: + return asdict(self) + + +def _port_open(port: int, host: str = "127.0.0.1") -> bool: + try: + with socket.create_connection((host, port), timeout=0.5): + return True + except OSError: + return False + + +def _pack_installed(pack_id: str, packs: list[dict[str, Any]]) -> bool: + for pack in packs: + if pack.get("id") == pack_id: + return bool(pack.get("status", {}).get("installed")) + return False + + +def _pack_details(pack_id: str, packs: list[dict[str, Any]]) -> dict[str, Any]: + for pack in packs: + if pack.get("id") == pack_id: + return pack.get("status", {}).get("details", {}) + return {} + + +def smoke_test_report(home_dir: str | None = None) -> dict[str, Any]: + """Return non-destructive app health checks.""" + 
storage = storage_status(home_dir=home_dir)
+    packs = catalog_with_status().get("packs", [])
+    comfy = detect_comfyui()
+    # Evaluate each probe once so status and summary cannot disagree.
+    storage_ready = storage.ok and storage.configured_by != "default"
+    env_file_exists = Path(storage.env_file).exists()
+    ollama_up = _port_open(11434)
+    agent_lab_installed = _pack_installed("agent-lab", packs)
+    openclaw_installed = _pack_installed("openclaw-agent", packs)
+    tests = [
+        SmokeTest(
+            id="storage",
+            title="Persistent storage",
+            status="pass" if storage_ready else "warn",
+            summary="NVH_HOME is ready" if storage.ok else "NVH_HOME needs attention",
+            detail=str(storage.layout.home),
+            action_id=None if storage_ready else "storage",
+        ),
+        SmokeTest(
+            id="env-file",
+            title="Session env file",
+            status="pass" if env_file_exists else "warn",
+            summary="Shell activation file exists" if env_file_exists else "Shell activation file is missing",
+            detail=str(storage.env_file),
+            action_id="repair-workspace",
+        ),
+        SmokeTest(
+            id="ollama",
+            title="Ollama local model server",
+            status="pass" if ollama_up else "warn",
+            summary="Ollama is responding" if ollama_up else "Ollama is not responding yet",
+            detail="http://127.0.0.1:11434",
+            action_id="rootless-ollama",
+        ),
+        SmokeTest(
+            id="agent-lab",
+            title="Local Agent Lab",
+            status="pass" if agent_lab_installed else "warn",
+            summary="Local agent helper pack is installed" if agent_lab_installed else "Local Agent Lab is not installed",
+            action_id="agent-lab",
+        ),
+        SmokeTest(
+            id="claw-agents",
+            title="Claw agent options",
+            status="pass" if openclaw_installed else "warn",
+            summary=(
+                "OpenClaw is installed"
+                if openclaw_installed
+                else "OpenClaw can be installed; NemoClaw requires Docker/OpenShell access"
+            ),
+            detail=str(_pack_details("nemoclaw-sandbox", packs).get("blocked_reason", "")),
+            action_id="claw-agents",
+        ),
+        SmokeTest(
+            id="comfyui",
+            title="ComfyUI workspace",
+            status="pass" if comfy.get("running") else "warn" if comfy.get("installed") else "skip",
+            summary="ComfyUI is running" if comfy.get("running") else "ComfyUI is installed but
not running" if comfy.get("installed") else "ComfyUI is optional and not installed", + detail=str(comfy.get("app_dir", "")), + action_id="comfyui", + ), + SmokeTest( + id="comfyui-examples", + title="ComfyUI starter examples", + status="pass" if comfy.get("examples_installed") else "warn" if comfy.get("installed") else "skip", + summary="Starter examples are installed" if comfy.get("examples_installed") else "Starter examples can be repaired" if comfy.get("installed") else "Install ComfyUI first", + detail=str(comfy.get("examples_dir", "")), + action_id="comfyui-examples", + ), + SmokeTest( + id="blender", + title="Blender creative tools", + status="pass" if _pack_installed("blender-creative", packs) else "skip", + summary="Blender pack is installed" if _pack_installed("blender-creative", packs) else "Blender is optional and not installed", + action_id="creative-tools", + ), + ] + failed = sum(1 for test in tests if test.status == "fail") + warnings = sum(1 for test in tests if test.status == "warn") + passed = sum(1 for test in tests if test.status == "pass") + return { + "summary": f"{passed} passed, {warnings} warning(s), {failed} failed", + "ready": failed == 0, + "passed": passed, + "warnings": warnings, + "failed": failed, + "tests": [test.as_dict() for test in tests], + } diff --git a/nvh/integrations/storage.py b/nvh/integrations/storage.py index a57eb46..c2cbed3 100644 --- a/nvh/integrations/storage.py +++ b/nvh/integrations/storage.py @@ -76,6 +76,8 @@ class StorageStatus: configured_by: str exists: bool writable: bool + write_probe_ok: bool + write_probe_error: str free_gb: float | None total_gb: float | None min_free_gb: float @@ -178,6 +180,23 @@ def _disk_usage_gb(path: Path) -> tuple[float | None, float | None]: return round(usage.free / gb, 1), round(usage.total / gb, 1) +def _write_probe(path: Path) -> tuple[bool, str]: + probe_file = path / f".nvh-write-probe-{os.getpid()}" + try: + with probe_file.open("wb") as handle: + handle.write(b"nvhive 
storage probe\n") + handle.flush() + os.fsync(handle.fileno()) + probe_file.unlink(missing_ok=True) + return True, "" + except Exception as exc: + try: + probe_file.unlink(missing_ok=True) + except Exception: + pass + return False, str(exc) + + def _looks_ephemeral(path: Path) -> bool: parts = {part.lower() for part in path.parts} return bool(parts.intersection({"tmp", "temp", "run", "var", "cache"})) @@ -255,6 +274,8 @@ def storage_status( probe = home if exists else _nearest_existing_parent(home) free_gb, total_gb = _disk_usage_gb(probe) writable = False + write_probe_ok = False + write_probe_error = "" warnings: list[str] = [] try: @@ -262,17 +283,19 @@ def storage_status( writable = os.access(parent, os.W_OK) except Exception: writable = False + write_probe_ok, write_probe_error = _write_probe(home if exists else probe) + writable = write_probe_ok if configured_by == "default": warnings.append( - "NVH_HOME is not set. On ephemeral cloud desktops, point it at the mounted persistent volume." + "NVH_HOME is not set. On ephemeral cloud desktops, use the persistent block-backed home/data volume; ~/.nvh is only safe when $HOME itself is that volume." 
) if _looks_ephemeral(home): warnings.append("Selected storage path looks ephemeral; use a mounted persistent directory instead.") if free_gb is not None and free_gb < min_free_gb: warnings.append(f"Only {free_gb} GB free; recommended minimum is {min_free_gb:.0f} GB.") if not writable: - warnings.append("Selected storage path is not writable by this user.") + warnings.append("Selected storage path failed a real write/fsync/delete probe.") ok = writable and (free_gb is None or free_gb >= min_free_gb) return StorageStatus( @@ -280,6 +303,8 @@ def storage_status( configured_by=configured_by, exists=exists, writable=writable, + write_probe_ok=write_probe_ok, + write_probe_error=write_probe_error, free_gb=free_gb, total_gb=total_gb, min_free_gb=min_free_gb, diff --git a/nvh/integrations/studio_packs.py b/nvh/integrations/studio_packs.py index a7590b0..f2b98fd 100644 --- a/nvh/integrations/studio_packs.py +++ b/nvh/integrations/studio_packs.py @@ -2,7 +2,8 @@ Packs are intentionally user-space only: files, launchers, models, and caches go under ``NVH_HOME``. The installer never calls sudo, apt, -dnf, pacman, systemctl, or Docker. +dnf, pacman, or systemctl. Container-backed packs only run when a provider +already exposes Docker without sudo. 
""" from __future__ import annotations @@ -11,6 +12,7 @@ import json import os import platform +import re import shutil import socket import stat @@ -19,12 +21,14 @@ import tarfile import tempfile import time +import zipfile from collections.abc import AsyncIterator from dataclasses import asdict, dataclass from pathlib import Path from typing import Any from nvh.integrations.storage import storage_layout +from nvh.utils.gpu import detect_gpus OLLAMA_PORT = 11434 BLENDER_VERSION = "4.5.4" @@ -33,6 +37,33 @@ "https://download.blender.org/release/Blender4.5/" f"blender-{BLENDER_VERSION}-linux-x64.tar.xz" ) +NODE_MAJOR_VERSION = "22" +NODE_MIN_VERSION = (22, 16, 0) +NPM_MIN_VERSION = (10, 0, 0) +OPENCLAW_PACKAGE = "openclaw@latest" +OPENCLAW_DOC_URL = "https://openclawdoc.com/docs/getting-started/installation/" +NEMOCLAW_INSTALL_URL = "https://www.nvidia.com/nemoclaw.sh" +NEMOCLAW_DOC_URL = "https://docs.nvidia.com/nemoclaw/latest/get-started/quickstart.html" +NEMOCLAW_PACKAGE = "nemoclaw@latest" +NVIDIA_OMNI_BLOG_URL = ( + "https://blogs.nvidia.com/blog/nemotron-3-nano-omni-multimodal-ai-agents/" + "?nvid=nv-int-csfg-551280" +) +NVIDIA_OMNI_TECH_BLOG_URL = ( + "https://developer.nvidia.com/blog/" + "nvidia-nemotron-3-nano-omni-powers-multimodal-agent-reasoning-in-a-single-efficient-open-model" +) +NVIDIA_OMNI_HF_URL = ( + "https://huggingface.co/nvidia/" + "Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16" +) +NVIDIA_BUILD_URL = "https://build.nvidia.com/" +GODOT_RELEASE_API = "https://api.github.com/repos/godotengine/godot/releases/latest" +GODOT_DOC_URL = "https://docs.godotengine.org/en/stable/" +ACE_STEP_REPO_URL = "https://github.com/ACE-Step/ACE-Step-1.5.git" +ACE_STEP_DOC_URL = "https://github.com/ACE-Step/ACE-Step-1.5/blob/main/docs/en/INSTALL.md" +AUDACITY_RELEASE_API = "https://api.github.com/repos/audacity/audacity/releases/latest" +LMMS_RELEASE_API = "https://api.github.com/repos/LMMS/lmms/releases/latest" @dataclass(frozen=True) @@ -171,7 +202,7 @@ class 
StudioModel: estimated_disk_gb=4.5, priority=70, capabilities=["vision", "image Q&A", "desktop screenshots"], - why_recommended="Adds local image understanding for screenshots and classroom media.", + why_recommended="Adds local image understanding for screenshots and creative media.", source_url="https://ollama.com/library/llava", license_note="Ollama library terms apply.", ), @@ -283,7 +314,7 @@ class StudioModel: tagline="LangGraph, CrewAI, AutoGen, tools, and notebooks", description=( "Creates a dedicated Python environment for local agents, tool calling, " - "search helpers, and classroom automation experiments." + "search helpers, and student automation experiments." ), recommended_vram_gb=0, estimated_disk_gb=2.5, @@ -314,6 +345,88 @@ class StudioModel: "This pack gives the local AI agent layer a ready Python home.", ], ), + StudioPack( + id="nvidia-omni-agent", + title="NVIDIA Omni Agent", + category="agents", + tagline="Optional multimodal Nemotron 3 Nano Omni upgrade for AI Starter", + description=( + "Adds an NVIDIA Omni Agent workspace that routes first to NVIDIA NIM/build.nvidia.com " + "and only recommends local Nemotron 3 Nano Omni weights when GPU VRAM and persistent " + "storage are large enough." 
+ ), + recommended_vram_gb=24, + estimated_disk_gb=0.2, + install_kind="scaffold", + no_root=True, + models=[ + "nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16", + "nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-FP8", + "nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4", + ], + python_packages=[], + comfy_nodes=[], + launchers=["nvhive-omni-agent"], + source_urls=[ + NVIDIA_OMNI_BLOG_URL, + NVIDIA_OMNI_TECH_BLOG_URL, + NVIDIA_OMNI_HF_URL, + NVIDIA_BUILD_URL, + ], + notes=[ + "AI Starter installs this as a lightweight guide and launcher, not a default model download.", + "Use NVIDIA NIM/build.nvidia.com first on smaller student VMs.", + "Local BF16 weights are roughly 61.5 GB; FP8 is roughly 32.8 GB; NVFP4 is roughly 20.9 GB.", + "nvWizard should require persistent storage headroom before recommending local weights.", + ], + ), + StudioPack( + id="openclaw-agent", + title="OpenClaw Agent Workspace", + category="claw", + tagline="Self-hosted agent platform with local model support", + description=( + "Installs OpenClaw into a persistent user-space Node workspace, adds a " + "launcher, and keeps agent state under NVH_HOME/studio instead of the base OS." + ), + recommended_vram_gb=0, + estimated_disk_gb=2.0, + install_kind="openclaw_agent", + no_root=True, + models=[], + python_packages=[], + comfy_nodes=[], + launchers=["nvhive-openclaw"], + source_urls=[OPENCLAW_DOC_URL, "https://openclaw.ai/install.sh"], + notes=[ + "Requires Node.js 22.16+ and npm 10+; nvHive can install a rootless Node runtime on Linux.", + "Use with local Ollama models or a configured cloud provider.", + ], + ), + StudioPack( + id="nemoclaw-sandbox", + title="NVIDIA NemoClaw Sandbox", + category="claw", + tagline="OpenClaw inside NVIDIA OpenShell guardrails", + description=( + "Adds NVIDIA NemoClaw as the guarded OpenClaw path when the host exposes " + "a Docker runtime that works without sudo. The wizard blocks this pack on " + "locked-down sessions that cannot run containers." 
+ ), + recommended_vram_gb=0, + estimated_disk_gb=40.0, + install_kind="nemoclaw_sandbox", + no_root=True, + models=[], + python_packages=[], + comfy_nodes=[], + launchers=["nvhive-nemoclaw"], + source_urls=[NEMOCLAW_DOC_URL, NEMOCLAW_INSTALL_URL], + notes=[ + "NemoClaw is alpha software and requires a usable Docker/OpenShell path.", + "Recommended only when the cloud image grants rootless Docker or docker group access.", + ], + ), StudioPack( id="comfyui-power-nodes", title="ComfyUI Power Nodes", @@ -355,7 +468,7 @@ class StudioModel: tagline="Pygame, Panda3D, assets, and modding helpers", description=( "Creates a no-root Python game development environment for AI-assisted " - "prototypes, texture generation workflows, and classroom game projects." + "prototypes, texture generation workflows, and personal game projects." ), recommended_vram_gb=0, estimated_disk_gb=2.0, @@ -408,6 +521,104 @@ class StudioModel: "The helper creates structure and docs; it does not bypass anti-cheat or DRM.", ], ), + StudioPack( + id="godot-engine", + title="Godot Engine", + category="game", + tagline="Open-source game engine as a rootless app", + description=( + "Downloads the latest official Godot Linux x86_64 release into NVH_HOME/apps, " + "adds a persistent launcher, and creates a project folder beside the rest of the lab." 
+ ), + recommended_vram_gb=2, + estimated_disk_gb=0.4, + install_kind="godot_app", + no_root=True, + models=[], + python_packages=[], + comfy_nodes=[], + launchers=["nvhive-godot"], + source_urls=[GODOT_RELEASE_API, GODOT_DOC_URL], + notes=[ + "Uses the official GitHub release asset selected at install time.", + "Godot projects stay under persistent storage and can use Blender or ComfyUI assets.", + ], + ), + StudioPack( + id="unity-hub-helper", + title="Unity Hub Helper", + category="game", + tagline="Persistent Unity workspace and account handoff", + description=( + "Creates a rootless Unity workspace with launcher notes for Unity Hub AppImage/manual " + "installs. The wizard keeps the storage and cache paths ready, while Unity handles sign-in." + ), + recommended_vram_gb=6, + estimated_disk_gb=12.0, + install_kind="scaffold", + no_root=True, + models=[], + python_packages=[], + comfy_nodes=[], + launchers=["nvhive-unity-hub"], + source_urls=["https://unity.com/download"], + notes=[ + "Unity requires a Unity account and license acceptance.", + "Use the helper to keep projects and downloaded editors on the persistent block volume.", + ], + ), + StudioPack( + id="unreal-engine-helper", + title="Unreal Engine Helper", + category="game", + tagline="Epic/GitHub prep for a large rootless UE workspace", + description=( + "Creates the persistent Unreal workspace, explains the Epic-to-GitHub account link, " + "and prepares folders for source builds or provider-supplied Unreal installs." 
+ ), + recommended_vram_gb=8, + estimated_disk_gb=150.0, + install_kind="scaffold", + no_root=True, + models=[], + python_packages=[], + comfy_nodes=[], + launchers=["nvhive-unreal-helper"], + source_urls=[ + "https://www.unrealengine.com/en-US/download", + "https://www.unrealengine.com/en-US/ue-on-github", + ], + notes=[ + "Unreal access requires an Epic account and linked GitHub account.", + "Large Unreal source/editor builds can exceed 150 GB; nvWizard should reserve the block volume first.", + ], + ), + StudioPack( + id="github-login-helper", + title="GitHub Connect", + category="connector", + tagline="Simple GitHub login helper for cloning and PR work", + description=( + "Adds a rootless GitHub login workspace and launcher that uses GitHub CLI when present " + "or a GITHUB_TOKEN fallback for cloud images without system package access." + ), + recommended_vram_gb=0, + estimated_disk_gb=0.1, + install_kind="scaffold", + no_root=True, + models=[], + python_packages=[], + comfy_nodes=[], + launchers=["nvhive-github-login"], + source_urls=[ + "https://cli.github.com/manual/gh_auth_login", + "https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens", + ], + notes=[ + "The helper never stores a password. Prefer GitHub CLI browser login or a fine-grained token.", + "Public repositories can still clone over HTTPS without login.", + ], + ), StudioPack( id="blender-creative", title="Blender Creative Studio", @@ -435,25 +646,138 @@ class StudioModel: "Cycles GPU rendering still depends on the NVIDIA driver exposed by the cloud image.", ], ), + StudioPack( + id="ace-step-music", + title="ACE-Step Music Generator", + category="music", + tagline="Local AI songs, loops, lyrics, and remixes", + description=( + "Clones ACE-Step 1.5 into persistent storage, prepares a rootless uv " + "environment, and adds a launcher for the local Gradio music studio." 
+ ), + recommended_vram_gb=6, + estimated_disk_gb=12.0, + install_kind="ace_step_music", + no_root=True, + models=[], + python_packages=[], + comfy_nodes=[], + launchers=["nvhive-ace-step"], + source_urls=[ACE_STEP_REPO_URL, ACE_STEP_DOC_URL], + notes=[ + "ACE-Step models download on first launch and can use several additional GB.", + "Runs from the persistent block volume; no apt, sudo, or system Python edits.", + ], + ), + StudioPack( + id="music-producer-lab", + title="Music Producer AI Lab", + category="music", + tagline="Stem splitting, transcription, audio generation, and notebooks", + description=( + "Creates a rootless Python audio lab for GPU-backed source separation, " + "lyrics/transcription, prompt-to-audio experiments, and batch audio tools." + ), + recommended_vram_gb=8, + estimated_disk_gb=8.0, + install_kind="python_venv", + no_root=True, + models=[], + python_packages=[ + "demucs", + "whisperx", + "faster-whisper", + "stable-audio-tools", + "audio-separator", + "librosa", + "soundfile", + "gradio", + "huggingface_hub", + "jupyterlab", + ], + comfy_nodes=[], + launchers=["nvhive-music-lab"], + source_urls=[ + "https://github.com/m-bain/whisperX", + "https://github.com/Stability-AI/stable-audio-tools", + "https://docs.pytorch.org/audio/stable/tutorials/hybrid_demucs_tutorial.html", + ], + notes=[ + "CUDA acceleration depends on the PyTorch wheels that match the host driver.", + "Use this for remixing and cleanup; use ACE-Step for full music generation.", + ], + ), + StudioPack( + id="music-daw-helper", + title="Rootless DAW Helper", + category="music", + tagline="Audacity and LMMS AppImages plus DAW workspace", + description=( + "Downloads official Audacity and LMMS AppImages when available, " + "then creates a persistent music production workspace with launch helpers." 
+ ), + recommended_vram_gb=0, + estimated_disk_gb=1.0, + install_kind="scaffold", + no_root=True, + models=[], + python_packages=[], + comfy_nodes=[], + launchers=["nvhive-music-studio"], + source_urls=[ + AUDACITY_RELEASE_API, + LMMS_RELEASE_API, + "https://support.audacityteam.org/basics/downloading-and-installing-audacity", + "https://lmms.io/download", + "https://www.reaper.fm/download.php", + "https://musescore.org/en/download", + ], + notes=[ + "Desktop apps remain in user space and can use AppImage extract-and-run when FUSE is unavailable.", + "Commercial DAWs may require account login or license acceptance outside nvHive.", + ], + ), ] PACK_BUNDLES: dict[str, list[str]] = { - "starter": ["rootless-ollama", "llm-starter", "agent-lab", "comfyui-power-nodes", "game-dev-lab"], + "starter": [ + "rootless-ollama", + "llm-starter", + "agent-lab", + "nvidia-omni-agent", + "comfyui-power-nodes", + "game-dev-lab", + "github-login-helper", + ], "llms": ["rootless-ollama", "llm-starter", "llm-coder-reasoner"], - "agents": ["agent-lab"], + "agents": ["agent-lab", "nvidia-omni-agent", "openclaw-agent", "github-login-helper"], + "claw": ["openclaw-agent", "nemoclaw-sandbox"], + "omni": ["nvidia-omni-agent"], "comfy": ["comfyui-power-nodes"], - "game": ["game-dev-lab", "game-mod-helper"], - "creative": ["blender-creative", "game-dev-lab", "game-mod-helper"], + "connectors": ["github-login-helper"], + "music": ["ace-step-music", "music-producer-lab", "music-daw-helper", "github-login-helper"], + "game": ["game-dev-lab", "game-mod-helper", "godot-engine", "unity-hub-helper", "unreal-engine-helper", "github-login-helper"], + "creative": ["blender-creative", "game-dev-lab", "game-mod-helper", "godot-engine"], "all": [ "rootless-ollama", "llm-starter", "llm-coder-reasoner", "agent-lab", + "nvidia-omni-agent", + "openclaw-agent", + "nemoclaw-sandbox", "comfyui-power-nodes", "game-dev-lab", "game-mod-helper", + "godot-engine", + "unity-hub-helper", + "unreal-engine-helper", + 
"github-login-helper", "blender-creative", + "ace-step-music", + "music-producer-lab", + "music-daw-helper", ], } @@ -516,6 +840,269 @@ def _blender_binary() -> Path: return _blender_app_dir() / "blender" +def _godot_root() -> Path: + return storage_layout().apps_dir / "godot" + + +def _godot_current_file() -> Path: + return _godot_root() / "current.json" + + +def _godot_binary_from_state() -> Path | None: + state_file = _godot_current_file() + if not state_file.exists(): + return None + try: + state = json.loads(state_file.read_text(encoding="utf-8")) + except Exception: + return None + binary = state.get("binary") + if not isinstance(binary, str): + return None + path = Path(binary) + return path if path.exists() else None + + +def _ace_step_root() -> Path: + return _pack_root("ace-step-music") + + +def _ace_step_app_dir() -> Path: + return _ace_step_root() / "ACE-Step-1.5" + + +def _ace_step_uv_venv_python() -> Path: + venv = _ace_step_root() / "uv-venv" + if os.name == "nt": + return venv / "Scripts" / "python.exe" + return venv / "bin" / "python" + + +def _ace_step_uv_binary() -> Path: + venv = _ace_step_root() / "uv-venv" + if os.name == "nt": + return venv / "Scripts" / "uv.exe" + return venv / "bin" / "uv" + + +def _node_runtime_root() -> Path: + return storage_layout().runtime_dir / "node" + + +def _fnm_root() -> Path: + return storage_layout().runtime_dir / "fnm" + + +def _openclaw_workspace() -> Path: + return _pack_root("openclaw-agent") / "workspace" + + +def _openclaw_prefix() -> Path: + return _pack_root("openclaw-agent") / "node" + + +def _openclaw_binary() -> Path: + suffix = ".cmd" if os.name == "nt" else "" + return _openclaw_prefix() / "bin" / f"openclaw{suffix}" + + +def _nemoclaw_workspace() -> Path: + return _pack_root("nemoclaw-sandbox") / "workspace" + + +def _nemoclaw_prefix() -> Path: + return _pack_root("nemoclaw-sandbox") / "node" + + +def _nemoclaw_binary_from_env(env: dict[str, str] | None = None) -> str: + suffix = ".cmd" if os.name 
== "nt" else "" + candidates = [ + _nemoclaw_prefix() / "bin" / f"nemoclaw{suffix}", + _local_bin() / "nemoclaw", + _pack_root("nemoclaw-sandbox") / "home" / ".local" / "bin" / "nemoclaw", + _pack_root("nemoclaw-sandbox") / "home" / ".npm-global" / "bin" / "nemoclaw", + ] + for candidate in candidates: + if candidate.exists(): + return str(candidate) + found = shutil.which("nemoclaw", path=env.get("PATH") if env else None) + return found or "" + + +def _parse_semver(value: str | None) -> tuple[int, int, int] | None: + if not value: + return None + match = re.search(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", value) + if not match: + return None + return tuple(int(part or 0) for part in match.groups()) # type: ignore[return-value] + + +def _semver_at_least(value: tuple[int, int, int] | None, minimum: tuple[int, int, int]) -> bool: + return bool(value and value >= minimum) + + +def _run_capture(cmd: list[str], *, env: dict[str, str] | None = None, timeout: float = 8.0) -> str: + try: + result = subprocess.run( + cmd, + capture_output=True, + text=True, + timeout=timeout, + env=env, + ) + except Exception: + return "" + return ((result.stdout or result.stderr) or "").strip().splitlines()[0] if (result.stdout or result.stderr) else "" + + +def _find_rootless_node_bin() -> Path | None: + root = _fnm_root() / "node-versions" + if not root.exists(): + return None + installs = sorted(root.glob(f"v{NODE_MAJOR_VERSION}.*/installation/bin"), reverse=True) + for install in installs: + if (install / "node").exists() and (install / "npm").exists(): + return install + return None + + +def _node_env(extra: dict[str, str] | None = None) -> dict[str, str]: + env = os.environ.copy() + env.update(storage_layout().env()) + rootless_bin = _find_rootless_node_bin() + path_parts = [ + str(_local_bin()), + str(_openclaw_prefix() / "bin"), + str(_nemoclaw_prefix() / "bin"), + str(_pack_root("nemoclaw-sandbox") / "home" / ".local" / "bin"), + str(_pack_root("nemoclaw-sandbox") / "home" / ".npm-global" 
/ "bin"), + ] + if rootless_bin: + path_parts.insert(0, str(rootless_bin)) + env["PATH"] = os.pathsep.join(path_parts + [env.get("PATH", "")]) + env["NPM_CONFIG_PREFIX"] = str(_openclaw_prefix()) + env["OPENCLAW_HOME"] = str(_openclaw_workspace()) + env["NEMOCLAW_WORKSPACE"] = str(_nemoclaw_workspace()) + if extra: + env.update(extra) + return env + + +def _node_runtime_status(env: dict[str, str] | None = None) -> dict[str, Any]: + env = env or _node_env() + node = shutil.which("node", path=env.get("PATH")) + npm = shutil.which("npm", path=env.get("PATH")) + node_text = _run_capture([node, "--version"], env=env) if node else "" + npm_text = _run_capture([npm, "--version"], env=env) if npm else "" + node_version = _parse_semver(node_text) + npm_version = _parse_semver(npm_text) + node_ok = _semver_at_least(node_version, NODE_MIN_VERSION) + npm_ok = _semver_at_least(npm_version, NPM_MIN_VERSION) + can_auto_install = ( + platform.system() == "Linux" + and bool(shutil.which("bash")) + and bool(shutil.which("curl")) + ) + return { + "node": node or "", + "npm": npm or "", + "node_version": node_text, + "npm_version": npm_text, + "node_ok": node_ok, + "npm_ok": npm_ok, + "ready": node_ok and npm_ok, + "can_auto_install": can_auto_install, + "minimum_node": ".".join(str(part) for part in NODE_MIN_VERSION), + "minimum_npm": ".".join(str(part) for part in NPM_MIN_VERSION), + } + + +def _docker_status() -> dict[str, Any]: + docker = shutil.which("docker") + if not docker: + return { + "binary": "", + "ready": False, + "detail": "Docker was not found on PATH.", + "rootless_hint": "NemoClaw needs Docker or a provider-enabled rootless container runtime.", + } + try: + result = subprocess.run( + [docker, "info"], + capture_output=True, + text=True, + timeout=10, + ) + except Exception as exc: + return { + "binary": docker, + "ready": False, + "detail": f"Docker could not be checked: {exc}", + "rootless_hint": "Ask the provider to enable rootless Docker or docker group access.", 
+ } + if result.returncode == 0: + return { + "binary": docker, + "ready": True, + "detail": "Docker daemon is reachable without sudo.", + "rootless_hint": "", + } + detail = (result.stderr or result.stdout or "Docker daemon is not reachable.").strip().splitlines()[0] + return { + "binary": docker, + "ready": False, + "detail": detail, + "rootless_hint": "NemoClaw is blocked until Docker works without sudo in this session.", + } + + +def _prepare_node_runtime() -> tuple[dict[str, str], dict[str, Any]]: + env = _node_env() + status = _node_runtime_status(env) + if status["ready"]: + return env, status + if not status["can_auto_install"]: + raise RuntimeError( + "OpenClaw needs Node.js 22.16+ and npm 10+. This host cannot auto-install " + "the rootless Node runtime because Linux, bash, and curl are not all available." + ) + + fnm_dir = _fnm_root() + fnm_dir.mkdir(parents=True, exist_ok=True) + install_env = os.environ.copy() + install_env.update(storage_layout().env()) + install_env["FNM_DIR"] = str(fnm_dir) + install_env["NODE_VERSION"] = NODE_MAJOR_VERSION + subprocess.run( + ["bash", "-lc", "curl -fsSL https://fnm.vercel.app/install | bash -s -- --skip-shell"], + check=True, + timeout=180, + env=install_env, + ) + + fnm = fnm_dir / "fnm" + if not fnm.exists(): + fnm = fnm_dir / "fnm.exe" + if not fnm.exists(): + raise RuntimeError("Rootless Node install finished, but fnm was not found.") + subprocess.run( + [str(fnm), "install", NODE_MAJOR_VERSION], + check=True, + timeout=300, + env=install_env, + ) + + env = _node_env() + status = _node_runtime_status(env) + if not status["ready"]: + raise RuntimeError( + f"Node runtime is still not ready. Node={status['node_version'] or 'missing'} " + f"npm={status['npm_version'] or 'missing'}." 
+ ) + return env, status + + def _find_pack(pack_id: str) -> StudioPack: for pack in STUDIO_PACKS: if pack.id == pack_id: @@ -552,11 +1139,22 @@ def catalog_as_dicts() -> list[dict[str, Any]]: return [asdict(pack) for pack in STUDIO_PACKS] +def model_catalog_as_dicts() -> list[dict[str, Any]]: + return [asdict(model) for model in STUDIO_MODELS] + + def bundles_as_dict() -> dict[str, list[str]]: return {key: list(value) for key, value in PACK_BUNDLES.items()} def _detect_vram_gb() -> int: + try: + gpus = detect_gpus() + except Exception: + gpus = [] + if gpus: + return int(sum(gpu.vram_mb for gpu in gpus) // 1024) + nvidia_smi = shutil.which("nvidia-smi") if not nvidia_smi: return 0 @@ -729,6 +1327,50 @@ def pack_status(pack: StudioPack) -> dict[str, Any]: elif pack.install_kind == "python_venv": installed = _venv_python(pack.id).exists() and marker is not None details["venv"] = str(_venv_python(pack.id).parent.parent) + elif pack.install_kind == "ace_step_music": + app_dir = _ace_step_app_dir() + uv_binary = _ace_step_uv_binary() + installed = app_dir.exists() and uv_binary.exists() and marker is not None + details["app_dir"] = str(app_dir) + details["uv"] = str(uv_binary) + details["launcher"] = str(_local_bin() / "nvhive-ace-step") + details["installable"] = platform.system().lower() == "linux" and shutil.which("git") is not None + if platform.system().lower() != "linux": + details["blocked_reason"] = "ACE-Step music pack targets Linux cloud desktops." + elif not details["installable"]: + details["blocked_reason"] = "ACE-Step needs git to clone the official repository into persistent storage." 
+    elif pack.install_kind == "openclaw_agent":
+        node = _node_runtime_status()
+        binary = _openclaw_binary()
+        installed = binary.exists() or marker is not None
+        installable = bool(node["ready"] or node["can_auto_install"])
+        details.update(node)
+        details["binary"] = str(binary)
+        details["workspace"] = str(_openclaw_workspace())
+        details["installable"] = installable
+        if not installable:
+            details["blocked_reason"] = "OpenClaw needs Node.js 22.16+ and npm 10+, or a Linux host where nvHive can install Node rootlessly."
+    elif pack.install_kind == "nemoclaw_sandbox":
+        node = _node_runtime_status()
+        docker = _docker_status()
+        binary = _nemoclaw_binary_from_env(_node_env())
+        installed = bool(binary) or marker is not None
+        installable = bool(docker["ready"] and (node["ready"] or node["can_auto_install"]))
+        details.update({
+            "node": node,
+            "docker": docker,
+            "binary": binary,
+            "workspace": str(_nemoclaw_workspace()),
+            "installable": installable,
+            "alpha": True,
+            "estimated_min_disk_gb": 20,
+            "estimated_recommended_disk_gb": 40,
+            "recommended_ram_gb": 16,
+        })
+        if not docker["ready"]:
+            details["blocked_reason"] = "NemoClaw needs Docker/OpenShell access that works without sudo; use OpenClaw or ask the provider to enable rootless Docker."
+        elif not installable:
+            details["blocked_reason"] = "NemoClaw needs Node.js 22.16+ and npm 10+."
     elif pack.install_kind == "comfy_nodes":
         custom_nodes = _comfyui_app_dir() / "custom_nodes"
         missing_nodes = [node.name for node in pack.comfy_nodes if not (custom_nodes / node.name).exists()]
@@ -738,12 +1380,48 @@ def pack_status(pack: StudioPack) -> dict[str, Any]:
     elif pack.install_kind == "scaffold":
         installed = marker is not None
         details["workspace"] = str(_pack_root(pack.id))
+        if pack.id == "nvidia-omni-agent":
+            vram_gb = _detect_vram_gb()
+            layout = storage_layout()
+            min_local_gb = 70.0
+            free_gb = None
+            try:
+                usage = shutil.disk_usage(layout.home)
+                free_gb = round(usage.free / (1024**3), 1)
+            except Exception:
+                pass
+            local_ok = bool(
+                vram_gb >= pack.recommended_vram_gb
+                and free_gb is not None
+                and free_gb >= min_local_gb
+            )
+            details.update({
+                "nim_recommended": True,
+                "local_recommended": local_ok,
+                "detected_vram_gb": vram_gb,
+                "free_gb": free_gb,
+                "min_local_free_gb": min_local_gb,
+                "model_sizes_gb": {"BF16": 61.5, "FP8": 32.8, "NVFP4": 20.9},
+                "recommended_path": "local" if local_ok else "nvidia-nim",
+            })
+        if pack.id == "music-daw-helper":
+            appimages = sorted((_pack_root(pack.id) / "appimages").glob("*.AppImage"))
+            details["appimages"] = [str(path) for path in appimages]
+            details["installable"] = platform.system().lower() == "linux"
+            if not details["installable"]:
+                details["blocked_reason"] = "Audacity and LMMS AppImage setup targets Linux cloud desktops."
     elif pack.install_kind == "blender_app":
         binary = _blender_binary()
         installed = binary.exists() and os.access(binary, os.X_OK)
         details["binary"] = str(binary)
         details["app_dir"] = str(_blender_app_dir())
         details["version"] = BLENDER_VERSION
+    elif pack.install_kind == "godot_app":
+        binary = _godot_binary_from_state()
+        installed = binary is not None and marker is not None
+        details["binary"] = str(binary) if binary else ""
+        details["app_dir"] = str(_godot_root())
+        details["release_api"] = GODOT_RELEASE_API
 
     return {
         "id": pack.id,
@@ -775,6 +1453,7 @@ async def _run_command(
     label: str,
     cwd: Path | None = None,
     env: dict[str, str] | None = None,
+    timeout: float | None = None,
 ) -> AsyncIterator[dict[str, Any]]:
     yield {"event": "step", "status": "running", "message": label, "command": cmd}
     process = await asyncio.create_subprocess_exec(
@@ -799,7 +1478,13 @@ async def _run_command(
         process.kill()
         await process.wait()
         raise
-    return_code = await process.wait()
+    try:
+        return_code = await asyncio.wait_for(process.wait(), timeout=timeout)
+    except TimeoutError:
+        process.kill()
+        await process.wait()
+        yield {"event": "error", "status": "failed", "message": f"{label} timed out"}
+        raise RuntimeError(f"{label} timed out")
     if return_code != 0:
         yield {
             "event": "error",
@@ -823,6 +1508,31 @@ def _write_marker(pack: StudioPack, extra: dict[str, Any] | None = None) -> None:
     if extra:
         marker.update(extra)
     _marker_path(pack.id).write_text(json.dumps(marker, indent=2), encoding="utf-8")
+    try:
+        from nvh.integrations.receipts import write_receipt
+
+        launcher_paths = [str(_local_bin() / launcher) for launcher in pack.launchers]
+        version = str(marker.get("version")) if marker.get("version") else None
+        write_receipt(
+            kind="studio-pack",
+            item_id=pack.id,
+            title=pack.title,
+            install_path=root,
+            version=version,
+            source_urls=pack.source_urls,
+            launchers=launcher_paths,
+            models=pack.models,
+            files=[str(_marker_path(pack.id))],
+            metadata={
+                "category": pack.category,
+                "install_kind": pack.install_kind,
+                "recommended_vram_gb": pack.recommended_vram_gb,
+                "estimated_disk_gb": pack.estimated_disk_gb,
+                "marker": marker,
+            },
+        )
+    except Exception:
+        pass
 
 
 def _write_script(path: Path, content: str) -> None:
@@ -1040,16 +1750,55 @@ def _write_game_lab(pack: StudioPack) -> None:
     _write_script(launcher, content)
 
 
-async def _install_python_venv(pack: StudioPack, force_update: bool) -> AsyncIterator[dict[str, Any]]:
+def _write_music_lab(pack: StudioPack) -> None:
     root = _pack_root(pack.id)
-    venv_python = _venv_python(pack.id)
-    root.mkdir(parents=True, exist_ok=True)
-    env = os.environ.copy()
-    env.update(storage_layout().env())
-    env["PYTHONUTF8"] = "1"
+    for folder in ["inputs", "outputs", "stems", "transcripts", "notebooks"]:
+        (root / folder).mkdir(parents=True, exist_ok=True)
 
-    if force_update and venv_python.exists():
-        yield {"event": "step", "status": "running", "message": "Updating existing Python environment"}
+    sample = root / "notebooks" / "README.md"
+    sample.write_text(
+        """# Music Producer AI Lab
+
+Drop source audio in `inputs/`, then use the launcher to start JupyterLab.
+
+Useful first experiments:
+
+- Split stems with Demucs
+- Transcribe lyrics or vocals with WhisperX
+- Generate short audio textures with Stable Audio tools
+- Batch process files into `outputs/`
+
+Check licenses before publishing generated or transformed audio.
+""",
+        encoding="utf-8",
+    )
+    launcher = _local_bin() / "nvhive-music-lab"
+    content = f"""#!/usr/bin/env bash
+set -euo pipefail
+
+source "{root}/venv/bin/activate"
+cd "{root}"
+if [ "$#" -gt 0 ]; then
+  exec "$@"
+fi
+echo "NVHive Music Producer AI Lab"
+echo "Inputs: {root}/inputs"
+echo "Outputs: {root}/outputs"
+exec jupyter lab --no-browser --ip 127.0.0.1 --port 8891
+"""
+    _write_script(launcher, content)
+
+
+async def _install_python_venv(pack: StudioPack, force_update: bool) -> AsyncIterator[dict[str, Any]]:
+    root = _pack_root(pack.id)
+    venv_python = _venv_python(pack.id)
+    root.mkdir(parents=True, exist_ok=True)
+    env = os.environ.copy()
+    env.update(storage_layout().env())
+    env["PYTHONUTF8"] = "1"
+
+    if force_update and venv_python.exists():
+        yield {"event": "step", "status": "running", "message": "Updating existing Python environment"}
     elif not venv_python.exists():
         async for event in _run_command(
             [sys.executable, "-m", "venv", str(root / "venv")],
@@ -1076,9 +1825,366 @@ async def _install_python_venv(pack: StudioPack, force_update: bool) -> AsyncIte
         _write_agent_launcher(pack)
     if pack.id == "game-dev-lab":
         _write_game_lab(pack)
+    if pack.id == "music-producer-lab":
+        _write_music_lab(pack)
 
     _write_marker(pack, {"packages": pack.python_packages, "venv": str(root / "venv")})
+
+
+def _write_openclaw_launcher() -> Path:
+    root = _pack_root("openclaw-agent")
+    workspace = _openclaw_workspace()
+    launcher = _local_bin() / "nvhive-openclaw"
+    content = f"""#!/usr/bin/env bash
+set -euo pipefail
+
+export NVH_HOME="${{NVH_HOME:-{storage_layout().home}}}"
+export OPENCLAW_HOME="${{OPENCLAW_HOME:-{workspace}}}"
+export NPM_CONFIG_PREFIX="${{NPM_CONFIG_PREFIX:-{_openclaw_prefix()}}}"
+export PATH="{_openclaw_prefix()}/bin:{_local_bin()}:$PATH"
+mkdir -p "$OPENCLAW_HOME" "{root}/logs"
+cd "$OPENCLAW_HOME"
+if [ "$#" -eq 0 ]; then
+  exec openclaw onboard --install-daemon
+fi
+exec openclaw "$@"
+"""
+    _write_script(launcher, content)
+    return launcher
+
+
+def _write_openclaw_readme() -> None:
+    root = _pack_root("openclaw-agent")
+    root.mkdir(parents=True, exist_ok=True)
+    (root / "README.md").write_text(
+        f"""# OpenClaw Agent Workspace
+
+OpenClaw is installed in this rootless nvHive pack:
+
+`{root}`
+
+Launch the guided OpenClaw onboarding:
+
+```bash
+nvhive-openclaw
+```
+
+Advanced overrides:
+
+```bash
+nvhive-openclaw --help
+nvhive-openclaw tui
+```
+
+The wizard keeps OpenClaw state in `{_openclaw_workspace()}` and can route to
+local Ollama models or configured cloud model providers.
+""",
+        encoding="utf-8",
+    )
+
+
+async def _install_openclaw_agent(pack: StudioPack, force_update: bool) -> AsyncIterator[dict[str, Any]]:
+    if os.name == "nt":
+        yield {"event": "error", "status": "failed", "message": "OpenClaw rootless pack currently targets Linux/WSL sessions."}
+        return
+
+    root = _pack_root(pack.id)
+    root.mkdir(parents=True, exist_ok=True)
+    _openclaw_workspace().mkdir(parents=True, exist_ok=True)
+
+    if _openclaw_binary().exists() and not force_update:
+        launcher = _write_openclaw_launcher()
+        _write_openclaw_readme()
+        _write_marker(pack, {"binary": str(_openclaw_binary()), "launcher": str(launcher), "workspace": str(_openclaw_workspace())})
+        yield {"event": "step", "status": "complete", "message": "OpenClaw already installed"}
+        return
+
+    yield {"event": "step", "status": "running", "message": "Checking Node.js 22.16+ and npm 10+ for OpenClaw"}
+    try:
+        env, node_status = await asyncio.to_thread(_prepare_node_runtime)
+    except Exception as exc:
+        yield {"event": "error", "status": "failed", "message": str(exc)}
+        return
+    npm = shutil.which("npm", path=env.get("PATH"))
+    if not npm:
+        yield {"event": "error", "status": "failed", "message": "npm is unavailable after Node runtime setup."}
+        return
+
+    _openclaw_prefix().mkdir(parents=True, exist_ok=True)
+    async for event in _run_command(
+        [npm, "install", "--prefix", str(_openclaw_prefix()), OPENCLAW_PACKAGE],
+        label="Install OpenClaw package",
+        env=env,
+    ):
+        yield event
+
+    launcher = _write_openclaw_launcher()
+    _write_openclaw_readme()
+    _write_marker(pack, {
+        "binary": str(_openclaw_binary()),
+        "launcher": str(launcher),
+        "workspace": str(_openclaw_workspace()),
+        "node": node_status,
+    })
+    yield {
+        "event": "complete",
+        "status": "complete",
+        "message": "OpenClaw installed. Launch nvhive-openclaw to onboard the agent.",
+        "launcher": str(launcher),
+    }
+
+
+def _write_nemoclaw_launcher() -> Path:
+    root = _pack_root("nemoclaw-sandbox")
+    workspace = _nemoclaw_workspace()
+    launcher = _local_bin() / "nvhive-nemoclaw"
+    content = f"""#!/usr/bin/env bash
+set -euo pipefail
+
+export NVH_HOME="${{NVH_HOME:-{storage_layout().home}}}"
+export NEMOCLAW_WORKSPACE="${{NEMOCLAW_WORKSPACE:-{workspace}}}"
+export NPM_CONFIG_PREFIX="${{NPM_CONFIG_PREFIX:-{_nemoclaw_prefix()}}}"
+export PATH="{_nemoclaw_prefix()}/bin:{_local_bin()}:$PATH"
+mkdir -p "$NEMOCLAW_WORKSPACE" "{root}/logs"
+cd "$NEMOCLAW_WORKSPACE"
+if [ "$#" -eq 0 ]; then
+  exec nemoclaw onboard
+fi
+exec nemoclaw "$@"
+"""
+    _write_script(launcher, content)
+    return launcher
+
+
+def _write_nemoclaw_readme(docker: dict[str, Any]) -> None:
+    root = _pack_root("nemoclaw-sandbox")
+    root.mkdir(parents=True, exist_ok=True)
+    (root / "README.md").write_text(
+        f"""# NVIDIA NemoClaw Sandbox
+
+NemoClaw is the guarded OpenClaw path. It uses NVIDIA OpenShell and requires a
+Docker runtime that works without sudo in this Linux session.
+
+Docker check:
+
+`{docker.get("detail", "not checked")}`
+
+Launch onboarding:
+
+```bash
+nvhive-nemoclaw
+```
+
+Advanced overrides:
+
+```bash
+nvhive-nemoclaw --help
+nvhive-nemoclaw status
+```
+
+Keep the sandbox workspace on the persistent mount:
+
+`{_nemoclaw_workspace()}`
+""",
+        encoding="utf-8",
+    )
+
+
+async def _install_nemoclaw_sandbox(pack: StudioPack, force_update: bool) -> AsyncIterator[dict[str, Any]]:
+    if os.name == "nt":
+        yield {"event": "error", "status": "failed", "message": "NemoClaw requires a Linux, macOS, or WSL2 container runtime; nvHive only enables this pack on Linux sessions."}
+        return
+    if platform.system() != "Linux":
+        yield {"event": "error", "status": "failed", "message": "This nvHive pack targets Linux cloud desktops."}
+        return
+
+    docker = _docker_status()
+    if not docker["ready"]:
+        yield {
+            "event": "error",
+            "status": "failed",
+            "message": f"NemoClaw is blocked: {docker['detail']} {docker['rootless_hint']}",
+            "details": docker,
+        }
+        return
+
+    root = _pack_root(pack.id)
+    root.mkdir(parents=True, exist_ok=True)
+    _nemoclaw_workspace().mkdir(parents=True, exist_ok=True)
+
+    current_env = _node_env({"NPM_CONFIG_PREFIX": str(_nemoclaw_prefix())})
+    existing_binary = _nemoclaw_binary_from_env(current_env)
+    if existing_binary and not force_update:
+        launcher = _write_nemoclaw_launcher()
+        _write_nemoclaw_readme(docker)
+        _write_marker(pack, {"binary": existing_binary, "launcher": str(launcher), "workspace": str(_nemoclaw_workspace()), "docker": docker})
+        yield {"event": "step", "status": "complete", "message": "NemoClaw CLI already installed"}
+        return
+
+    yield {"event": "step", "status": "running", "message": "Checking Node.js 22.16+ and npm 10+ for NemoClaw"}
+    try:
+        env, node_status = await asyncio.to_thread(_prepare_node_runtime)
+    except Exception as exc:
+        yield {"event": "error", "status": "failed", "message": str(exc)}
+        return
+    env = _node_env({"NPM_CONFIG_PREFIX": str(_nemoclaw_prefix())})
+    npm = shutil.which("npm", path=env.get("PATH"))
+    if not npm:
+        yield {"event": "error", "status": "failed", "message": "npm is unavailable after Node runtime setup."}
+        return
+
+    _nemoclaw_prefix().mkdir(parents=True, exist_ok=True)
+    async for event in _run_command(
+        [npm, "install", "--prefix", str(_nemoclaw_prefix()), NEMOCLAW_PACKAGE],
+        label="Install NemoClaw CLI",
+        env=env,
+    ):
+        yield event
+
+    binary = _nemoclaw_binary_from_env(env)
+    launcher = _write_nemoclaw_launcher()
+    _write_nemoclaw_readme(docker)
+    _write_marker(pack, {
+        "binary": binary,
+        "launcher": str(launcher),
+        "workspace": str(_nemoclaw_workspace()),
+        "node": node_status,
+        "docker": docker,
+        "onboard_next": "nvhive-nemoclaw",
+    })
+    yield {
+        "event": "complete",
+        "status": "complete",
+        "message": "NemoClaw CLI installed. Launch nvhive-nemoclaw to create the OpenShell sandbox.",
+        "launcher": str(launcher),
+    }
+
+
+def _write_ace_step_launcher() -> Path:
+    root = _ace_step_root()
+    app_dir = _ace_step_app_dir()
+    uv_binary = _ace_step_uv_binary()
+    launcher = _local_bin() / "nvhive-ace-step"
+    content = f"""#!/usr/bin/env bash
+set -euo pipefail
+
+export NVH_HOME="${{NVH_HOME:-{storage_layout().home}}}"
+export HF_HOME="${{HF_HOME:-{storage_layout().models_dir / "huggingface"}}}"
+export TRANSFORMERS_CACHE="${{TRANSFORMERS_CACHE:-$HF_HOME/transformers}}"
+export XDG_CACHE_HOME="${{XDG_CACHE_HOME:-{storage_layout().cache_dir}}}"
+cd "{app_dir}"
+if [ "$#" -gt 0 ]; then
+  exec "{uv_binary}" run "$@"
+fi
+exec "{uv_binary}" run acestep --server-name 127.0.0.1 --port 7865
+"""
+    _write_script(launcher, content)
+    return launcher
+
+
+def _write_ace_step_readme() -> None:
+    root = _ace_step_root()
+    root.mkdir(parents=True, exist_ok=True)
+    (root / "README.md").write_text(
+        f"""# ACE-Step Music Generator
+
+ACE-Step 1.5 is installed under persistent nvHive storage:
+
+`{_ace_step_app_dir()}`
+
+Launch the local music studio:
+
+```bash
+nvhive-ace-step
+```
+
+Then open http://127.0.0.1:7865.
+
+Models download on first launch and are kept on the persistent mount through
+`HF_HOME`. For lower-VRAM GPUs, use ACE-Step's built-in lighter model/offload
+options in the UI.
+""",
+        encoding="utf-8",
+    )
+
+
+async def _install_ace_step_music(pack: StudioPack, force_update: bool) -> AsyncIterator[dict[str, Any]]:
+    if platform.system().lower() != "linux":
+        yield {"event": "error", "status": "failed", "message": "ACE-Step music pack targets Linux cloud desktops."}
+        return
+    git = shutil.which("git")
+    if not git:
+        yield {"event": "error", "status": "failed", "message": "Git is required to install ACE-Step."}
+        return
+
+    root = _ace_step_root()
+    root.mkdir(parents=True, exist_ok=True)
+    app_dir = _ace_step_app_dir()
+    env = os.environ.copy()
+    env.update(storage_layout().env())
+    env["PYTHONUTF8"] = "1"
+    env.setdefault("HF_HOME", str(storage_layout().models_dir / "huggingface"))
+    env.setdefault("XDG_CACHE_HOME", str(storage_layout().cache_dir))
+
+    if not app_dir.exists():
+        async for event in _run_command(
+            [git, "clone", "--depth", "1", ACE_STEP_REPO_URL, str(app_dir)],
+            env=env,
+            label="Clone ACE-Step 1.5",
+        ):
+            yield event
+    elif force_update:
+        async for event in _run_command(
+            [git, "-C", str(app_dir), "pull", "--ff-only"],
+            env=env,
+            label="Update ACE-Step 1.5",
+        ):
+            yield event
+    else:
+        yield {"event": "step", "status": "complete", "message": "ACE-Step repository already present"}
+
+    uv_python = _ace_step_uv_venv_python()
+    if not uv_python.exists():
+        async for event in _run_command(
+            [sys.executable, "-m", "venv", str(root / "uv-venv")],
+            env=env,
+            label="Create ACE-Step uv environment",
+        ):
+            yield event
+
+    async for event in _run_command(
+        [str(uv_python), "-m", "pip", "install", "--upgrade", "pip", "wheel", "setuptools", "uv"],
+        env=env,
+        label="Install rootless uv for ACE-Step",
+    ):
+        yield event
+
+    uv_binary = _ace_step_uv_binary()
+    async for event in _run_command(
+        [str(uv_binary), "sync"],
+        cwd=app_dir,
+        env=env,
+        label="Install ACE-Step dependencies",
+        timeout=1800.0,
+    ):
+        yield event
+
+    launcher = _write_ace_step_launcher()
+    _write_ace_step_readme()
+    _write_marker(pack, {
+        "repo": ACE_STEP_REPO_URL,
+        "app_dir": str(app_dir),
+        "uv": str(uv_binary),
+        "launcher": str(launcher),
+        "models_home": env["HF_HOME"],
+    })
+    yield {
+        "event": "complete",
+        "status": "complete",
+        "message": "ACE-Step music generator installed. Launch nvhive-ace-step.",
+        "launcher": str(launcher),
+    }
+
+
 async def _install_comfy_nodes(pack: StudioPack, force_update: bool) -> AsyncIterator[dict[str, Any]]:
     app_dir = _comfyui_app_dir()
     venv_python = _comfyui_venv_python()
@@ -1167,8 +2273,368 @@ def _write_mod_helper(pack: StudioPack) -> None:
     _write_script(launcher, content)
 
 
+def _write_github_login_helper(pack: StudioPack) -> None:
+    root = _pack_root(pack.id)
+    for folder in ["repos", "tokens", "notes"]:
+        (root / folder).mkdir(parents=True, exist_ok=True)
+    (root / "README.md").write_text(
+        f"""# {pack.title}
+
+This helper keeps GitHub setup rootless and persistent.
+
+Preferred path:
+
+1. Run `nvhive-github-login`.
+2. If GitHub CLI is available, use browser login.
+3. If GitHub CLI is not available, add a fine-grained token as `GITHUB_TOKEN`
+   in your nvHive environment file and relaunch the WebUI.
+
+Public repositories can clone over HTTPS without login. Private repositories,
+pull requests, and Unreal Engine source access need authenticated GitHub.
+""",
+        encoding="utf-8",
+    )
+    launcher = _local_bin() / "nvhive-github-login"
+    token_hint = storage_layout().config_dir / "env"
+    content = f"""#!/usr/bin/env bash
+set -euo pipefail
+
+echo "nvHive GitHub Connect"
+echo "Workspace: {root}"
+
+if [ -n "${{GITHUB_TOKEN:-}}" ]; then
+  echo "GITHUB_TOKEN is present in this shell. GitHub API and private HTTPS clones can use it."
+fi
+
+if command -v gh >/dev/null 2>&1; then
+  if gh auth status >/dev/null 2>&1; then
+    echo "GitHub CLI is already authenticated."
+    gh auth status
+    exit 0
+  fi
+  echo "Starting GitHub browser login with GitHub CLI..."
+  gh auth login --web --git-protocol https
+  gh auth setup-git || true
+  gh auth status
+  exit 0
+fi
+
+cat <<'EOF'
+GitHub CLI is not installed on this image.
+
+Rootless fallback:
+1. Create a fine-grained GitHub token in the browser.
+2. Add this line to the nvHive env file:
+   export GITHUB_TOKEN=your_token_here
+3. Relaunch nvHive.
+
+Env file:
+EOF
+echo "{token_hint}"
+"""
+    _write_script(launcher, content)
+
+
+def _write_game_engine_helper(pack: StudioPack) -> None:
+    root = _pack_root(pack.id)
+    for folder in ["projects", "downloads", "notes", "assets"]:
+        (root / folder).mkdir(parents=True, exist_ok=True)
+
+    if pack.id == "unity-hub-helper":
+        body = """# Unity Hub Helper
+
+Unity requires a Unity account and license acceptance. nvHive prepares the
+persistent storage layout, then students can keep Unity editors and projects on
+the block volume instead of the read-only OS disk.
+
+Suggested paths:
+
+- Projects: `projects/`
+- Downloads: `downloads/`
+- Shared AI assets: `assets/`
+
+Open https://unity.com/download if the provider image does not already include
+Unity Hub.
+"""
+        launcher_name = "nvhive-unity-hub"
+        launcher_message = "Unity Hub requires account sign-in. Use this workspace for downloads and projects."
+    else:
+        body = """# Unreal Engine Helper
+
+Unreal Engine is large and account-gated. nvHive prepares persistent storage,
+GitHub/Epic notes, and asset folders so the setup can survive cloud session
+rebuilds.
+
+Checklist:
+
+1. Connect GitHub with `nvhive-github-login`.
+2. Link Epic and GitHub accounts for Unreal source access.
+3. Keep source trees, derived data cache, and projects on the block volume.
+
+Unreal source/editor installs can exceed 150 GB.
+"""
+        launcher_name = "nvhive-unreal-helper"
+        launcher_message = "Unreal setup needs Epic/GitHub access and plenty of persistent storage."
+
+    (root / "README.md").write_text(body, encoding="utf-8")
+    launcher = _local_bin() / launcher_name
+    content = f"""#!/usr/bin/env bash
+set -euo pipefail
+
+cd "{root}"
+echo "{launcher_message}"
+echo "Workspace: {root}"
+find . -maxdepth 2 -type d | sort
+"""
+    _write_script(launcher, content)
+
+
+def _write_music_daw_helper(pack: StudioPack) -> None:
+    root = _pack_root(pack.id)
+    for folder in ["projects", "appimages", "plugins", "samples", "exports", "notes"]:
+        (root / folder).mkdir(parents=True, exist_ok=True)
+    readme = root / "README.md"
+    readme.write_text(
+        f"""# {pack.title}
+
+This is the persistent desktop-audio workspace for nvHive.
+
+nvHive attempts to download these rootless apps during install:
+
+- Audacity AppImage for waveform editing, vocal cleanup, and quick exports
+- LMMS AppImage for beat making and MIDI sketches
+
+Manual optional additions:
+
+- REAPER Linux tarball for a compact professional DAW trial path
+- MuseScore AppImage for notation and sheet music
+
+Downloaded or manually added apps live in `appimages/`. The launcher lists them
+and can run one by name without writing to the base OS. If FUSE is unavailable
+on a locked-down VM, the launcher uses AppImage extract-and-run.
+
+AI tools live beside this pack:
+
+- `nvhive-ace-step` for full local AI music generation
+- `nvhive-music-lab` for stems, transcription, and batch processing
+
+Always check model and sample licenses before publishing music.
+""",
+        encoding="utf-8",
+    )
+    launcher = _local_bin() / "nvhive-music-studio"
+    content = f"""#!/usr/bin/env bash
+set -euo pipefail
+
+cd "{root}"
+mkdir -p appimages projects plugins samples exports notes
+if [ "$#" -eq 0 ]; then
+  echo "NVHive Music Producer Studio"
+  echo "Workspace: {root}"
+  echo
+  echo "Audacity and LMMS AppImages are downloaded during install when official assets are available."
+  echo "You can also drop REAPER or MuseScore AppImages/tarballs into appimages/."
+  echo "Run: nvhive-music-studio <app-name>"
+  echo
+  find appimages -maxdepth 1 -type f | sort || true
+  exit 0
+fi
+
+target="$(find appimages -maxdepth 1 -type f -iname "*$1*" | head -n 1)"
+if [ -z "$target" ]; then
+  echo "No matching AppImage/tarball found in {root}/appimages"
+  exit 1
+fi
+chmod +x "$target" || true
+if [[ "$target" == *.AppImage ]]; then
+  export APPIMAGE_EXTRACT_AND_RUN=1
+fi
+exec "$target"
+"""
+    _write_script(launcher, content)
+
+
+def _write_omni_agent_helper(pack: StudioPack) -> None:
+    root = _pack_root(pack.id)
+    root.mkdir(parents=True, exist_ok=True)
+    readme = f"""# {pack.title}
+
+{pack.description}
+
+## Default Path
+
+Use NVIDIA NIM / build.nvidia.com first. This keeps AI Starter fast, rootless,
+and usable on smaller student VMs while still exposing the new multimodal
+Nemotron 3 Nano Omni workflow.
+
+## Local Path
+
+Only try a local download when nvWizard reports enough persistent storage and
+GPU headroom. The current published footprints are approximately:
+
+- BF16: 61.5 GB
+- FP8: 32.8 GB
+- NVFP4: 20.9 GB
+
+The local path should be treated as an advanced option for large NVIDIA GPUs or
+cloud instances with ample block storage.
+
+## Use Cases
+
+- Document intelligence and OCR
+- Screenshot / GUI reasoning
+- Audio-video reasoning
+- Multimodal agent perception before OpenClaw or NemoClaw actions
+
+## Sources
+
+- {NVIDIA_OMNI_BLOG_URL}
+- {NVIDIA_OMNI_TECH_BLOG_URL}
+- {NVIDIA_OMNI_HF_URL}
+- {NVIDIA_BUILD_URL}
+"""
+    (root / "README.md").write_text(readme, encoding="utf-8")
+    plan = {
+        "name": "nvidia-omni-agent",
+        "default_path": "nvidia-nim",
+        "local_guardrails": {
+            "min_free_gb": 70,
+            "recommended_vram_gb": pack.recommended_vram_gb,
+            "model_sizes_gb": {"BF16": 61.5, "FP8": 32.8, "NVFP4": 20.9},
+        },
+        "models": pack.models,
+        "sources": pack.source_urls,
+    }
+    (root / "omni-agent-plan.json").write_text(json.dumps(plan, indent=2), encoding="utf-8")
+    launcher = _local_bin() / "nvhive-omni-agent"
+    content = f"""#!/usr/bin/env bash
+set -euo pipefail
+
+ROOT="{root}"
+echo "NVHive NVIDIA Omni Agent"
+echo "Workspace: $ROOT"
+echo
+echo "Default: use NVIDIA NIM / build.nvidia.com for Nemotron 3 Nano Omni."
+echo "Local weights are advanced and gated by nvWizard storage + GPU checks."
+echo
+echo "Open: $ROOT/README.md"
+"""
+    _write_script(launcher, content)
+
+
+def _write_scaffold_pack(pack: StudioPack) -> None:
+    if pack.id == "game-mod-helper":
+        _write_mod_helper(pack)
+    elif pack.id == "github-login-helper":
+        _write_github_login_helper(pack)
+    elif pack.id in {"unity-hub-helper", "unreal-engine-helper"}:
+        _write_game_engine_helper(pack)
+    elif pack.id == "music-daw-helper":
+        _write_music_daw_helper(pack)
+    elif pack.id == "nvidia-omni-agent":
+        _write_omni_agent_helper(pack)
+    else:
+        root = _pack_root(pack.id)
+        root.mkdir(parents=True, exist_ok=True)
+        (root / "README.md").write_text(
+            f"# {pack.title}\n\n{pack.description}\n",
+            encoding="utf-8",
+        )
+
+
+async def _download_appimage_asset(
+    client: Any,
+    *,
+    api_url: str,
+    app_name: str,
+    downloads: Path,
+    force_update: bool,
+    required_tokens: tuple[str, ...] = (),
+    preferred_tokens: tuple[str, ...] = (),
+) -> tuple[Path, dict[str, Any]]:
+    release_response = await client.get(api_url)
+    release_response.raise_for_status()
+    release = release_response.json()
+    asset = _select_appimage_asset(
+        release,
+        app_name=app_name,
+        required_tokens=required_tokens,
+        preferred_tokens=preferred_tokens,
+    )
+    asset_name = str(asset["name"])
+    asset_url = str(asset["browser_download_url"])
+    release_tag = str(release.get("tag_name") or "latest")
+    target = downloads / asset_name
+    if target.exists() and not force_update:
+        target.chmod(target.stat().st_mode | stat.S_IXUSR)
+        return target, {"asset": asset_name, "version": release_tag, "url": asset_url, "cached": True}
+
+    async with client.stream("GET", asset_url) as response:
+        response.raise_for_status()
+        with target.open("wb") as handle:
+            async for chunk in response.aiter_bytes():
+                if chunk:
+                    handle.write(chunk)
+    target.chmod(target.stat().st_mode | stat.S_IXUSR)
+    return target, {"asset": asset_name, "version": release_tag, "url": asset_url, "cached": False}
+
+
+async def _install_music_daw_helper(pack: StudioPack, force_update: bool) -> AsyncIterator[dict[str, Any]]:
+    if platform.system().lower() != "linux":
+        yield {"event": "error", "status": "failed", "message": "Music DAW AppImage setup targets Linux cloud desktops."}
+        return
+
+    _write_music_daw_helper(pack)
+    root = _pack_root(pack.id)
+    downloads = root / "appimages"
+    downloads.mkdir(parents=True, exist_ok=True)
+    downloaded: list[dict[str, Any]] = []
+    download_errors: list[str] = []
+
+    try:
+        import httpx
+    except Exception as exc:
+        yield {"event": "error", "status": "failed", "message": f"Could not import httpx for AppImage downloads: {exc}"}
+        return
+
+    async with httpx.AsyncClient(follow_redirects=True, timeout=600) as client:
+        for app_name, api_url, required, preferred in [
+            ("Audacity", AUDACITY_RELEASE_API, ("linux",), ("22.04", "x64")),
+            ("LMMS", LMMS_RELEASE_API, tuple(), ("linux", "x86_64", "x64")),
+        ]:
+            try:
+                yield {"event": "step", "status": "running", "message": f"Checking latest official {app_name} AppImage"}
+                target, metadata = await _download_appimage_asset(
+                    client,
+                    api_url=api_url,
+                    app_name=app_name,
+                    downloads=downloads,
+                    force_update=force_update,
+                    required_tokens=required,
+                    preferred_tokens=preferred,
+                )
+                downloaded.append({"name": app_name, "path": str(target), **metadata})
+                verb = "Using cached" if metadata.get("cached") else "Downloaded"
+                yield {"event": "step", "status": "complete", "message": f"{verb} {app_name} AppImage", "path": str(target)}
+            except Exception as exc:
+                message = f"{app_name} AppImage could not be auto-downloaded: {exc}"
+                download_errors.append(message)
+                yield {"event": "warning", "status": "warning", "message": message}
+
+    if not downloaded:
+        yield {"event": "error", "status": "failed", "message": "No Audacity or LMMS AppImages were downloaded; retry later or use manual AppImage overrides."}
+        return
+
+    _write_marker(pack, {"workspace": str(root), "appimages": downloaded, "download_errors": download_errors, "force_update": force_update})
+    yield {"event": "step", "status": "complete", "message": f"{pack.title} workspace ready"}
+
+
 async def _install_scaffold(pack: StudioPack, force_update: bool) -> AsyncIterator[dict[str, Any]]:
-    _write_mod_helper(pack)
+    if pack.id == "music-daw-helper":
+        async for event in _install_music_daw_helper(pack, force_update):
+            yield event
+        return
+    _write_scaffold_pack(pack)
     _write_marker(pack, {"workspace": str(_pack_root(pack.id)), "force_update": force_update})
     yield {"event": "step", "status": "complete", "message": f"{pack.title} workspace ready"}
@@ -1187,6 +2653,110 @@ def _safe_extract_tar(archive: Path, target: Path) -> None:
         tar.extractall(target, members=members)
 
 
+def _safe_extract_zip(archive: Path, target: Path) -> None:
+    """Extract a zip archive while refusing path traversal entries."""
+    target.mkdir(parents=True, exist_ok=True)
+    target_resolved = target.resolve()
+    with zipfile.ZipFile(archive) as zf:
+        for member in zf.infolist():
+            destination = (target / member.filename).resolve()
+            try:
+                destination.relative_to(target_resolved)
+            except ValueError as exc:
+                raise RuntimeError(f"Archive member escapes target directory: {member.filename}") from exc
+        zf.extractall(target)
+
+
+def _select_godot_asset(release: dict[str, Any]) -> dict[str, Any]:
+    assets = release.get("assets")
+    if not isinstance(assets, list):
+        raise RuntimeError("Godot release metadata did not include assets")
+
+    for asset in assets:
+        if not isinstance(asset, dict):
+            continue
+        name = str(asset.get("name") or "")
+        lower_name = name.lower()
+        url = str(asset.get("browser_download_url") or "")
+        if (
+            url
+            and lower_name.endswith(".zip")
+            and "linux" in lower_name
+            and "x86_64" in lower_name
+            and "mono" not in lower_name
+            and "server" not in lower_name
+            and "template" not in lower_name
+        ):
+            return asset
+    raise RuntimeError("No official Godot Linux x86_64 zip asset was found in the latest release")
+
+
+def _select_appimage_asset(
+    release: dict[str, Any],
+    *,
+    app_name: str,
+    required_tokens: tuple[str, ...] = (),
+    preferred_tokens: tuple[str, ...] = (),
+) -> dict[str, Any]:
+    assets = release.get("assets")
+    if not isinstance(assets, list):
+        raise RuntimeError(f"{app_name} release metadata did not include assets")
+
+    candidates: list[tuple[int, dict[str, Any]]] = []
+    for asset in assets:
+        if not isinstance(asset, dict):
+            continue
+        name = str(asset.get("name") or "")
+        lower_name = name.lower()
+        url = str(asset.get("browser_download_url") or "")
+        if not url or not lower_name.endswith(".appimage"):
+            continue
+        if any(token not in lower_name for token in required_tokens):
+            continue
+        if any(token in lower_name for token in ("aarch64", "arm64", "armv7")):
+            continue
+        score = sum(10 for token in preferred_tokens if token in lower_name)
+        score += sum(1 for token in ("x64", "x86_64", "amd64", "linux") if token in lower_name)
+        candidates.append((score, asset))
+
+    if not candidates:
+        raise RuntimeError(f"No Linux x64 AppImage asset was found in the latest {app_name} release")
+    return sorted(candidates, key=lambda item: item[0], reverse=True)[0][1]
+
+
+def _find_godot_binary(root: Path) -> Path | None:
+    candidates: list[Path] = []
+    for path in root.rglob("*"):
+        if not path.is_file():
+            continue
+        name = path.name.lower()
+        if name.startswith("godot") and "linux" in name and "x86_64" in name and not name.endswith(".zip"):
+            candidates.append(path)
+    if candidates:
+        return sorted(candidates, key=lambda item: len(str(item)))[0]
+    return None
+
+
+def _write_godot_launcher(binary: Path) -> Path:
+    layout = storage_layout()
+    root = _godot_root()
+    projects = root / "projects"
+    settings = layout.config_dir / "godot"
+    projects.mkdir(parents=True, exist_ok=True)
+    settings.mkdir(parents=True, exist_ok=True)
+    launcher = _local_bin() / "nvhive-godot"
+    content = f"""#!/usr/bin/env bash
+set -euo pipefail
+
+export GODOT_EDITOR_SETTINGS_DIR="${{GODOT_EDITOR_SETTINGS_DIR:-{settings}}}"
+mkdir -p "$GODOT_EDITOR_SETTINGS_DIR" "{projects}"
+cd "{projects}"
+exec "{binary}" "$@"
+"""
+    _write_script(launcher, content)
+    return launcher
+
+
 def _write_blender_launcher() -> Path:
     layout = storage_layout()
     binary = _blender_binary()
@@ -1208,6 +2778,152 @@ def _write_blender_launcher() -> Path:
     return launcher
 
 
+def _write_model_receipt(model: StudioModel) -> None:
+    try:
+        from nvh.integrations.receipts import write_receipt
+
+        layout = storage_layout()
+        write_receipt(
+            kind="studio-model",
+            item_id=model.id,
+            title=model.title,
+            install_path=layout.ollama_models_dir,
+            source_urls=[model.source_url],
+            models=[model.install_target],
+            metadata={
+                "provider": model.provider,
+                "install_target": model.install_target,
+                "category": model.category,
+                "recommended_vram_gb": model.recommended_vram_gb,
+                "estimated_disk_gb": model.estimated_disk_gb,
+                "capabilities": model.capabilities,
+                "license_note": model.license_note,
+            },
+        )
+    except Exception:
+        pass
+
+
+async def _install_godot_app(pack: StudioPack, force_update: bool) -> AsyncIterator[dict[str, Any]]:
+    if platform.system().lower() != "linux" or platform.machine().lower() not in {"x86_64", "amd64"}:
+        yield {
+            "event": "error",
+            "status": "failed",
+            "message": "The Godot rootless pack currently supports Linux x86_64 desktops.",
+        }
+        return
+
+    root = _godot_root()
+    downloads = root / "downloads"
+    downloads.mkdir(parents=True, exist_ok=True)
+
+    existing = _godot_binary_from_state()
+    if existing and not force_update:
+        launcher = _write_godot_launcher(existing)
+        _write_marker(pack, {"binary": str(existing), "launcher": str(launcher), "force_update": force_update})
+        yield {"event": "step", "status": "complete", "message": "Godot already installed"}
+        return
+
+    yield {"event": "step", "status": "running", "message": "Checking latest official Godot release"}
+    import httpx
+
+    async with httpx.AsyncClient(follow_redirects=True, timeout=600) as client:
+        release_response = await client.get(GODOT_RELEASE_API)
+        release_response.raise_for_status()
+        release = release_response.json()
+        asset = _select_godot_asset(release)
+        asset_name = str(asset["name"])
+        asset_url = str(asset["browser_download_url"])
+        release_tag = str(release.get("tag_name") or "latest")
+        safe_tag = re.sub(r"[^A-Za-z0-9._-]+", "-", release_tag).strip("-") or "latest"
+        app_dir = root / safe_tag
+
+        if app_dir.exists() and not force_update:
+            binary = _find_godot_binary(app_dir)
+            if binary:
+                binary.chmod(binary.stat().st_mode | stat.S_IXUSR)
+                launcher = _write_godot_launcher(binary)
+                _godot_current_file().write_text(
+                    json.dumps({"version": release_tag, "binary": str(binary), "app_dir": str(app_dir)}, indent=2),
+                    encoding="utf-8",
+                )
+                _write_marker(pack, {"binary": str(binary), "launcher": str(launcher), "version": release_tag})
+                yield {"event": "step", "status": "complete", "message": f"Godot {release_tag} already installed"}
+                return
+
+        if app_dir.exists():
+            shutil.rmtree(app_dir)
+
+        archive = downloads / asset_name
+        yield {"event": "step", "status": "running", "message": f"Downloading Godot {release_tag}", "url": asset_url}
+        async with client.stream("GET", asset_url) as response:
+            response.raise_for_status()
+            with archive.open("wb") as fh:
+                async for chunk in response.aiter_bytes():
+                    if chunk:
+                        fh.write(chunk)
+
+        stage = Path(tempfile.mkdtemp(prefix="godot-", dir=str(root)))
+        try:
+            yield {"event": "step", "status": "running", "message": "Extracting Godot archive"}
+            _safe_extract_zip(archive, stage)
+            binary = _find_godot_binary(stage)
+            if not binary:
+                raise RuntimeError("Godot archive did not contain the expected Linux executable")
+            app_dir.parent.mkdir(parents=True, exist_ok=True)
+            shutil.move(str(stage), str(app_dir))
+            final_binary = app_dir / binary.relative_to(stage)
+            final_binary.chmod(final_binary.stat().st_mode | stat.S_IXUSR)
+            launcher = _write_godot_launcher(final_binary)
+        except Exception as exc:
+            if stage.exists():
+                shutil.rmtree(stage, ignore_errors=True)
+            yield {"event": "error", "status": "failed", "message": f"Godot install 
failed: {exc}"} + return + + (root / "README.md").write_text( + f"""# Godot Engine + +Godot {release_tag} is installed without root access at: + +`{final_binary}` + +Launch it with: + +```bash +nvhive-godot +``` + +Projects live in `{root / "projects"}` so game prototypes, Blender exports, and +ComfyUI textures stay on persistent storage. +""", + encoding="utf-8", + ) + _godot_current_file().write_text( + json.dumps( + { + "version": release_tag, + "binary": str(final_binary), + "app_dir": str(app_dir), + "asset": asset_name, + "source_url": asset_url, + }, + indent=2, + ), + encoding="utf-8", + ) + _write_marker( + pack, + { + "binary": str(final_binary), + "launcher": str(launcher), + "version": release_tag, + "asset": asset_name, + }, + ) + yield {"event": "step", "status": "complete", "message": f"Godot {release_tag} installed"} + + async def _install_blender_app(pack: StudioPack, force_update: bool) -> AsyncIterator[dict[str, Any]]: if platform.system() != "Linux" or platform.machine().lower() not in {"x86_64", "amd64"}: yield { @@ -1365,6 +3081,7 @@ async def install_studio_models( model.install_target in installed or model.install_target.split(":")[0] in installed ): + _write_model_receipt(model) yield { "event": "model", "status": "complete", @@ -1378,6 +3095,7 @@ async def install_studio_models( env=_ollama_env(), ): yield {**event, "model_id": model.id} + _write_model_receipt(model) yield { "event": "complete", @@ -1435,12 +3153,24 @@ async def install_studio_packs( elif pack.install_kind == "python_venv": async for event in _install_python_venv(pack, force_update): yield {**event, "pack_id": pack.id} + elif pack.install_kind == "ace_step_music": + async for event in _install_ace_step_music(pack, force_update): + yield {**event, "pack_id": pack.id} + elif pack.install_kind == "openclaw_agent": + async for event in _install_openclaw_agent(pack, force_update): + yield {**event, "pack_id": pack.id} + elif pack.install_kind == "nemoclaw_sandbox": + async for 
event in _install_nemoclaw_sandbox(pack, force_update): + yield {**event, "pack_id": pack.id} elif pack.install_kind == "comfy_nodes": async for event in _install_comfy_nodes(pack, force_update): yield {**event, "pack_id": pack.id} elif pack.install_kind == "scaffold": async for event in _install_scaffold(pack, force_update): yield {**event, "pack_id": pack.id} + elif pack.install_kind == "godot_app": + async for event in _install_godot_app(pack, force_update): + yield {**event, "pack_id": pack.id} elif pack.install_kind == "blender_app": async for event in _install_blender_app(pack, force_update): yield {**event, "pack_id": pack.id} diff --git a/nvh/utils/gpu.py b/nvh/utils/gpu.py index efc6f58..2c878f9 100644 --- a/nvh/utils/gpu.py +++ b/nvh/utils/gpu.py @@ -16,8 +16,11 @@ from __future__ import annotations import re +import shutil import subprocess from dataclasses import dataclass, field +from pathlib import Path +from typing import Any @dataclass @@ -51,31 +54,105 @@ class ModelRecommendation: tier: str # "mini", "small", "full", "multi-gpu" +def _append_gpu_issue( + issues: list[dict[str, Any]] | None, + *, + source: str, + code: str, + message: str, + severity: str = "warning", + detail: str = "", +) -> None: + if issues is None: + return + issues.append( + { + "source": source, + "code": code, + "message": message, + "severity": severity, + "detail": detail[:300], + } + ) + + +def _nvidia_device_files_present() -> bool: + try: + dev = Path("/dev") + return (dev / "nvidiactl").exists() or any(dev.glob("nvidia[0-9]*")) + except Exception: + return False + + +def detect_gpu_status() -> dict[str, Any]: + """Detect NVIDIA GPUs and preserve rootless failure details for nvWizard.""" + issues: list[dict[str, Any]] = [] + gpus = _detect_gpus_pynvml(issues=issues) + source = "pynvml" if gpus else "" + if not gpus: + gpus = _detect_gpus_smi(issues=issues) + if gpus: + source = "nvidia-smi" + + device_files_present = _nvidia_device_files_present() + if gpus: + status = 
"ready" + elif device_files_present: + status = "blocked" + _append_gpu_issue( + issues, + source="linux-devices", + code="devices-present-no-query", + message="NVIDIA device files exist, but nvHive could not query the GPU.", + detail="The base image, driver permissions, or session policy may be blocking NVML and nvidia-smi.", + ) + elif shutil.which("nvidia-smi"): + status = "unavailable" + else: + status = "not-detected" + + return { + "status": status, + "source": source or "none", + "gpus": gpus, + "issues": issues, + "device_files_present": device_files_present, + "nvidia_smi": shutil.which("nvidia-smi") or "", + } + + def detect_gpus() -> list[GPUInfo]: """Detect NVIDIA GPUs. Tries pynvml first (fast, rich data), falls back to nvidia-smi. Returns a list of GPUInfo objects — one per GPU. Returns an empty list if no NVIDIA GPU is found or accessible. """ - # Try pynvml first (direct NVML library — faster and more data) - gpus = _detect_gpus_pynvml() - if gpus: - return gpus - - # Fall back to nvidia-smi subprocess - return _detect_gpus_smi() - + return detect_gpu_status()["gpus"] -def _detect_gpus_pynvml() -> list[GPUInfo]: +def _detect_gpus_pynvml(*, issues: list[dict[str, Any]] | None = None) -> list[GPUInfo]: """Detect GPUs via pynvml (NVML Python bindings).""" try: import pynvml except ImportError: + _append_gpu_issue( + issues, + source="pynvml", + code="module-missing", + message="Python NVML bindings are not installed.", + severity="info", + ) return [] # pynvml not installed — fall back to nvidia-smi try: pynvml.nvmlInit() - except Exception: + except Exception as exc: + _append_gpu_issue( + issues, + source="pynvml", + code="nvml-init-failed", + message="NVML could not initialize in this session.", + detail=str(exc), + ) return [] try: @@ -90,6 +167,15 @@ def _detect_gpus_pynvml() -> list[GPUInfo]: pass device_count = pynvml.nvmlDeviceGetCount() + if device_count == 0: + _append_gpu_issue( + issues, + source="pynvml", + code="no-devices", + 
message="NVML initialized but reported zero NVIDIA GPUs.", + severity="info", + ) + return [] gpus: list[GPUInfo] = [] for i in range(device_count): @@ -196,7 +282,14 @@ def _detect_gpus_pynvml() -> list[GPUInfo]: )) return gpus - except Exception: + except Exception as exc: + _append_gpu_issue( + issues, + source="pynvml", + code="nvml-query-failed", + message="NVML failed while reading GPU details.", + detail=str(exc), + ) return [] finally: try: @@ -205,12 +298,22 @@ def _detect_gpus_pynvml() -> list[GPUInfo]: pass -def _detect_gpus_smi() -> list[GPUInfo]: +def _detect_gpus_smi(*, issues: list[dict[str, Any]] | None = None) -> list[GPUInfo]: """Fallback: detect GPUs via nvidia-smi subprocess.""" + nvidia_smi = shutil.which("nvidia-smi") + if not nvidia_smi: + _append_gpu_issue( + issues, + source="nvidia-smi", + code="binary-missing", + message="nvidia-smi is not on PATH.", + severity="info", + ) + return [] try: result = subprocess.run( [ - "nvidia-smi", + nvidia_smi, "--query-gpu=index,name,memory.total,memory.used,memory.free," "utilization.gpu,driver_version", "--format=csv,noheader,nounits", @@ -219,18 +322,48 @@ def _detect_gpus_smi() -> list[GPUInfo]: text=True, timeout=10, ) - except FileNotFoundError: + except subprocess.TimeoutExpired: + _append_gpu_issue( + issues, + source="nvidia-smi", + code="timeout", + message="nvidia-smi timed out while querying GPUs.", + ) return [] - except Exception: + except Exception as exc: + _append_gpu_issue( + issues, + source="nvidia-smi", + code="command-error", + message="nvidia-smi could not run.", + detail=str(exc), + ) return [] - if result.returncode != 0 or not result.stdout.strip(): + if result.returncode != 0: + _append_gpu_issue( + issues, + source="nvidia-smi", + code="nonzero-exit", + message="nvidia-smi returned an error.", + detail=(result.stderr or result.stdout or "").strip(), + ) + return [] + if not result.stdout.strip(): + _append_gpu_issue( + issues, + source="nvidia-smi", + code="empty-output", + 
message="nvidia-smi returned no GPU rows.", + severity="info", + ) return [] cuda_ver = _get_cuda_version() + compute_caps = _get_compute_capabilities() gpus: list[GPUInfo] = [] - for line in result.stdout.strip().splitlines(): + for row_index, line in enumerate(result.stdout.strip().splitlines()): parts = [p.strip() for p in line.split(",")] if len(parts) < 7: continue @@ -255,6 +388,7 @@ def _detect_gpus_smi() -> list[GPUInfo]: memory_used_mb=memory_used, memory_free_mb=memory_free, index=index, + compute_capability=compute_caps[row_index] if row_index < len(compute_caps) else (0, 0), ) ) except (ValueError, IndexError): @@ -263,6 +397,35 @@ def _detect_gpus_smi() -> list[GPUInfo]: return gpus +def _parse_compute_capability_value(value: str) -> tuple[int, int]: + match = re.search(r"(\d+)(?:\.(\d+))?", value.strip()) + if not match: + return (0, 0) + return (int(match.group(1)), int(match.group(2) or 0)) + + +def _get_compute_capabilities() -> list[tuple[int, int]]: + nvidia_smi = shutil.which("nvidia-smi") + if not nvidia_smi: + return [] + try: + result = subprocess.run( + [nvidia_smi, "--query-gpu=compute_cap", "--format=csv,noheader"], + capture_output=True, + text=True, + timeout=5, + ) + except Exception: + return [] + if result.returncode != 0: + return [] + return [ + cc + for cc in (_parse_compute_capability_value(line) for line in result.stdout.splitlines()) + if cc != (0, 0) + ] + + def _get_cuda_version() -> str: """Return CUDA version string reported by nvidia-smi, or 'unknown'.""" try: @@ -588,7 +751,7 @@ def _recommend_vision_model( return None # Compute capability for arch-aware swap. Use primary GPU. 
- cc = _parse_compute_capability(gpus[0].name) if gpus else (0, 0) + cc = gpu_architecture_info(gpus[0])["compute_capability"] if gpus else (0, 0) turing_or_older = cc < (8, 0) and cc != (0, 0) if total_vram_gb < 12: @@ -762,6 +925,31 @@ class OllamaOptimization: notes: list[str] +def architecture_from_compute_capability(cc: tuple[int, int]) -> str: + if cc >= (10, 0): + return "Blackwell" + if cc >= (9, 0): + return "Hopper" + if cc >= (8, 9): + return "Ada Lovelace" + if cc >= (8, 0): + return "Ampere" + if cc >= (7, 5): + return "Turing" + return "Unknown" + + +def gpu_architecture_info(gpu: GPUInfo) -> dict[str, Any]: + observed = gpu.compute_capability != (0, 0) + cc = gpu.compute_capability if observed else _parse_compute_capability(gpu.name) + return { + "architecture": architecture_from_compute_capability(cc), + "compute_capability": cc, + "compute_capability_source": "nvml-or-smi" if observed else "name-heuristic", + "heuristic": not observed, + } + + def get_ollama_optimizations(gpus: list[GPUInfo] | None = None) -> OllamaOptimization: """Return architecture-aware Ollama settings based on detected GPU. 
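The `architecture_from_compute_capability` ladder above works because Python compares tuples lexicographically, so `(8, 9)` sorts above `(8, 0)` but below `(9, 0)`. A standalone copy of the ladder, useful for checking a capability value by hand:

```python
def arch_from_cc(cc: tuple[int, int]) -> str:
    # Mirrors architecture_from_compute_capability from the diff:
    # lexicographic tuple comparison orders (8, 9) between (8, 0) and (9, 0),
    # so Ada Lovelace must be checked before Ampere.
    if cc >= (10, 0):
        return "Blackwell"
    if cc >= (9, 0):
        return "Hopper"
    if cc >= (8, 9):
        return "Ada Lovelace"
    if cc >= (8, 0):
        return "Ampere"
    if cc >= (7, 5):
        return "Turing"
    return "Unknown"
```

The branch order matters: testing `(8, 0)` first would swallow every Ada Lovelace card as Ampere.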
@@ -788,13 +976,18 @@ def get_ollama_optimizations(gpus: list[GPUInfo] | None = None) -> OllamaOptimiz # Use primary GPU for architecture decisions gpu = gpus[0] total_vram_gb = sum(g.vram_mb for g in gpus) / 1024 - cc = _parse_compute_capability(gpu.name) + arch_info = gpu_architecture_info(gpu) + cc = arch_info["compute_capability"] notes: list[str] = [] + if arch_info["heuristic"]: + notes.append("Compute capability is name-based; confirm after driver/NVML access improves.") # Flash Attention: CC >= 8.0 (Ampere+) flash_attention = cc >= (8, 0) - if cc >= (9, 0): + if cc == (0, 0): + notes.append("Compute capability unknown - using conservative attention settings") + elif cc >= (9, 0): notes.append("Flash Attention 3 available (Hopper+)") elif cc >= (8, 0): notes.append("Flash Attention 2 enabled") @@ -802,7 +995,10 @@ def get_ollama_optimizations(gpus: list[GPUInfo] | None = None) -> OllamaOptimiz notes.append("Flash Attention not supported (Turing) — using standard attention") # Architecture name - if cc >= (10, 0): + if cc == (0, 0): + arch = "Unknown" + notes.append("Architecture could not be confirmed from NVML or the GPU name") + elif cc >= (10, 0): arch = "Blackwell" notes.append("FP4 Tensor Cores available (not yet used by Ollama)") notes.append("GDDR7 provides high memory bandwidth") @@ -901,7 +1097,7 @@ def _parse_compute_capability(gpu_name: str) -> tuple[int, int]: return (7, 5) # Older or unrecognized — assume Ampere as safe default - return (8, 0) + return (0, 0) def get_gpu_summary() -> str: diff --git a/nvh/utils/logging.py b/nvh/utils/logging.py index 68d5b49..98a2a48 100644 --- a/nvh/utils/logging.py +++ b/nvh/utils/logging.py @@ -1,9 +1,41 @@ -"""Structured JSON logging for production deployments.""" +"""Structured logging for production and rootless desktop deployments.""" +import contextvars import json import logging +import os import sys from datetime import UTC, datetime +from pathlib import Path + +_request_id_var: 
contextvars.ContextVar[str | None] = contextvars.ContextVar( + "nvh_request_id", + default=None, +) + + +def set_request_id(request_id: str | None) -> contextvars.Token[str | None]: + """Bind a request id to logs emitted by the current context.""" + return _request_id_var.set(request_id) + + +def reset_request_id(token: contextvars.Token[str | None]) -> None: + """Restore the previous request id context.""" + _request_id_var.reset(token) + + +def get_request_id() -> str | None: + """Return the request id bound to the current context, if any.""" + return _request_id_var.get() + + +class RequestContextFilter(logging.Filter): + """Attach request context fields so formatters never KeyError.""" + + def filter(self, record: logging.LogRecord) -> bool: + if not hasattr(record, "request_id"): + record.request_id = get_request_id() or "" + return True class JSONFormatter(logging.Formatter): @@ -17,29 +49,79 @@ def format(self, record: logging.LogRecord) -> str: if record.exc_info and record.exc_info[0]: log_entry["exception"] = self.formatException(record.exc_info) # Add extra fields - for key in ("request_id", "provider", "model", "latency_ms", "tokens", "cost"): + for key in ( + "request_id", + "error_id", + "method", + "path", + "status_code", + "duration_ms", + "client", + "job_id", + "kind", + "provider", + "model", + "latency_ms", + "tokens", + "cost", + ): if hasattr(record, key): - log_entry[key] = getattr(record, key) + value = getattr(record, key) + if value not in ("", None): + log_entry[key] = value return json.dumps(log_entry) +def _log_file_from_env() -> Path | None: + explicit = os.environ.get("HIVE_LOG_FILE") + if explicit: + return Path(explicit).expanduser() + logs_dir = os.environ.get("NVH_LOGS") + if logs_dir: + return Path(logs_dir).expanduser() / "nvhive.log" + home = os.environ.get("NVH_HOME") + if home: + return Path(home).expanduser() / "logs" / "nvhive.log" + return None + + +def _make_handler(json_format: bool, *, stream: bool) -> logging.Handler: 
+ handler: logging.Handler + if stream: + handler = logging.StreamHandler(sys.stdout) + else: + log_file = _log_file_from_env() + if log_file is None: + raise RuntimeError("No log file configured") + log_file.parent.mkdir(parents=True, exist_ok=True) + handler = logging.FileHandler(log_file, encoding="utf-8") + + handler.addFilter(RequestContextFilter()) + if json_format: + handler.setFormatter(JSONFormatter()) + else: + handler.setFormatter( + logging.Formatter( + "%(asctime)s [%(levelname)s] %(name)s [%(request_id)s]: %(message)s" + ) + ) + return handler + + def setup_logging(level: str = "INFO", json_format: bool = False) -> logging.Logger: """Configure application logging.""" root = logging.getLogger("nvh") root.setLevel(getattr(logging, level.upper(), logging.INFO)) + root.propagate = False - handler = logging.StreamHandler(sys.stdout) - if json_format: - handler.setFormatter(JSONFormatter()) - else: - handler.setFormatter(logging.Formatter( - "%(asctime)s [%(levelname)s] %(name)s: %(message)s" - )) + handlers = [_make_handler(json_format, stream=True)] + try: + handlers.append(_make_handler(json_format, stream=False)) + except Exception: + # File logging is best-effort until NVH_HOME/NVH_LOGS has been activated. 
+ pass # Avoid adding duplicate handlers on repeated calls - if not root.handlers: - root.addHandler(handler) - else: - root.handlers[0] = handler + root.handlers = handlers return root diff --git a/pyproject.toml b/pyproject.toml index cf716da..82d2d06 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta" [project] name = "nvhive" -version = "0.34.0" +version = "0.34.1" description = "NVHive — Multi-LLM orchestration platform with intelligent routing, hive consensus, and auto-agent generation" readme = "README.md" requires-python = ">=3.11" @@ -101,7 +101,7 @@ nvhive-mcp = "nvh.mcp_server:main" include = ["nvh*"] [tool.setuptools.package-data] -nvh = ["config/*.yaml", "workflows/*.yaml"] +nvh = ["catalog/*.json", "config/*.yaml", "workflows/*.yaml"] [tool.ruff] target-version = "py311" diff --git a/start-linux.sh b/start-linux.sh new file mode 100644 index 0000000..ad8b368 --- /dev/null +++ b/start-linux.sh @@ -0,0 +1,168 @@ +#!/usr/bin/env bash +# NVHive Linux GPU desktop launcher. +# Rootless, mount-aware, and safe to run from a GitHub download link. 
+ +set -euo pipefail + +G='\033[0;32m'; Y='\033[1;33m'; D='\033[0;90m'; R='\033[0;31m'; N='\033[0m' +REPO_RAW="${NVH_REPO_RAW:-https://raw.githubusercontent.com/thatcooperguy/nvHive/main}" +LINUX_BINARY_URL="${NVH_BINARY_URL:-https://github.com/thatcooperguy/nvHive/releases/latest/download/nvh-linux-x86_64}" + +free_gb_for_path() { + df -Pk "$1" 2>/dev/null | awk 'NR==2 {printf "%d", $4 / 1048576}' +} + +score_candidate() { + local base="${1%/}" + [ -d "$base" ] || return 1 + [ -w "$base" ] || return 1 + local name home free_gb score + name="$(basename "$base")" + home="$base/nvhive" + case "$name" in + nvh|nvhive|.nvh) home="$base" ;; + esac + score=0 + case "$base" in + "$HOME"|"$HOME/"*) score=$((score - 15)) ;; + /mnt/*|/media/*|/workspace*|/data*|/persistent*|/storage*) score=$((score + 45)) ;; + esac + case "$base" in + *persist*|*Persist*|*workspace*|*Workspace*|*project*|*Project*|*data*|*Data*) score=$((score + 20)) ;; + *tmp*|*cache*|*Cache*) score=$((score - 40)) ;; + esac + free_gb="$(free_gb_for_path "$base")" + free_gb="${free_gb:-0}" + if [ "$free_gb" -ge 100 ]; then + score=$((score + 35)) + elif [ "$free_gb" -ge 50 ]; then + score=$((score + 30)) + elif [ "$free_gb" -ge 20 ]; then + score=$((score + 20)) + elif [ "$free_gb" -lt 10 ]; then + score=$((score - 15)) + fi + if [ -f "$home/nvh-env.sh" ] || [ -d "$home/repo" ] || [ -d "$home/models" ]; then + score=$((score + 30)) + fi + printf '%s|%s\n' "$score" "$home" +} + +detect_home() { + local roots=() + local env_name env_value root child scored score home best_score best_home + if [ -d "$HOME/nvh/repo" ] && [ ! 
-d "$HOME/.nvh/repo" ]; then + printf '%s\n' "$HOME/nvh" + return 0 + fi + for env_name in NVH_MOUNT PERSISTENT_HOME PERSISTENT_DIR PERSISTENT_STORAGE WORKSPACE PROJECTS PROJECT_HOME DATA_DIR; do + env_value="${!env_name:-}" + [ -n "$env_value" ] && roots+=("$env_value") + done + roots+=("/mnt" "/media/${USER:-}" "/workspace" "/data" "/persistent" "/storage") + best_score=-999 + best_home="" + for root in "${roots[@]}"; do + [ -n "$root" ] || continue + if scored="$(score_candidate "$root")"; then + score="${scored%%|*}" + home="${scored#*|}" + if [ "$score" -gt "$best_score" ]; then + best_score="$score" + best_home="$home" + fi + fi + [ -d "$root" ] || continue + for child in "$root"/*; do + [ -d "$child" ] || continue + if scored="$(score_candidate "$child")"; then + score="${scored%%|*}" + home="${scored#*|}" + if [ "$score" -gt "$best_score" ]; then + best_score="$score" + best_home="$home" + fi + fi + done + done + if [ -n "$best_home" ] && [ "$best_score" -ge 55 ]; then + printf '%s\n' "$best_home" + return 0 + fi + return 1 +} + +find_python() { + for py in python3.12 python3.11 python3.10 python3; do + if command -v "$py" >/dev/null 2>&1; then + echo "$py" + return 0 + fi + done + return 1 +} + +install_binary() { + mkdir -p "$NVH_HOME/bin" + echo -e "${Y}Python was not found, so nvHive is using the single-file Linux binary.${N}" + echo -e "${D}Downloading: $LINUX_BINARY_URL${N}" + if command -v curl >/dev/null 2>&1; then + curl -fL "$LINUX_BINARY_URL" -o "$NVH_HOME/bin/nvh" + elif command -v wget >/dev/null 2>&1; then + wget -O "$NVH_HOME/bin/nvh" "$LINUX_BINARY_URL" + else + echo -e "${R}Need curl or wget to download the nvHive binary.${N}" + exit 1 + fi + chmod +x "$NVH_HOME/bin/nvh" +} + +if [ -z "${NVH_HOME:-}" ]; then + if NVH_HOME="$(detect_home)"; then + echo -e "${G}Mount autopilot selected ${NVH_HOME}${N}" + else + NVH_HOME="$HOME/.nvh" + echo -e "${Y}No persistent mount was obvious; using ${NVH_HOME}.${N}" + fi +fi +export NVH_HOME +export 
NVH_BIN="$NVH_HOME/bin" +export PATH="$NVH_BIN:$PATH" + +echo "" +echo -e "${G}NVHive Linux Launch${N}" +echo -e "${D}Persistent home: $NVH_HOME${N}" + +if [ -f "$NVH_HOME/nvh-env.sh" ]; then + # shellcheck disable=SC1091 + source "$NVH_HOME/nvh-env.sh" +fi + +if ! command -v nvh >/dev/null 2>&1; then + if [ -x "$NVH_HOME/venv/bin/nvh" ]; then + export PATH="$NVH_HOME/venv/bin:$PATH" + elif [ "${NVH_USE_BINARY:-0}" = "1" ] || ! find_python >/dev/null; then + install_binary + else + echo -e "${G}Installing nvHive rootlessly into ${NVH_HOME}${N}" + if command -v curl >/dev/null 2>&1; then + curl -fsSL "$REPO_RAW/install.sh" | bash + elif command -v wget >/dev/null 2>&1; then + wget -qO- "$REPO_RAW/install.sh" | bash + else + echo -e "${R}Need curl or wget to install nvHive.${N}" + exit 1 + fi + [ -f "$NVH_HOME/nvh-env.sh" ] && source "$NVH_HOME/nvh-env.sh" + [ -x "$NVH_HOME/venv/bin/nvh" ] && export PATH="$NVH_HOME/venv/bin:$PATH" + fi +fi + +if ! command -v nvh >/dev/null 2>&1; then + echo -e "${R}nvh is still not on PATH after install.${N}" + echo -e "${D}Try: source \"$NVH_HOME/nvh-env.sh\"${N}" + exit 1 +fi + +echo -e "${G}Launching nvHive workstation and WebUI.${N}" +nvh workstation --home-dir "$NVH_HOME" --launch -y diff --git a/tests/test_agents.py b/tests/test_agents.py index 04142a4..5092a92 100644 --- a/tests/test_agents.py +++ b/tests/test_agents.py @@ -95,6 +95,7 @@ def test_list_presets(self): assert "security_review" in presets assert "code_review" in presets assert "product" in presets + assert "product_resilience" in presets assert "data" in presets assert "full_board" in presets @@ -115,6 +116,17 @@ def test_invalid_preset(self): with pytest.raises(ValueError, match="Unknown preset"): get_preset_agents("nonexistent", "test") + def test_product_resilience_preset_includes_underdog_advocate(self): + agents = get_preset_agents( + "product_resilience", + "Make nvHive self-healing for rootless GPU cloud desktop students", + ) + roles = [a.role for a in 
agents] + assert "Underdog Student Advocate" in roles + advocate = next(a for a in agents if a.role == "Underdog Student Advocate") + assert "no root access" in advocate.system_prompt + assert "mounted file volume" in advocate.system_prompt + def test_preset_agents_have_system_prompts(self): agents = get_preset_agents("engineering", "Build a REST API") for agent in agents: diff --git a/tests/test_api.py b/tests/test_api.py index 9095baa..d53f341 100644 --- a/tests/test_api.py +++ b/tests/test_api.py @@ -465,6 +465,26 @@ def test_setup_status(self, test_client: TestClient) -> None: body = resp.json() assert body["status"] == "success" + def test_setup_production_readiness(self, test_client: TestClient) -> None: + """GET /v1/setup/production-readiness returns release gates.""" + resp = test_client.get("/v1/setup/production-readiness") + assert resp.status_code == 200 + body = resp.json() + assert body["status"] == "success" + assert "production_ready" in body["data"] + assert isinstance(body["data"]["gates"], list) + + def test_setup_diagnostics(self, test_client: TestClient) -> None: + """GET /v1/setup/diagnostics returns a redacted support report.""" + resp = test_client.get("/v1/setup/diagnostics", headers={"x-request-id": "test-req"}) + assert resp.status_code == 200 + assert resp.headers["x-request-id"] == "test-req" + body = resp.json() + assert body["status"] == "success" + assert body["data"]["request_id"] == "test-req" + assert body["data"]["report_id"].startswith("diag-") + assert "logs" in body["data"] + def test_conversations_list(self, test_client: TestClient) -> None: """GET /v1/conversations returns the conversation list.""" resp = test_client.get("/v1/conversations") diff --git a/tests/test_boot_preflight.py b/tests/test_boot_preflight.py new file mode 100644 index 0000000..0978b9d --- /dev/null +++ b/tests/test_boot_preflight.py @@ -0,0 +1,130 @@ +"""Tests for boot-time VM image preflight state.""" + +from __future__ import annotations + +from 
nvh.integrations import boot_preflight + + +def _compatibility_report( + *, + kernel: str = "6.8.0", + cuda: str = "12.4", + agent_ready: bool = False, + node_version: str = "v22.16.0", + storage_total_gb: float = 500.0, +) -> dict: + return { + "summary": "ready", + "ready": True, + "issue_count": 0, + "blocked_count": 0, + "rootless_fixable_count": 0, + "recommended_torch_profile": "nvidia-cu121", + "host": { + "distro": "Ubuntu 24.04", + "kernel": kernel, + "machine": "x86_64", + "libc": {"name": "glibc", "version": "2.35"}, + "python": {"version": "3.11.9", "strategy": "python-venv"}, + "gpu": { + "name": "NVIDIA RTX", + "memory_total_mb": "24576", + "driver_version": "570.00", + "cuda_version": cuda, + "compute_capability": "8.9", + "architecture": "Ada Lovelace", + "detection_status": "ready", + }, + "commands": { + "git": "/usr/bin/git", + "curl": "/usr/bin/curl", + "tar": "/usr/bin/tar", + "node": "/usr/bin/node", + "npm": "/usr/bin/npm", + }, + "command_versions": {"node": node_version, "npm": "10.8.0"}, + "display": {"DISPLAY": ":0", "WAYLAND_DISPLAY": ""}, + "storage": { + "configured_by": "argument", + "total_gb": storage_total_gb, + "write_probe_ok": True, + "layout": {"home": "/mnt/nvh"}, + }, + }, + "apps": [ + { + "id": "agent-lab", + "status": "ready" if agent_ready else "fixable", + "recommended_action_id": None if agent_ready else "agent-lab", + "requirements": [], + } + ], + } + + +def test_boot_preflight_captures_baseline_and_agent_helper(tmp_path, monkeypatch) -> None: + monkeypatch.setattr( + boot_preflight, + "compatibility_report", + lambda home_dir=None: _compatibility_report(agent_ready=False), + ) + monkeypatch.setattr(boot_preflight, "mount_autopilot_report", lambda: {"recommended": None}) + monkeypatch.setattr(boot_preflight, "auto_repair_plan", lambda home_dir=None: {"actions": []}) + monkeypatch.setattr(boot_preflight, "run_safe_repairs", lambda home_dir=None: {"completed": [], "plan": {"actions": []}}) + 
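These boot-preflight tests replace each integration probe with a lambda via `monkeypatch.setattr`. When a test needs a different value on each call (as the drift checks do), the pattern is a callable that pops canned reports from a list; the trick in isolation:

```python
# Sequencing trick used by the drift tests: one patched callable returns
# the baseline report on the first call and the drifted report on the
# second, simply by popping successive values from a shared list.
reports = ["baseline-report", "drifted-report"]

def fake_compatibility_report(home_dir=None):
    return reports.pop(0)

first = fake_compatibility_report()
second = fake_compatibility_report()
```

A third call would raise `IndexError`, which doubles as a guard that the code under test queried the probe exactly twice.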
monkeypatch.setattr(boot_preflight, "smoke_test_report", lambda home_dir=None: {"summary": "ok"}) + monkeypatch.setattr(boot_preflight, "model_fit_report", lambda home_dir=None: {"summary": "ok", "recommended_ids": [], "detected_vram_gb": 0}) + + report = boot_preflight.run_boot_preflight(home_dir=tmp_path / "nvh") + + assert report["first_run"] is True + assert report["changed"] is False + assert report["agent_helper"]["mode"] == "offline-deterministic" + assert report["agent_helper"]["recommended_action_id"] == "agent-lab" + + +def test_boot_preflight_detects_image_drift(tmp_path, monkeypatch) -> None: + reports = [ + _compatibility_report(kernel="6.8.0", cuda="12.4", agent_ready=True), + _compatibility_report(kernel="6.10.0", cuda="13.0", agent_ready=True), + ] + monkeypatch.setattr( + boot_preflight, + "compatibility_report", + lambda home_dir=None: reports.pop(0), + ) + monkeypatch.setattr(boot_preflight, "mount_autopilot_report", lambda: {"recommended": None}) + monkeypatch.setattr(boot_preflight, "auto_repair_plan", lambda home_dir=None: {"actions": []}) + monkeypatch.setattr(boot_preflight, "run_safe_repairs", lambda home_dir=None: {"completed": [], "plan": {"actions": []}}) + monkeypatch.setattr(boot_preflight, "smoke_test_report", lambda home_dir=None: {"summary": "ok"}) + monkeypatch.setattr(boot_preflight, "model_fit_report", lambda home_dir=None: {"summary": "ok", "recommended_ids": [], "detected_vram_gb": 0}) + + boot_preflight.run_boot_preflight(home_dir=tmp_path / "nvh") + changed = boot_preflight.run_boot_preflight(home_dir=tmp_path / "nvh") + + change_ids = {change["id"] for change in changed["changes"]} + assert changed["changed"] is True + assert {"kernel", "cuda_version"}.issubset(change_ids) + assert changed["agent_helper"]["mode"] == "local-agent-ready" + + +def test_boot_preflight_detects_runtime_and_storage_drift(tmp_path, monkeypatch) -> None: + reports = [ + _compatibility_report(node_version="v22.16.0", storage_total_gb=500.0, 
agent_ready=True), + _compatibility_report(node_version="v20.11.0", storage_total_gb=200.0, agent_ready=True), + ] + monkeypatch.setattr( + boot_preflight, + "compatibility_report", + lambda home_dir=None: reports.pop(0), + ) + monkeypatch.setattr(boot_preflight, "mount_autopilot_report", lambda: {"recommended": None}) + monkeypatch.setattr(boot_preflight, "auto_repair_plan", lambda home_dir=None: {"actions": []}) + monkeypatch.setattr(boot_preflight, "run_safe_repairs", lambda home_dir=None: {"completed": [], "plan": {"actions": []}}) + monkeypatch.setattr(boot_preflight, "smoke_test_report", lambda home_dir=None: {"summary": "ok"}) + monkeypatch.setattr(boot_preflight, "model_fit_report", lambda home_dir=None: {"summary": "ok", "recommended_ids": [], "detected_vram_gb": 0}) + + boot_preflight.run_boot_preflight(home_dir=tmp_path / "nvh") + changed = boot_preflight.run_boot_preflight(home_dir=tmp_path / "nvh") + + change_ids = {change["id"] for change in changed["changes"]} + assert {"node_version", "storage_total_gb"}.issubset(change_ids) diff --git a/tests/test_compatibility.py b/tests/test_compatibility.py new file mode 100644 index 0000000..94aee8a --- /dev/null +++ b/tests/test_compatibility.py @@ -0,0 +1,141 @@ +"""Tests for nvWizard host/app compatibility checks.""" + +from __future__ import annotations + +from types import SimpleNamespace + +from nvh.integrations import compatibility + + +def test_recommended_torch_profile_tracks_cuda_driver() -> None: + assert compatibility.recommended_torch_profile("13.0") == "nvidia-cu130" + assert compatibility.recommended_torch_profile("12.4") == "nvidia-cu121" + assert compatibility.recommended_torch_profile("11.8") == "cpu" + assert compatibility.recommended_torch_profile("") == "cpu" + + +def test_compatibility_report_marks_rootless_fixable(tmp_path, monkeypatch) -> None: + monkeypatch.setattr(compatibility.sys, "platform", "linux") + monkeypatch.setattr(compatibility.platform, "system", lambda: "Linux") + 
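The compatibility tests stub rich runtime objects with `types.SimpleNamespace`, exposing only the attributes the code under test actually reads (here, an `as_dict()` callable). A minimal illustration of the pattern:

```python
from types import SimpleNamespace

# Stand-in for a storage_status() result: the caller only ever invokes
# .as_dict(), so the stub exposes exactly that attribute and nothing else.
storage = SimpleNamespace(
    as_dict=lambda: {"ok": True, "configured_by": "argument"}
)
info = storage.as_dict()
```

Accessing any attribute the stub does not define raises `AttributeError`, which surfaces hidden coupling between the test and the real object.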
monkeypatch.setattr(compatibility.platform, "machine", lambda: "x86_64") + monkeypatch.setattr(compatibility.platform, "release", lambda: "6.8.0") + monkeypatch.setattr(compatibility.platform, "platform", lambda: "Linux") + monkeypatch.setattr(compatibility.platform, "libc_ver", lambda: ("glibc", "2.35")) + monkeypatch.setattr(compatibility, "_read_os_release", lambda: {"PRETTY_NAME": "Ubuntu 24.04"}) + monkeypatch.setattr( + compatibility, + "_nvidia_smi_query", + lambda: { + "name": "NVIDIA RTX", + "memory_total_mb": "24576", + "driver_version": "570.00", + "cuda_version": "12.4", + }, + ) + monkeypatch.setattr(compatibility, "_which", lambda cmd: f"/usr/bin/{cmd}") + monkeypatch.setattr(compatibility, "_command_version", lambda *_, **__: "ok") + monkeypatch.setattr(compatibility, "_port_open", lambda *_: False) + monkeypatch.setenv("DISPLAY", ":0") + monkeypatch.setattr( + compatibility, + "runtime_status", + lambda: SimpleNamespace( + venv_available=True, + pip_available=True, + strategy="python-venv", + ), + ) + monkeypatch.setattr( + compatibility, + "storage_status", + lambda **_: SimpleNamespace( + as_dict=lambda: { + "ok": True, + "configured_by": "argument", + "layout": {"home": str(tmp_path / "nvh")}, + }, + ), + ) + monkeypatch.setattr( + compatibility, + "model_catalog_with_status", + lambda: { + "recommended_ids": ["gemma3-4b"], + "models": [{"id": "gemma3-4b", "recommended": True, "installed": False}], + "ollama_available": False, + "ollama_running": False, + }, + ) + monkeypatch.setattr( + compatibility, + "catalog_with_status", + lambda: { + "packs": [ + {"id": "agent-lab", "status": {"installed": False}}, + ], + }, + ) + + report = compatibility.compatibility_report(home_dir=tmp_path / "nvh") + by_id = {app["id"]: app for app in report["apps"]} + + assert report["recommended_torch_profile"] == "nvidia-cu121" + assert by_id["rootless-ollama"]["status"] == "ready" + assert by_id["local-models"]["status"] == "fixable" + assert 
by_id["local-models"]["recommended_action_id"] == "starter-models" + assert by_id["agent-lab"]["recommended_action_id"] == "agent-lab" + + +def test_compatibility_report_blocks_missing_git_for_comfyui(tmp_path, monkeypatch) -> None: + monkeypatch.setattr(compatibility.sys, "platform", "linux") + monkeypatch.setattr(compatibility.platform, "system", lambda: "Linux") + monkeypatch.setattr(compatibility.platform, "machine", lambda: "x86_64") + monkeypatch.setattr(compatibility.platform, "release", lambda: "6.8.0") + monkeypatch.setattr(compatibility.platform, "platform", lambda: "Linux") + monkeypatch.setattr(compatibility.platform, "libc_ver", lambda: ("glibc", "2.35")) + monkeypatch.setattr(compatibility, "_read_os_release", lambda: {"PRETTY_NAME": "Ubuntu 24.04"}) + monkeypatch.setattr(compatibility, "_nvidia_smi_query", lambda: {}) + monkeypatch.setattr(compatibility, "_which", lambda cmd: None if cmd == "git" else f"/usr/bin/{cmd}") + monkeypatch.setattr(compatibility, "_command_version", lambda *_, **__: "ok") + monkeypatch.setattr(compatibility, "_port_open", lambda *_: False) + monkeypatch.setattr( + compatibility, + "runtime_status", + lambda: SimpleNamespace( + venv_available=True, + pip_available=True, + strategy="python-venv", + ), + ) + monkeypatch.setattr( + compatibility, + "storage_status", + lambda **_: SimpleNamespace( + as_dict=lambda: { + "ok": True, + "configured_by": "argument", + "layout": {"home": str(tmp_path / "nvh")}, + }, + ), + ) + monkeypatch.setattr( + compatibility, + "model_catalog_with_status", + lambda: { + "recommended_ids": [], + "models": [], + "ollama_available": True, + "ollama_running": True, + }, + ) + monkeypatch.setattr( + compatibility, + "catalog_with_status", + lambda: {"packs": [{"id": "agent-lab", "status": {"installed": True}}]}, + ) + + report = compatibility.compatibility_report(home_dir=tmp_path / "nvh") + comfy = {app["id"]: app for app in report["apps"]}["comfyui"] + + assert comfy["status"] == "blocked" + assert 
any(req["id"] == "git" and req["status"] == "blocked" for req in comfy["requirements"]) diff --git a/tests/test_diagnostics.py b/tests/test_diagnostics.py new file mode 100644 index 0000000..71e749f --- /dev/null +++ b/tests/test_diagnostics.py @@ -0,0 +1,76 @@ +"""Tests for redacted setup diagnostics reports.""" + +from __future__ import annotations + +import json +from pathlib import Path +from types import SimpleNamespace + +from nvh.integrations import diagnostics + + +def test_diagnostics_report_redacts_log_secrets(monkeypatch) -> None: + monkeypatch.setattr( + diagnostics, + "storage_layout", + lambda home_dir=None: SimpleNamespace( + home=Path("/persist/nvhive"), + logs_dir=Path("/persist/nvhive/logs"), + config_dir=Path("/persist/nvhive/config"), + models_dir=Path("/persist/nvhive/models"), + apps_dir=Path("/persist/nvhive/apps"), + ), + ) + monkeypatch.setattr( + diagnostics, + "storage_status", + lambda home_dir=None: SimpleNamespace(as_dict=lambda: {"ok": True, "layout": {"home": "/persist/nvhive"}}), + ) + monkeypatch.setattr(diagnostics, "_candidate_log_files", lambda logs_dir: [Path("/persist/nvhive/logs/nvhive.log")]) + monkeypatch.setattr( + diagnostics, + "_tail_log_file", + lambda path, max_lines: [ + "WARNING failed request Authorization: Bearer abcdefghijklmnop", + "ERROR provider returned sk-testsecret1234567890", + ], + ) + + report = diagnostics.diagnostics_report( + home_dir="/persist/nvhive", + request_id="req-123", + include_logs=True, + ) + rendered = json.dumps(report) + + assert report["request_id"] == "req-123" + assert report["paths"]["home"].replace("\\", "/") == "/persist/nvhive" + assert "abcdefghijklmnop" not in rendered + assert "sk-testsecret1234567890" not in rendered + assert "[redacted]" in rendered + + +def test_diagnostics_report_survives_missing_logs(monkeypatch) -> None: + monkeypatch.setattr( + diagnostics, + "storage_layout", + lambda home_dir=None: SimpleNamespace( + home=Path("/persist/nvhive"), + 
logs_dir=Path("/persist/nvhive/logs"), + config_dir=Path("/persist/nvhive/config"), + models_dir=Path("/persist/nvhive/models"), + apps_dir=Path("/persist/nvhive/apps"), + ), + ) + monkeypatch.setattr( + diagnostics, + "storage_status", + lambda home_dir=None: SimpleNamespace(as_dict=lambda: {"ok": True, "layout": {"home": "/persist/nvhive"}}), + ) + monkeypatch.setattr(diagnostics, "_candidate_log_files", lambda logs_dir: []) + + report = diagnostics.diagnostics_report(home_dir="/persist/nvhive", include_logs=True) + + assert report["report_id"].startswith("diag-") + assert report["logs"]["included"] is True + assert isinstance(report["checks"]["storage"], dict) diff --git a/tests/test_gpu_detection_status.py b/tests/test_gpu_detection_status.py new file mode 100644 index 0000000..2fac910 --- /dev/null +++ b/tests/test_gpu_detection_status.py @@ -0,0 +1,37 @@ +"""Tests for rootless GPU detection diagnostics.""" + +from __future__ import annotations + +from nvh.utils import gpu + + +def test_detect_gpu_status_distinguishes_blocked_devices(monkeypatch) -> None: + monkeypatch.setattr(gpu, "_detect_gpus_pynvml", lambda *, issues=None: []) + monkeypatch.setattr(gpu, "_detect_gpus_smi", lambda *, issues=None: []) + monkeypatch.setattr(gpu, "_nvidia_device_files_present", lambda: True) + monkeypatch.setattr(gpu.shutil, "which", lambda command: "/usr/bin/nvidia-smi" if command == "nvidia-smi" else None) + + status = gpu.detect_gpu_status() + + assert status["status"] == "blocked" + assert status["device_files_present"] is True + assert any(issue["code"] == "devices-present-no-query" for issue in status["issues"]) + + +def test_gpu_architecture_info_marks_name_heuristic() -> None: + info = gpu.GPUInfo( + name="NVIDIA RTX 4090", + vram_mb=24576, + vram_gb=24.0, + driver_version="570.00", + cuda_version="12.4", + utilization_pct=0, + memory_used_mb=0, + memory_free_mb=24576, + index=0, + ) + + arch = gpu.gpu_architecture_info(info) + + assert arch["architecture"] == "Ada 
Lovelace" + assert arch["heuristic"] is True diff --git a/tests/test_install_receipts.py b/tests/test_install_receipts.py new file mode 100644 index 0000000..05e61c6 --- /dev/null +++ b/tests/test_install_receipts.py @@ -0,0 +1,53 @@ +"""Tests for rootless install receipts.""" + +from __future__ import annotations + +from pathlib import Path + +from nvh.integrations import receipts + + +def test_write_list_and_repair_receipt(tmp_path, monkeypatch) -> None: + monkeypatch.setenv("NVH_HOME", str(tmp_path / "nvh")) + install_path = tmp_path / "nvh" / "studio" / "packs" / "agent-lab" + install_path.mkdir(parents=True) + launcher = tmp_path / "nvh" / "bin" / "nvhive-agent-lab" + launcher.parent.mkdir(parents=True) + launcher.write_text("#!/usr/bin/env bash\n", encoding="utf-8") + + receipt = receipts.write_receipt( + kind="studio-pack", + item_id="agent-lab", + title="Agent Lab", + install_path=install_path, + launchers=[str(launcher)], + source_urls=["https://example.test/agent-lab"], + ) + + assert receipt["id"] == "studio-pack:agent-lab" + assert receipt["health"]["healthy"] is True + + listed = receipts.list_receipts() + assert [item["id"] for item in listed] == ["studio-pack:agent-lab"] + + plan = receipts.repair_plan("studio-pack:agent-lab") + assert plan["safe_to_run_without_root"] is True + assert plan["commands"] == ["nvh studio --install agent-lab -y"] + + +def test_uninstall_plan_is_preview_only(tmp_path, monkeypatch) -> None: + monkeypatch.setenv("NVH_HOME", str(tmp_path / "nvh")) + target = Path(tmp_path / "nvh" / "comfyui" / "ComfyUI") + target.mkdir(parents=True) + + receipts.write_receipt( + kind="comfyui", + item_id="workspace", + title="ComfyUI Workspace", + install_path=target, + ) + plan = receipts.uninstall_plan("comfyui:workspace") + + assert plan["destructive"] is True + assert str(target) in plan["target_paths"] + assert target.exists() diff --git a/tests/test_live_api.py b/tests/test_live_api.py index 8da1aad..cad5522 100644 --- 
a/tests/test_live_api.py +++ b/tests/test_live_api.py @@ -191,6 +191,23 @@ def test_cors_preflight(self, live_server: str): allow_origin = r.headers.get("access-control-allow-origin", "") assert allow_origin == "http://nvhive" or allow_origin == "*" + def test_cors_preflight_allows_dynamic_local_webui_port(self, live_server: str): + """Fallback WebUI ports must still be able to reach the API.""" + origin = "http://127.0.0.1:3032" + r = httpx.options( + f"{live_server}/v1/advisors", + headers={ + "Origin": origin, + "Access-Control-Request-Method": "GET", + "Access-Control-Request-Headers": "content-type", + }, + timeout=5.0, + ) + + assert r.status_code in (200, 204) + allow_origin = r.headers.get("access-control-allow-origin", "") + assert allow_origin == origin or allow_origin == "*" + def test_404_on_unknown_route(self, live_server: str): """Unknown paths must 404 cleanly, not 500.""" r = httpx.get(f"{live_server}/this/does/not/exist", timeout=5.0) diff --git a/tests/test_mount_autopilot.py b/tests/test_mount_autopilot.py new file mode 100644 index 0000000..70fa795 --- /dev/null +++ b/tests/test_mount_autopilot.py @@ -0,0 +1,130 @@ +"""Tests for rootless persistent mount discovery.""" + +from __future__ import annotations + +from pathlib import Path +from typing import Any + +from nvh.integrations import mount_autopilot as autopilot + + +class _FakeStorageStatus: + def as_dict(self) -> dict[str, Any]: + return {"ok": True, "warnings": []} + + +def _stub_current_storage(monkeypatch) -> None: + monkeypatch.setattr(autopilot, "storage_status", lambda **_: _FakeStorageStatus()) + + +def _mount(path: Path, fs_type: str, source: str, options: set[str] | None = None) -> autopilot.MountInfo: + return autopilot.MountInfo( + mount_point=path, + fs_type=fs_type, + source=source, + options=options or {"rw", "relatime"}, + ) + + +def _stub_paths(monkeypatch, *paths: Path) -> None: + known = {str(path) for path in paths} + monkeypatch.setattr(autopilot, "_path_exists", lambda 
path: str(path) in known) + monkeypatch.setattr(autopilot, "_evidence", lambda path: []) + + +def test_evidence_ignores_permission_denied(monkeypatch) -> None: + def exists(path: Path) -> bool: + if "lost+found" in str(path): + raise PermissionError("blocked mount marker") + return False + + monkeypatch.setattr(autopilot.Path, "exists", exists) + + assert autopilot._evidence(Path("/mnt/lost+found")) == [] + + +def test_mount_autopilot_prefers_large_block_backed_home(monkeypatch) -> None: + home = Path("/home/student") + share = Path("/mnt/readonly-share") + _stub_current_storage(monkeypatch) + _stub_paths(monkeypatch, home, share) + monkeypatch.setattr(autopilot.Path, "home", lambda: home) + monkeypatch.setattr(autopilot, "_candidate_paths", lambda _: [("home", home), ("mount", share)]) + monkeypatch.setattr(autopilot, "_is_writable", lambda path: "readonly" not in str(path)) + monkeypatch.setattr( + autopilot, + "_disk_usage", + lambda path: (860.0, 1000.0) if str(path).startswith(str(home)) else (450.0, 500.0), + ) + monkeypatch.setattr( + autopilot, + "_mount_info_for_path", + lambda path: _mount(home, "ext4", "/dev/nvme1n1") + if str(path).startswith(str(home)) + else _mount(share, "cifs", "//fileserver/share", {"ro", "relatime"}), + ) + + report = autopilot.mount_autopilot_report(min_free_gb=20) + + assert report["confidence"] == "high" + assert report["recommended"]["recommended_home"] == str(home / "nvhive") + assert report["recommended"]["large_block_mount"] is True + assert "home-on-persistent-block-mount" in report["recommended"]["evidence"] + + +def test_mount_autopilot_downranks_read_only_network_mount(monkeypatch) -> None: + block = Path("/mnt/persistent-block") + share = Path("/mnt/readonly-share") + _stub_current_storage(monkeypatch) + _stub_paths(monkeypatch, block, share) + monkeypatch.setattr(autopilot, "_candidate_paths", lambda _: [("mount", share), ("mount", block)]) + monkeypatch.setattr(autopilot, "_is_writable", lambda path: path == block) + 
monkeypatch.setattr( + autopilot, + "_disk_usage", + lambda path: (950.0, 1000.0) if path == share else (430.0, 500.0), + ) + monkeypatch.setattr( + autopilot, + "_mount_info_for_path", + lambda path: _mount(share, "cifs", "//fileserver/share", {"ro", "relatime"}) + if path == share + else _mount(block, "xfs", "/dev/disk/by-id/nvh-persist"), + ) + + report = autopilot.mount_autopilot_report(min_free_gb=20) + share_candidate = next(candidate for candidate in report["candidates"] if candidate["path"] == str(share)) + + assert report["recommended"]["path"] == str(block) + assert share_candidate["read_only"] is True + assert share_candidate["network_mount"] is True + assert any("read-only" in warning for warning in share_candidate["warnings"]) + + +def test_mount_autopilot_downranks_os_root_disk(monkeypatch) -> None: + os_home = Path("/home/student") + persistent = Path("/mnt/persistent-block") + _stub_current_storage(monkeypatch) + _stub_paths(monkeypatch, os_home, persistent) + monkeypatch.setattr(autopilot.Path, "home", lambda: os_home) + monkeypatch.setattr(autopilot, "_candidate_paths", lambda _: [("home", os_home), ("mount", persistent)]) + monkeypatch.setattr(autopilot, "_is_writable", lambda path: True) + monkeypatch.setattr( + autopilot, + "_disk_usage", + lambda path: (860.0, 1000.0) if path == os_home else (430.0, 500.0), + ) + monkeypatch.setattr( + autopilot, + "_mount_info_for_path", + lambda path: _mount(Path("/"), "ext4", "/dev/nvme0n1") + if path == os_home + else _mount(persistent, "ext4", "/dev/nvme1n1"), + ) + + report = autopilot.mount_autopilot_report(min_free_gb=20) + os_candidate = next(candidate for candidate in report["candidates"] if candidate["path"] == str(os_home)) + + assert report["recommended"]["path"] == str(persistent) + assert os_candidate["os_mount"] is True + assert any("OS/root disk" in warning for warning in os_candidate["warnings"]) diff --git a/tests/test_production_readiness.py b/tests/test_production_readiness.py new file 
mode 100644 index 0000000..5f94e5c --- /dev/null +++ b/tests/test_production_readiness.py @@ -0,0 +1,158 @@ +"""Tests for conservative production readiness gates.""" + +from __future__ import annotations + +from types import SimpleNamespace + +from nvh.integrations import production_readiness + + +def _storage_status(*, ok: bool = True, configured_by: str = "argument", free_gb: float = 480.0): + return SimpleNamespace( + as_dict=lambda: { + "ok": ok, + "configured_by": configured_by, + "free_gb": free_gb, + "warnings": [] if ok else ["write probe failed"], + "layout": {"home": "/mnt/persist/nvhive"}, + } + ) + + +def _runtime_status(strategy: str = "python-venv"): + return SimpleNamespace( + as_dict=lambda: { + "strategy": strategy, + "python_version": "3.11.9", + "notes": ["ready"], + } + ) + + +def _compatibility(system: str = "Linux", *, blocked_count: int = 0, issue_count: int = 0): + return { + "summary": "Host is ready" if not issue_count else "needs attention", + "ready": issue_count == 0, + "issue_count": issue_count, + "blocked_count": blocked_count, + "rootless_fixable_count": max(0, issue_count - blocked_count), + "recommended_torch_profile": "nvidia-cu121", + "host": { + "system": system, + "gpu": { + "name": "NVIDIA RTX 4090" if system == "Linux" else "", + "driver_version": "570.00", + "cuda_version": "12.4", + }, + }, + "facts": [], + "apps": [], + } + + +def _patch_common(monkeypatch, *, system: str = "Linux", storage_ok: bool = True, target_clean: bool = True) -> None: + monkeypatch.setattr( + production_readiness, + "storage_status", + lambda home_dir=None, min_free_gb=20: _storage_status(ok=storage_ok), + ) + monkeypatch.setattr( + production_readiness, + "mount_autopilot_report", + lambda min_free_gb=20, extra_roots=None: { + "confidence": "high", + "current": {"ok": storage_ok, "configured_by": "argument"}, + "recommended": { + "recommended_home": "/mnt/persist/nvhive", + "writable": True, + "read_only": False, + "network_mount": False, + 
"os_mount": False, + "score": 120, + }, + }, + ) + monkeypatch.setattr(production_readiness, "runtime_status", lambda: _runtime_status()) + monkeypatch.setattr(production_readiness, "compatibility_report", lambda home_dir=None: _compatibility(system)) + monkeypatch.setattr( + production_readiness, + "boot_preflight_status", + lambda home_dir=None, run_if_missing=False: { + "checked_at": "2026-04-28T00:00:00Z", + "changed": False, + "needs_attention": not target_clean, + "summary": "Boot baseline steady", + }, + ) + monkeypatch.setattr( + production_readiness, + "smoke_test_report", + lambda home_dir=None: {"failed": 0, "warnings": 0, "summary": "8 passed"}, + ) + monkeypatch.setattr( + production_readiness, + "model_fit_report", + lambda home_dir=None: { + "storage_fits_queue": True, + "summary": "3 model(s) queued", + }, + ) + monkeypatch.setattr(production_readiness, "receipt_summary", lambda: {"count": 2, "unhealthy": 0}) + monkeypatch.setattr( + production_readiness, + "catalog_with_status", + lambda: {"packs": [{"id": "rootless-ollama", "no_root": True}, {"id": "agent-lab", "no_root": True}]}, + ) + + +def test_production_readiness_requires_target_vm_acceptance(monkeypatch) -> None: + _patch_common(monkeypatch, system="Linux") + + report = production_readiness.production_readiness_report(target_vm_validated=False) + gate_by_id = {gate["id"]: gate for gate in report["gates"]} + + assert report["pilot_ready"] is True + assert report["production_ready"] is False + assert report["status"] == "pilot-ready" + assert gate_by_id["target-vm-acceptance"]["status"] == "warn" + + +def test_production_readiness_can_pass_after_target_validation(monkeypatch) -> None: + _patch_common(monkeypatch, system="Linux") + + report = production_readiness.production_readiness_report(target_vm_validated=True) + + assert report["production_ready"] is True + assert report["status"] == "production-ready" + assert report["counts"]["blocked"] == 0 + assert report["counts"]["warnings"] == 0 + 
+ +def test_production_readiness_blocks_when_storage_or_model_queue_fails(monkeypatch) -> None: + _patch_common(monkeypatch, system="Linux", storage_ok=False) + monkeypatch.setattr( + production_readiness, + "model_fit_report", + lambda home_dir=None: { + "storage_fits_queue": False, + "summary": "Queue needs more storage", + }, + ) + + report = production_readiness.production_readiness_report(target_vm_validated=True) + blocked = {gate["id"] for gate in report["gates"] if gate["status"] == "blocked"} + + assert report["pilot_ready"] is False + assert {"persistent-storage", "model-fit"}.issubset(blocked) + + +def test_production_readiness_on_non_target_host_is_pilot_only(monkeypatch) -> None: + _patch_common(monkeypatch, system="Windows") + + report = production_readiness.production_readiness_report(target_vm_validated=False) + gate_by_id = {gate["id"]: gate for gate in report["gates"]} + + assert report["pilot_ready"] is True + assert report["production_ready"] is False + assert gate_by_id["linux-gpu-session"]["status"] == "warn" + assert gate_by_id["linux-gpu-session"]["source"] == "target-vm" diff --git a/tests/test_setup_agent.py b/tests/test_setup_agent.py index f9d5941..7f1db91 100644 --- a/tests/test_setup_agent.py +++ b/tests/test_setup_agent.py @@ -2,11 +2,23 @@ from __future__ import annotations -from nvh.integrations import setup_agent +from types import SimpleNamespace + +from nvh.integrations import receipts, setup_agent def test_setup_helper_prioritizes_storage(tmp_path, monkeypatch) -> None: monkeypatch.delenv("NVH_HOME", raising=False) + storage = SimpleNamespace( + ok=True, + configured_by="argument", + as_dict=lambda: { + "ok": True, + "configured_by": "argument", + "layout": {"home": str(tmp_path / "nvh")}, + }, + ) + monkeypatch.setattr(setup_agent, "storage_status", lambda **_: storage) monkeypatch.setattr( setup_agent, "model_catalog_with_status", @@ -33,3 +45,67 @@ def test_setup_helper_flags_default_storage(monkeypatch) -> None: assert 
report["ready"] is False assert report["actions"][0]["id"] == "storage" + assert report["assistant"]["mode"] == "offline-deterministic" + + +def test_setup_assistant_answers_comfyui_question(tmp_path, monkeypatch) -> None: + monkeypatch.setenv("NVH_HOME", str(tmp_path / "nvh")) + monkeypatch.setattr( + setup_agent, + "detect_comfyui", + lambda: {"installed": False, "examples_installed": False}, + ) + + reply = setup_agent.setup_assistant_reply("How do I install ComfyUI?", tmp_path / "nvh") + + assert reply["focus"] == "comfyui" + assert "ComfyUI" in reply["answer"] + assert reply["commands"] + + +def test_setup_helper_surfaces_unhealthy_receipt(tmp_path, monkeypatch) -> None: + monkeypatch.setenv("NVH_HOME", str(tmp_path / "nvh")) + storage = SimpleNamespace( + ok=True, + configured_by="argument", + as_dict=lambda: { + "ok": True, + "configured_by": "argument", + "layout": {"home": str(tmp_path / "nvh")}, + }, + ) + runtime = SimpleNamespace( + strategy="python-venv", + as_dict=lambda: {"strategy": "python-venv"}, + ) + monkeypatch.setattr(setup_agent, "storage_status", lambda **_: storage) + monkeypatch.setattr(setup_agent, "runtime_status", lambda: runtime) + monkeypatch.setattr( + setup_agent, + "catalog_with_status", + lambda: { + "packs": [ + {"id": "rootless-ollama", "status": {"installed": True}}, + {"id": "blender-creative", "status": {"installed": True}}, + ], + }, + ) + monkeypatch.setattr(setup_agent, "model_catalog_with_status", lambda: {"models": []}) + monkeypatch.setattr( + setup_agent, + "detect_comfyui", + lambda: {"installed": True, "examples_installed": True}, + ) + receipts.write_receipt( + kind="studio-pack", + item_id="agent-lab", + title="Agent Lab", + install_path=tmp_path / "missing-agent-lab", + launchers=[str(tmp_path / "missing-agent-lab" / "nvhive-agent-lab")], + ) + + report = setup_agent.setup_helper_report(home_dir=tmp_path / "nvh") + + assert report["issue_count"] >= 1 + assert any(issue["id"] == "receipt:studio-pack:agent-lab" for 
issue in report["issues"]) + assert any(action["id"] == "repair-receipt:studio-pack:agent-lab" for action in report["actions"]) diff --git a/tests/test_setup_catalog.py b/tests/test_setup_catalog.py new file mode 100644 index 0000000..8400230 --- /dev/null +++ b/tests/test_setup_catalog.py @@ -0,0 +1,24 @@ +"""Tests for setup catalog fallback and status.""" + +from __future__ import annotations + +from nvh.integrations import catalog + + +def test_bundled_catalog_has_student_profiles() -> None: + data = catalog.bundled_catalog() + + assert data["schema_version"] == catalog.SCHEMA_VERSION + assert {profile["id"] for profile in data["profiles"]} >= {"student", "creator", "music", "full"} + assert data["models"] + assert data["comfyui_examples"] + + +def test_catalog_status_uses_bundled_without_network(tmp_path, monkeypatch) -> None: + monkeypatch.setenv("NVH_HOME", str(tmp_path / "nvh")) + + status = catalog.catalog_status(refresh=False) + + assert status["source"] == "bundled" + assert status["profile_count"] >= 3 + assert status["model_count"] >= 1 diff --git a/tests/test_setup_experience.py b/tests/test_setup_experience.py new file mode 100644 index 0000000..15c79f2 --- /dev/null +++ b/tests/test_setup_experience.py @@ -0,0 +1,107 @@ +"""Tests for nvWizard setup experience helpers.""" + +from __future__ import annotations + +from types import SimpleNamespace + +from nvh.integrations import auto_repair, model_fit, mount_autopilot, smoke_tests + + +def test_mount_autopilot_prefers_existing_nvh_home(tmp_path, monkeypatch) -> None: + mount = tmp_path / "persistent" + home = mount / "nvhive" + (home / "receipts").mkdir(parents=True) + (home / "models").mkdir() + + monkeypatch.setattr(mount_autopilot, "_common_roots", lambda: []) + report = mount_autopilot.mount_autopilot_report(extra_roots=[mount]) + + assert report["recommended"]["recommended_home"] == str(home) + assert "receipts" in report["recommended"]["evidence"] + + +def 
test_auto_repair_writes_env_file_without_downloads(tmp_path, monkeypatch) -> None: + monkeypatch.setenv("NVH_HOME", str(tmp_path / "nvh")) + monkeypatch.setattr(auto_repair, "detect_comfyui", lambda: {"installed": False}) + monkeypatch.setattr(auto_repair, "receipt_summary", lambda: {"unhealthy": 0}) + monkeypatch.setattr(auto_repair, "catalog_with_status", lambda: {"packs": [{"id": "agent-lab", "status": {"installed": True}}]}) + monkeypatch.setattr(auto_repair, "load_setup_catalog", lambda refresh=False: {"source": "bundled"}) + monkeypatch.setattr( + auto_repair, + "storage_status", + lambda home_dir=None: SimpleNamespace( + ok=True, + configured_by="argument", + layout=SimpleNamespace(home=tmp_path / "nvh"), + ), + ) + + from nvh.integrations.storage import ensure_storage + + ensure_storage(tmp_path / "nvh") + result = auto_repair.run_safe_repairs(home_dir=tmp_path / "nvh") + + assert result["errors"] == [] + assert any(item["id"] == "storage-env-file" for item in result["completed"]) + + +def test_model_fit_scores_recommended_models(monkeypatch) -> None: + monkeypatch.setattr( + model_fit, + "storage_status", + lambda home_dir=None: SimpleNamespace(as_dict=lambda: {"free_gb": 100}), + ) + monkeypatch.setattr( + model_fit, + "model_catalog_with_status", + lambda: { + "detected_vram_gb": 12, + "ollama_available": True, + "ollama_running": True, + "models": [ + { + "id": "fast-chat", + "priority": 10, + "recommended": True, + "fits_vram": True, + "installed": False, + "estimated_disk_gb": 4, + "capabilities": ["chat", "fast"], + } + ], + }, + ) + + report = model_fit.model_fit_report() + + assert report["recommended_ids"] == ["fast-chat"] + assert report["models"][0]["fit_score"] > 100 + + +def test_smoke_tests_surface_comfyui_example_repair(monkeypatch, tmp_path) -> None: + monkeypatch.setattr( + smoke_tests, + "storage_status", + lambda home_dir=None: SimpleNamespace( + ok=True, + configured_by="argument", + env_file=tmp_path / "nvh-env.sh", + 
layout=SimpleNamespace(home=tmp_path), + ), + ) + monkeypatch.setattr( + smoke_tests, + "catalog_with_status", + lambda: {"packs": [{"id": "agent-lab", "status": {"installed": True}}]}, + ) + monkeypatch.setattr( + smoke_tests, + "detect_comfyui", + lambda: {"installed": True, "running": False, "examples_installed": False, "app_dir": str(tmp_path), "examples_dir": str(tmp_path / "examples")}, + ) + + report = smoke_tests.smoke_test_report(home_dir=str(tmp_path)) + by_id = {item["id"]: item for item in report["tests"]} + + assert by_id["comfyui-examples"]["status"] == "warn" + assert by_id["comfyui-examples"]["action_id"] == "comfyui-examples" diff --git a/tests/test_storage_integration.py b/tests/test_storage_integration.py index 8c2fb50..64b41cf 100644 --- a/tests/test_storage_integration.py +++ b/tests/test_storage_integration.py @@ -22,6 +22,8 @@ def test_ensure_storage_creates_canonical_layout(tmp_path, monkeypatch) -> None: assert status.layout.comfyui_dir.is_dir() assert status.layout.config_dir.is_dir() assert status.env_file.exists() + assert status.write_probe_ok is True + assert status.write_probe_error == "" assert "NVH_HOME" in status.layout.env() assert "NVH_RUNTIME_HOME" in status.layout.env() assert "NVH_APPS_HOME" in status.layout.env() diff --git a/tests/test_studio_packs.py b/tests/test_studio_packs.py index 0a608db..698ab0d 100644 --- a/tests/test_studio_packs.py +++ b/tests/test_studio_packs.py @@ -2,6 +2,9 @@ from __future__ import annotations +import inspect +import sys + import pytest from nvh.integrations import studio_packs @@ -14,9 +17,19 @@ def test_catalog_is_rootless_and_grouped() -> None: assert "rootless-ollama" in ids assert "llm-starter" in ids assert "agent-lab" in ids + assert "nvidia-omni-agent" in ids + assert "openclaw-agent" in ids + assert "nemoclaw-sandbox" in ids assert "comfyui-power-nodes" in ids assert "game-dev-lab" in ids assert "blender-creative" in ids + assert "godot-engine" in ids + assert "unity-hub-helper" in ids 
+ assert "unreal-engine-helper" in ids + assert "github-login-helper" in ids + assert "ace-step-music" in ids + assert "music-producer-lab" in ids + assert "music-daw-helper" in ids assert all(pack["no_root"] for pack in catalog) @@ -26,10 +39,29 @@ def test_pack_bundles_expand_without_duplicates() -> None: assert starter[0] == "rootless-ollama" assert "llm-starter" in starter assert "agent-lab" in starter + assert "nvidia-omni-agent" in starter assert len(starter) == len(set(starter)) + assert "github-login-helper" in starter creative = studio_packs.expand_pack_ids(["creative"]) - assert creative == ["blender-creative", "game-dev-lab", "game-mod-helper"] + assert creative == ["blender-creative", "game-dev-lab", "game-mod-helper", "godot-engine"] + + game = studio_packs.expand_pack_ids(["game"]) + assert "godot-engine" in game + assert "unity-hub-helper" in game + assert "unreal-engine-helper" in game + assert "github-login-helper" in game + + claw = studio_packs.expand_pack_ids(["claw"]) + assert claw == ["openclaw-agent", "nemoclaw-sandbox"] + + agents = studio_packs.expand_pack_ids(["agents"]) + assert "agent-lab" in agents + assert "nvidia-omni-agent" in agents + assert "openclaw-agent" in agents + + music = studio_packs.expand_pack_ids(["music"]) + assert music == ["ace-step-music", "music-producer-lab", "music-daw-helper", "github-login-helper"] def test_model_catalog_marks_vram_recommendations(monkeypatch) -> None: @@ -46,6 +78,70 @@ def test_model_catalog_marks_vram_recommendations(monkeypatch) -> None: assert by_id["deepseek-r1-8b"]["fits_vram"] is False +def test_godot_asset_selector_prefers_standard_linux_zip() -> None: + release = { + "assets": [ + {"name": "Godot_v4.5-stable_mono_linux_x86_64.zip", "browser_download_url": "https://example.invalid/mono.zip"}, + {"name": "Godot_v4.5-stable_export_templates.tpz", "browser_download_url": "https://example.invalid/templates.tpz"}, + {"name": "Godot_v4.5-stable_linux.x86_64.zip", "browser_download_url": 
"https://example.invalid/godot.zip"}, + ] + } + + asset = studio_packs._select_godot_asset(release) + + assert asset["browser_download_url"].endswith("godot.zip") + + +def test_appimage_selector_prefers_linux_x64_assets() -> None: + release = { + "assets": [ + {"name": "audacity-linux-3.7.7-x64-20.04.AppImage", "browser_download_url": "https://example.invalid/audacity-20.AppImage"}, + {"name": "audacity-linux-3.7.7-x64-22.04.AppImage", "browser_download_url": "https://example.invalid/audacity-22.AppImage"}, + {"name": "audacity-linux-3.7.7-aarch64.AppImage", "browser_download_url": "https://example.invalid/audacity-arm.AppImage"}, + ] + } + + asset = studio_packs._select_appimage_asset( + release, + app_name="Audacity", + required_tokens=("linux",), + preferred_tokens=("22.04", "x64"), + ) + + assert asset["browser_download_url"].endswith("audacity-22.AppImage") + + +def test_run_command_supports_long_install_timeout() -> None: + signature = inspect.signature(studio_packs._run_command) + + assert "timeout" in signature.parameters + + +@pytest.mark.asyncio +async def test_ace_step_music_blocks_non_linux(tmp_path, monkeypatch) -> None: + monkeypatch.setenv("NVH_STUDIO_HOME", str(tmp_path / "studio")) + monkeypatch.setattr(studio_packs.platform, "system", lambda: "Darwin") + + pack = studio_packs._find_pack("ace-step-music") + events = [event async for event in studio_packs._install_ace_step_music(pack, force_update=False)] + + assert events[0]["event"] == "error" + assert not (tmp_path / "studio" / "packs" / "ace-step-music" / "installed.json").exists() + + +@pytest.mark.asyncio +async def test_music_daw_helper_does_not_mark_installed_without_downloads(tmp_path, monkeypatch) -> None: + monkeypatch.setenv("NVH_STUDIO_HOME", str(tmp_path / "studio")) + monkeypatch.setattr(studio_packs.platform, "system", lambda: "Linux") + monkeypatch.setitem(sys.modules, "httpx", None) + + pack = studio_packs._find_pack("music-daw-helper") + events = [event async for event in 
studio_packs._install_music_daw_helper(pack, force_update=False)] + + assert any(event["event"] == "error" for event in events) + assert not (tmp_path / "studio" / "packs" / "music-daw-helper" / "installed.json").exists() + + def test_catalog_status_uses_configured_studio_home(tmp_path, monkeypatch) -> None: monkeypatch.setenv("NVH_STUDIO_HOME", str(tmp_path)) @@ -68,6 +164,37 @@ def test_blender_pack_status_uses_persistent_apps_home(tmp_path, monkeypatch) -> assert status["details"]["version"] == studio_packs.BLENDER_VERSION +def test_claw_status_marks_nemoclaw_blocked_without_docker(tmp_path, monkeypatch) -> None: + monkeypatch.setenv("NVH_HOME", str(tmp_path / "nvh")) + monkeypatch.setattr(studio_packs, "_node_runtime_status", lambda env=None: { + "node": "/tmp/node", + "npm": "/tmp/npm", + "node_version": "v22.16.0", + "npm_version": "10.9.0", + "node_ok": True, + "npm_ok": True, + "ready": True, + "can_auto_install": True, + "minimum_node": "22.16.0", + "minimum_npm": "10.0.0", + }) + monkeypatch.setattr(studio_packs, "_docker_status", lambda: { + "binary": "", + "ready": False, + "detail": "Docker was not found on PATH.", + "rootless_hint": "Ask the provider to enable rootless Docker.", + }) + monkeypatch.setattr(studio_packs, "_nemoclaw_binary_from_env", lambda env=None: "") + + openclaw = studio_packs.pack_status(studio_packs._find_pack("openclaw-agent")) + nemoclaw = studio_packs.pack_status(studio_packs._find_pack("nemoclaw-sandbox")) + + assert openclaw["details"]["installable"] is True + assert nemoclaw["installed"] is False + assert nemoclaw["details"]["installable"] is False + assert "Docker" in nemoclaw["details"]["blocked_reason"] + + @pytest.mark.asyncio async def test_comfy_nodes_skip_without_comfyui(tmp_path, monkeypatch) -> None: monkeypatch.setenv("NVH_STUDIO_HOME", str(tmp_path / "studio")) diff --git a/tests/test_webui_bootstrap.py b/tests/test_webui_bootstrap.py new file mode 100644 index 0000000..3b407f0 --- /dev/null +++ 
b/tests/test_webui_bootstrap.py @@ -0,0 +1,45 @@ +from __future__ import annotations + +from types import SimpleNamespace + +import nvh.cli.main as cli_main + + +class _ConsoleSpy: + def __init__(self) -> None: + self.input_called = False + self.messages: list[str] = [] + + def input(self, _prompt: str) -> str: + self.input_called = True + return "n" + + def print(self, *args: object, **_kwargs: object) -> None: + self.messages.append(" ".join(str(arg) for arg in args)) + + +def test_rootless_node_install_assume_yes_skips_prompt(monkeypatch, tmp_path): + import shutil + import subprocess + + import nvh.integrations.storage as storage_mod + + layout = SimpleNamespace( + runtime_dir=tmp_path / "runtimes", + env=lambda: {}, + ) + console = _ConsoleSpy() + + monkeypatch.setattr(cli_main.sys, "platform", "linux") + monkeypatch.setattr(storage_mod, "storage_layout", lambda: layout) + monkeypatch.setattr(shutil, "which", lambda _name: None) + monkeypatch.setattr( + subprocess, + "run", + lambda *_args, **_kwargs: SimpleNamespace(returncode=1, stderr="offline"), + ) + + node, npm = cli_main._try_install_node_no_root(console, assume_yes=True) + + assert (node, npm) == (None, None) + assert console.input_called is False diff --git a/tests/test_webui_e2e.py b/tests/test_webui_e2e.py index 2010c55..25e456a 100644 --- a/tests/test_webui_e2e.py +++ b/tests/test_webui_e2e.py @@ -195,28 +195,27 @@ def test_new_chat_from_other_page(self, page): # --------------------------------------------------------------------------- class TestWebUISetup: - def test_setup_wizard_steps(self, page): - """Setup wizard should show step indicators.""" + def test_setup_installer_home_loads(self, page): + """Setup home should show compact system check and install options.""" page.goto(f"{BASE}/setup") page.wait_for_load_state("networkidle") page.wait_for_timeout(2000) - # Should show step 1 (Welcome) - assert page.locator("text=Welcome").count() > 0 + assert page.locator("text=System Check").count() > 
0 + assert page.locator("text=Install Options").count() > 0 - def test_setup_next_button(self, page): - """NEXT button should advance to next step.""" + def test_setup_advanced_details_reveals_controls(self, page): + """Advanced Details should reveal the deeper setup controls.""" page.goto(f"{BASE}/setup") page.wait_for_load_state("networkidle") page.wait_for_timeout(2000) - next_btn = page.locator('button:has-text("NEXT")') - if next_btn.count() > 0: - next_btn.click() + details_btn = page.locator('button:has-text("Advanced Details")') + if details_btn.count() > 0: + details_btn.click() page.wait_for_timeout(1000) - # Should have advanced (GPU step or Local AI step) content = page.content() - assert "GPU" in content or "Local" in content or "Step" in content + assert "Storage Autopilot" in content or "LLM Picks" in content or "Mission Summary" in content # --------------------------------------------------------------------------- diff --git a/web/app/globals.css b/web/app/globals.css index 3526f79..5b6eb84 100644 --- a/web/app/globals.css +++ b/web/app/globals.css @@ -281,6 +281,17 @@ button { min-height: 100vh; } +@media (max-width: 900px) { + .layout-with-sidebar { + display: block; + min-height: calc(100vh - 2rem); + } + + .layout-with-sidebar > aside { + display: none; + } +} + /* ============================================================ NVIDIA green accent left-border for active nav items ============================================================ */ diff --git a/web/app/setup/page.tsx b/web/app/setup/page.tsx index 5b13e69..60f5635 100644 --- a/web/app/setup/page.tsx +++ b/web/app/setup/page.tsx @@ -1,6 +1,7 @@ 'use client'; import { useState, useEffect, useCallback } from 'react'; +import Image from 'next/image'; import Link from 'next/link'; import { checkHealth, @@ -9,9 +10,19 @@ import { getRecommendations, getFreeProviders, saveProviderKey, + askSetupAssistant, getStorageStatus, configureStorage, + getMountAutopilot, + activateMountAutopilot, + 
getSetupCatalog, + getSetupBootPreflight, + getSetupMissionControl, + getSetupProductionReadiness, + getSetupDiagnostics, getSetupHelper, + getSetupReceipts, + repairSetupWorkspace, cancelInstallJob, getComfyUIStatus, getComfyUIExamples, @@ -32,8 +43,19 @@ import type { ComfyUIExample, ComfyUIInstallEvent, ComfyUIStatus, + ComfyUITorchProfile, + BootPreflightReport, + CompatibilityReport, InstallJob, + InstallReceipt, + MissionControlReport, + ProductionReadinessReport, + DiagnosticsReport, + SetupAssistantReply, + SetupCatalogResult, SetupHelperReport, + SetupReceiptsResult, + MountAutopilotReport, StorageStatus, StudioPack, StudioPackInstallEvent, @@ -42,7 +64,16 @@ } from '@/lib/types'; type Step = 'welcome' | 'storage' | 'gpu' | 'models' | 'local-ai' | 'studio' | 'comfyui' | 'cloud' | 'test' | 'done'; -type WizardProfile = 'student' | 'creator' | 'game' | 'full'; +type WizardProfile = 'student' | 'llm' | 'creator' | 'agent' | 'game' | 'music' | 'full'; + +type SetupCheckState = 'ready' | 'warn' | 'fix' | 'checking'; + +const CHECK_TONES: Record<SetupCheckState, { dot: string; text: string; border: string; bg: string; label: string }> = { + ready: { dot: 'bg-[#76B900]', text: 'text-[#76B900]', border: 'border-[#76B900]/30', bg: 'bg-[#76B900]/5', label: 'Ready' }, + warn: { dot: 'bg-[#d97706]', text: 'text-[#d97706]', border: 'border-[#d97706]/30', bg: 'bg-[#fff7ed]', label: 'Review' }, + fix: { dot: 'bg-[#d97706]', text: 'text-[#d97706]', border: 'border-[#d97706]/30', bg: 'bg-[#fff7ed]', label: 'Fix queued' }, + checking: { dot: 'bg-[#a3a3a3]', text: 'text-[#737373]', border: 'border-[#e5e5e5]', bg: 'bg-[#fafafa]', label: 'Checking' }, +}; const STEPS: { id: Step; label: string; num: number }[] = [ { id: 'welcome', label: 'Welcome', num: 1 }, @@ -66,6 +97,37 @@ const CLOUD_PROVIDERS = [ { id: 'mistral', name: 'Mistral', description: 'Mistral Large, Small', envKey: 'MISTRAL_API_KEY', placeholder: 'your-key...', signupUrl: 'https://console.mistral.ai/api-keys' }, ]; +type BrandLogoId = 'openclaw' | 'nvidia' | 'comfyui' | 'blender' | 
'godot' | 'github' | 'unity' | 'unreal' | 'ollama' | 'audacity' | 'lmms'; + +const BRAND_LOGOS: Record<BrandLogoId, { src: string; alt: string }> = { + openclaw: { src: '/brand-icons/openclaw.svg', alt: 'OpenClaw logo' }, + nvidia: { src: '/brand-icons/nvidia.svg', alt: 'NVIDIA logo' }, + comfyui: { src: '/brand-icons/comfyui.svg', alt: 'ComfyUI logo' }, + blender: { src: '/brand-icons/blender.svg', alt: 'Blender logo' }, + godot: { src: '/brand-icons/godot.svg', alt: 'Godot Engine logo' }, + github: { src: '/brand-icons/github.svg', alt: 'GitHub logo' }, + unity: { src: '/brand-icons/unity.svg', alt: 'Unity logo' }, + unreal: { src: '/brand-icons/unrealengine.svg', alt: 'Unreal Engine logo' }, + ollama: { src: '/brand-icons/ollama.svg', alt: 'Ollama logo' }, + audacity: { src: '/brand-icons/audacity.svg', alt: 'Audacity logo' }, + lmms: { src: '/brand-icons/lmms.svg', alt: 'LMMS logo' }, +}; + +function BrandLogo({ id, className = 'w-7 h-7' }: { id: BrandLogoId; className?: string }) { + const logo = BRAND_LOGOS[id]; + return ( + <Image src={logo.src} alt={logo.alt} width={28} height={28} className={className} /> + ); +} + // Provider Card (used in Cloud step) interface ProviderCardProps { @@ -188,6 +250,37 @@ function ProviderCard({ p, expandedProvider, setExpandedProvider, keyInputs, set const isActiveInstallJob = (job: InstallJob) => job.status === 'queued' || job.status === 'running'; +const studioPackDetails = (pack: StudioPack): Record<string, unknown> => pack.status?.details ?? {}; + +const studioPackInstallable = (pack: StudioPack) => studioPackDetails(pack).installable !== false; + +const studioPackBlockedReason = (pack: StudioPack) => { + const reason = studioPackDetails(pack).blocked_reason; + return typeof reason === 'string' ? 
reason : ''; +}; + +const selectableStudioPackIds = (packs: StudioPack[], packIds: string[]) => ( + packIds.filter(packId => { + const pack = packs.find(item => item.id === packId); + return !pack || studioPackInstallable(pack); + }) +); + +const shouldAutoActivateStorage = (status: StorageStatus, report: MountAutopilotReport) => { + const candidate = report.recommended; + const needsStorage = !status.ok || status.configured_by === 'default'; + return Boolean( + needsStorage && + candidate && + ['high', 'medium'].includes(report.confidence) && + candidate.writable && + (candidate.large_block_mount || (candidate.total_gb ?? 0) >= 180 || candidate.source.startsWith('env:')) && + !candidate.read_only && + !candidate.network_mount && + !candidate.os_mount + ); +}; + export default function SetupPage() { const [step, setStep] = useState<Step>('welcome'); const [apiKeys, setApiKeys] = useState<Record<string, string>>({}); @@ -204,8 +297,28 @@ export default function SetupPage() { const [storageHomeInput, setStorageHomeInput] = useState(''); const [storageSaving, setStorageSaving] = useState(false); const [storageError, setStorageError] = useState<string | null>(null); + const [mountAutopilot, setMountAutopilot] = useState<MountAutopilotReport | null>(null); + const [mountActivating, setMountActivating] = useState(false); const [setupHelper, setSetupHelper] = useState<SetupHelperReport | null>(null); const [setupHelperError, setSetupHelperError] = useState<string | null>(null); + const [setupReceipts, setSetupReceipts] = useState<SetupReceiptsResult | null>(null); + const [setupCatalog, setSetupCatalog] = useState<SetupCatalogResult | null>(null); + const [setupCompatibility, setSetupCompatibility] = useState<CompatibilityReport | null>(null); + const [bootPreflight, setBootPreflight] = useState<BootPreflightReport | null>(null); + const [missionControl, setMissionControl] = useState<MissionControlReport | null>(null); + const [productionReadiness, setProductionReadiness] = useState<ProductionReadinessReport | null>(null); + const [diagnosticsReport, setDiagnosticsReport] = useState<DiagnosticsReport | null>(null); + const [diagnosticsLoading, setDiagnosticsLoading] = useState(false); + const [diagnosticsMessage, setDiagnosticsMessage] = useState<string | null>(null); + const [diagnosticsError, setDiagnosticsError] = useState<string | null>(null); + const [workspaceRepairing, setWorkspaceRepairing] = useState(false); + const [setupInventoryError, setSetupInventoryError] = useState<string | null>(null); + const [activeWizardBuild, setActiveWizardBuild] = useState(null); + const [wizardBuildMessage, setWizardBuildMessage] = useState<string | null>(null); + const [assistantQuestion, setAssistantQuestion] = useState(''); + const [assistantReply, setAssistantReply] = useState<SetupAssistantReply | null>(null); + const [assistantLoading, setAssistantLoading] = useState(false); + const [assistantError, setAssistantError] = useState<string | null>(null); const [expandedProvider, setExpandedProvider] = useState<string | null>(null); const [keyInputs, setKeyInputs] = useState<Record<string, string>>({}); const [savingKey, setSavingKey] = useState<string | null>(null); @@ -242,6 +355,8 @@ export default function SetupPage() { const [installJobs, setInstallJobs] = useState<InstallJob[]>([]); const [jobsError, setJobsError] = useState<string | null>(null); const [cancelingJobId, setCancelingJobId] = useState<string | null>(null); + const [advancedSetupOpen, setAdvancedSetupOpen] = useState(false); + const [selectedWizardProfile, setSelectedWizardProfile] = useState<WizardProfile>('student'); // Live-polled provider health drives Ollama status and the // configured-providers list so the setup screen reflects newly @@ -283,13 +398,62 @@ export default function SetupPage() { } }, []); + const refreshSetupInventory = useCallback(async (refreshCatalog = false, homeDir?: string) => { + try { + const activeHome = homeDir ?? 
storageStatus?.layout.home; + const [receipts, catalog, boot, mission, readiness] = await Promise.all([ + getSetupReceipts({ limit: 8 }), + getSetupCatalog(refreshCatalog), + getSetupBootPreflight(activeHome), + getSetupMissionControl(activeHome), + getSetupProductionReadiness(activeHome), + ]); + setSetupReceipts(receipts); + setSetupCatalog(catalog); + setBootPreflight(boot); + setSetupCompatibility(boot.compatibility); + setMissionControl(mission); + setProductionReadiness(readiness); + setSetupInventoryError(null); + } catch (err) { + setSetupInventoryError(err instanceof Error ? err.message : 'Could not load setup inventory'); + } + }, [storageStatus?.layout.home]); + + const handleDiagnosticsReport = async () => { + if (diagnosticsLoading) return; + setDiagnosticsLoading(true); + setDiagnosticsError(null); + setDiagnosticsMessage(null); + try { + const report = await getSetupDiagnostics(storageStatus?.layout.home, true); + setDiagnosticsReport(report); + const reportText = JSON.stringify(report, null, 2); + if (navigator.clipboard?.writeText) { + try { + await navigator.clipboard.writeText(reportText); + setDiagnosticsMessage(`Copied redacted report ${report.report_id}`); + } catch { + setDiagnosticsMessage(`Report ${report.report_id} is ready below`); + } + } else { + setDiagnosticsMessage(`Report ${report.report_id} is ready below`); + } + } catch (err) { + setDiagnosticsError(err instanceof Error ? 
err.message : 'Could not build diagnostics report'); + } finally { + setDiagnosticsLoading(false); + } + }; + useEffect(() => { void refreshInstallJobs(); + void refreshSetupInventory(false); const timer = window.setInterval(() => { void refreshInstallJobs(); }, 3000); return () => window.clearInterval(timer); - }, [refreshInstallJobs]); + }, [refreshInstallJobs, refreshSetupInventory]); useEffect(() => { setComfyInstalling(installJobs.some(job => job.kind === 'comfyui-install' && isActiveInstallJob(job))); @@ -311,12 +475,73 @@ export default function SetupPage() { } }; + const handleAskAssistant = async () => { + const question = assistantQuestion.trim(); + if (!question) return; + setAssistantLoading(true); + setAssistantError(null); + try { + const reply = await askSetupAssistant(question, storageStatus?.layout.home); + setAssistantReply(reply); + } catch (err) { + setAssistantError(err instanceof Error ? err.message : 'Setup helper could not answer'); + } finally { + setAssistantLoading(false); + } + }; + + const handleBootRecheck = async () => { + try { + const boot = await getSetupBootPreflight(storageStatus?.layout.home, true); + setBootPreflight(boot); + setSetupCompatibility(boot.compatibility); + setMissionControl(await getSetupMissionControl(storageStatus?.layout.home)); + setSetupInventoryError(null); + void refreshSetupHelper(storageStatus?.layout.home); + } catch (err) { + setSetupInventoryError(err instanceof Error ? err.message : 'Boot preflight could not run'); + } + }; + + const handleRepairWorkspace = async () => { + if (workspaceRepairing) return; + setWorkspaceRepairing(true); + try { + await repairSetupWorkspace(storageStatus?.layout.home); + await Promise.all([ + refreshSetupInventory(false, storageStatus?.layout.home), + refreshSetupHelper(storageStatus?.layout.home), + refreshComfyUI(), + refreshInstallJobs(), + ]); + setSetupInventoryError(null); + } catch (err) { + setSetupInventoryError(err instanceof Error ? 
err.message : 'Workspace repair failed'); + } finally { + setWorkspaceRepairing(false); + } + }; + useEffect(() => { - // Check API health - checkHealth() - .then(() => setApiStatus('connected')) - .catch(() => setApiStatus('disconnected')); + let cancelled = false; + let healthRetryTimer: ReturnType<typeof setTimeout> | null = null; + let storageRetryTimer: ReturnType<typeof setTimeout> | null = null; + + const pollApiHealth = async (retry = false) => { + if (cancelled) return; + if (!retry) setApiStatus('checking'); + try { + await checkHealth(); + if (!cancelled) setApiStatus('connected'); + } catch { + if (!cancelled) { + setApiStatus('disconnected'); + healthRetryTimer = setTimeout(() => void pollApiHealth(true), 5000); + } + } + }; + void pollApiHealth(); // Ollama status + configured-providers list are now fed by the // polled useProviderHealth hook below, so nothing to do here at @@ -330,15 +555,55 @@ .finally(() => setFreeProvidersLoading(false)); // Fetch persistent storage preflight before any large local downloads. - getStorageStatus() - .then(status => { + const loadStorage = async () => { + if (cancelled) return; + try { + const status = await getStorageStatus(); + if (cancelled) return; setStorageStatus(status); setStorageHomeInput(status.layout.home); + setStorageError(null); void refreshSetupHelper(status.layout.home); - }) - .catch(() => { - setStorageError('Storage preflight is unavailable. 
Start the API with nvh serve.'); - }); + + if (!status.ok || status.configured_by === 'default') { + try { + const report = await getMountAutopilot(20); + if (cancelled) return; + setMountAutopilot(report); + if (report.recommended) { + setStorageHomeInput(report.recommended.recommended_home); + } + if (shouldAutoActivateStorage(status, report)) { + setMountActivating(true); + try { + const activated = await activateMountAutopilot(report.recommended?.recommended_home, 20); + if (cancelled) return; + setStorageStatus(activated.storage); + setStorageHomeInput(activated.storage.layout.home); + setMountAutopilot(activated.mount_autopilot); + setWizardBuildMessage('nvWizard found the large writable block volume and prepared it for models, ComfyUI, Blender, and agents.'); + void refreshSetupHelper(activated.storage.layout.home); + void refreshSetupInventory(false, activated.storage.layout.home); + } catch (err) { + setStorageError(err instanceof Error ? err.message : 'Could not activate recommended persistent storage'); + } finally { + setMountActivating(false); + } + } + } catch { + if (!cancelled) { + setStorageError('Mount autopilot could not inspect persistent volumes yet. You can still paste NVH_HOME.'); + } + } + } + } catch { + if (!cancelled) { + setStorageError('Storage preflight is unavailable. Start the API with nvh serve.'); + storageRetryTimer = setTimeout(() => void loadStorage(), 5000); + } + } + }; + void loadStorage(); // Fetch GPU info for the GPU step setGpuLoading(true); @@ -377,7 +642,8 @@ export default function SetupPage() { setStudioPacks(data.packs); setStudioBundles(data.bundles); setStudioRoot(data.root); - setSelectedStudioPacks(new Set(data.bundles.starter ?? data.packs.map(pack => pack.id))); + const starterIds = data.bundles.starter ?? 
data.packs.map(pack => pack.id); + setSelectedStudioPacks(new Set(selectableStudioPackIds(data.packs, starterIds))); }) .catch(() => {}) .finally(() => setStudioLoading(false)); @@ -392,7 +658,13 @@ export default function SetupPage() { }) .catch(() => {}) .finally(() => setModelsLoading(false)); - }, [refreshSetupHelper]); + + return () => { + cancelled = true; + if (healthRetryTimer) clearTimeout(healthRetryTimer); + if (storageRetryTimer) clearTimeout(storageRetryTimer); + }; + }, [refreshSetupHelper, refreshSetupInventory]); const handleTest = async () => { setTestLoading(true); @@ -452,6 +724,7 @@ export default function SetupPage() { refreshStudioPacks(), refreshComfyUI(), refreshSetupHelper(status.layout.home), + refreshSetupInventory(false, status.layout.home), ]); } catch (err) { setStorageError(err instanceof Error ? err.message : 'Could not configure persistent storage'); @@ -487,7 +760,8 @@ export default function SetupPage() { setStudioRoot(data.root); setSelectedStudioPacks(prev => { if (prev.size > 0) return prev; - return new Set(data.bundles.starter ?? data.packs.map(pack => pack.id)); + const starterIds = data.bundles.starter ?? data.packs.map(pack => pack.id); + return new Set(selectableStudioPackIds(data.packs, starterIds)); }); } catch { // keep current pack state @@ -497,6 +771,11 @@ export default function SetupPage() { }; const toggleStudioPack = (packId: string) => { + const pack = studioPacks.find(item => item.id === packId); + if (pack && !studioPackInstallable(pack)) { + setStudioError(studioPackBlockedReason(pack) || `${pack.title} is blocked on this host.`); + return; + } setSelectedStudioPacks(prev => { const next = new Set(prev); if (next.has(packId)) next.delete(packId); @@ -507,51 +786,96 @@ export default function SetupPage() { const selectStudioBundle = (bundleId: string) => { const packIds = studioBundles[bundleId] ?? 
[]; - setSelectedStudioPacks(new Set(packIds)); + setSelectedStudioPacks(new Set(selectableStudioPackIds(studioPacks, packIds))); }; - const applyWizardProfile = (profile: WizardProfile) => { - const recommendedModels = studioModels + const expandStudioPackGroups = ( + groups: string[], + packs: StudioPack[] = studioPacks, + bundles: Record<string, string[]> = studioBundles, + ) => ( + selectableStudioPackIds( + packs, + groups.flatMap(group => bundles[group] ?? [group]) + ) + ); + + const wizardProfilePackIds = ( + profile: WizardProfile, + packs: StudioPack[] = studioPacks, + bundles: Record<string, string[]> = studioBundles, + ) => { + if (profile === 'student') { + return expandStudioPackGroups(['rootless-ollama', 'agent-lab', 'nvidia-omni-agent'], packs, bundles); + } + if (profile === 'llm') { + return expandStudioPackGroups(['rootless-ollama'], packs, bundles); + } + if (profile === 'creator') { + return expandStudioPackGroups(['rootless-ollama', 'creative', 'comfy', 'github-login-helper'], packs, bundles); + } + if (profile === 'agent') { + return expandStudioPackGroups(['rootless-ollama', 'agents', 'claw'], packs, bundles); + } + if (profile === 'game') { + return expandStudioPackGroups(['rootless-ollama', 'game', 'creative', 'comfy'], packs, bundles); + } + if (profile === 'music') { + return expandStudioPackGroups(['rootless-ollama', 'music'], packs, bundles); + } + return expandStudioPackGroups(['all'], packs, bundles) + .filter(packId => packId !== 'llm-starter' && packId !== 'llm-coder-reasoner'); + }; + + const wizardProfileStep = (profile: WizardProfile): Step => { + if (profile === 'creator' || profile === 'game') return 'comfyui'; + if (profile === 'student' || profile === 'llm' || profile === 'agent' || profile === 'music') return 'studio'; + return 'models'; + }; + + const wizardProfileNeedsComfy = (profile: WizardProfile) => ( + profile === 'creator' || profile === 'game' || profile === 'full' + ); + + const wizardProfileModelIds = (profile: WizardProfile, models: StudioModel[] = 
studioModels) => { + const recommendedModels = models .filter(model => model.recommended) .map(model => model.id); - const allModelIds = studioModels + const allModelIds = models .filter(model => model.recommended || model.fits_vram) .map(model => model.id); - const vramLimit = detectedModelVram || 12; - const starterPackIds = studioBundles.starter ?? studioPacks.map(pack => pack.id); - const creativePackIds = studioBundles.creative ?? ['blender-creative', 'game-dev-lab', 'game-mod-helper']; - const starterExamples = visibleComfyExamples - .filter(example => example.recommended_vram_gb <= vramLimit) - .map(example => example.id); - if (profile === 'student') { - setSelectedStudioModels(new Set(recommendedModels)); - setSelectedStudioPacks(new Set(starterPackIds)); - setSelectedComfyExamples(new Set(starterExamples)); - setStep('models'); - return; + if (profile === 'agent') { + const agentModels = models + .filter(model => ['code', 'embedding'].includes(model.category)) + .filter(model => model.recommended || model.fits_vram) + .map(model => model.id); + return agentModels.length ? agentModels : recommendedModels; } - if (profile === 'creator') { - setSelectedStudioModels(new Set(recommendedModels)); - setSelectedStudioPacks(new Set([...(studioBundles.comfy ?? ['comfyui-power-nodes']), ...creativePackIds])); - setSelectedComfyExamples(new Set(starterExamples)); - setStep('comfyui'); - return; + if (profile === 'full') { + return allModelIds.length ? 
allModelIds : recommendedModels; } - if (profile === 'game') { - setSelectedStudioModels(new Set(recommendedModels)); - setSelectedStudioPacks(new Set(creativePackIds)); - setSelectedComfyExamples(new Set(starterExamples)); - setStep('studio'); - return; - } + return recommendedModels; + }; + + const wizardProfileExampleIds = ( + examples: ComfyUIExample[] = visibleComfyExamples, + vramGb: number = detectedModelVram, + ) => { + const vramLimit = vramGb || 12; + const starterExamples = examples + .filter(example => example.recommended_vram_gb <= vramLimit) + .map(example => example.id); + return starterExamples; + }; - setSelectedStudioModels(new Set(allModelIds.length ? allModelIds : recommendedModels)); - setSelectedStudioPacks(new Set(studioBundles.all ?? studioPacks.map(pack => pack.id))); - setSelectedComfyExamples(new Set(starterExamples)); - setStep('models'); + const applyWizardProfile = (profile: WizardProfile) => { + setSelectedStudioModels(new Set(wizardProfileModelIds(profile))); + setSelectedStudioPacks(new Set(wizardProfilePackIds(profile))); + setSelectedComfyExamples(new Set(wizardProfileExampleIds())); + setStep(wizardProfileStep(profile)); }; const refreshStudioModels = async () => { @@ -571,6 +895,82 @@ export default function SetupPage() { } }; + const ensureWizardCatalogReady = async () => { + let packs = studioPacks; + let bundles = studioBundles; + let models = studioModels; + let examples = visibleComfyExamples; + let vramGb = detectedModelVram; + + const [packData, modelData, comfyData] = await Promise.all([ + packs.length > 0 ? Promise.resolve(null) : getStudioPacks().catch(() => null), + models.length > 0 ? Promise.resolve(null) : getStudioModels().catch(() => null), + examples.length > 0 ? 
Promise.resolve(null) : getComfyUIStatus().catch(() => null), + ]); + + if (packData) { + packs = packData.packs; + bundles = packData.bundles; + setStudioPacks(packData.packs); + setStudioBundles(packData.bundles); + setStudioRoot(packData.root); + setSelectedStudioPacks(prev => { + if (prev.size > 0) return prev; + const starterIds = packData.bundles.starter ?? packData.packs.map(pack => pack.id); + return new Set(selectableStudioPackIds(packData.packs, starterIds)); + }); + } + + if (modelData) { + models = modelData.models; + vramGb = modelData.detected_vram_gb; + setStudioModels(modelData.models); + setDetectedModelVram(modelData.detected_vram_gb); + setSelectedStudioModels(prev => { + if (prev.size > 0) return prev; + return new Set(modelData.recommended_ids); + }); + } + + if (comfyData) { + setComfyStatus(comfyData); + if (comfyData.examples?.length) { + examples = comfyData.examples; + setComfyExamples(comfyData.examples); + } + } + + return { packs, bundles, models, examples, vramGb }; + }; + + const handleUseRecommendedStorage = async (): Promise<StorageStatus | null> => { + const recommendedHome = mountRecommendation?.recommended_home; + setMountActivating(true); + setStorageSaving(true); + setStorageError(null); + try { + const activated = await activateMountAutopilot(recommendedHome, 20); + setStorageStatus(activated.storage); + setStorageHomeInput(activated.storage.layout.home); + setMountAutopilot(activated.mount_autopilot); + setWizardBuildMessage('nvWizard prepared the persistent block volume. The big model treasure now lives somewhere that survives reboot.'); + await Promise.allSettled([ + refreshStudioModels(), + refreshStudioPacks(), + refreshComfyUI(), + refreshSetupHelper(activated.storage.layout.home), + refreshSetupInventory(false, activated.storage.layout.home), + ]); + return activated.storage; + } catch (err) { + setStorageError(err instanceof Error ? 
err.message : 'Could not activate recommended persistent storage'); + return null; + } finally { + setStorageSaving(false); + setMountActivating(false); + } + }; + const toggleStudioModel = (modelId: string) => { setSelectedStudioModels(prev => { const next = new Set(prev); @@ -592,14 +992,14 @@ export default function SetupPage() { )); }; - const handleInstallStudioModels = () => { + const handleInstallStudioModels = (modelIds?: string[]) => { if (modelsInstalling) return; if (!storageStatus?.ok || storageStatus.configured_by === 'default') { - setModelError('Set a persistent NVH_HOME on the mounted volume before downloading models.'); - setStep('storage'); + setModelError('nvWizard is finding persistent storage before downloading models.'); + void handleUseRecommendedStorage(); return; } - const selected = Array.from(selectedStudioModels); + const selected = modelIds?.length ? modelIds : Array.from(selectedStudioModels); if (selected.length === 0) { setModelError('Select at least one local model.'); return; @@ -630,21 +1030,30 @@ export default function SetupPage() { setModelsInstalling(false); refreshStudioModels(); void refreshInstallJobs(); + void refreshSetupInventory(false); + void refreshSetupHelper(storageStatus?.layout.home); }, onError: error => { setModelError(error); setModelsInstalling(false); void refreshInstallJobs(); + void refreshSetupHelper(storageStatus?.layout.home); }, } ); }; + const recommendedMissingModelIds = () => ( + studioModels + .filter(model => model.recommended && !model.installed) + .map(model => model.id) + ); + const handleInstallStudioPacks = (packIds?: string[]) => { if (studioInstalling) return; if (!storageStatus?.ok || storageStatus.configured_by === 'default') { - setStudioError('Set a persistent NVH_HOME on the mounted volume before installing packs.'); - setStep('storage'); + setStudioError('nvWizard is finding persistent storage before installing packs.'); + void handleUseRecommendedStorage(); return; } const selected = 
packIds?.length ? packIds : Array.from(selectedStudioPacks); @@ -679,11 +1088,14 @@ setStudioInstalling(false); refreshStudioPacks(); void refreshInstallJobs(); + void refreshSetupInventory(false); + void refreshSetupHelper(storageStatus?.layout.home); }, onError: error => { setStudioError(error); setStudioInstalling(false); void refreshInstallJobs(); + void refreshSetupHelper(storageStatus?.layout.home); }, } ); @@ -692,8 +1104,8 @@ const handleInstallComfyUI = () => { if (comfyInstalling) return; if (!storageStatus?.ok || storageStatus.configured_by === 'default') { - setComfyError('Set a persistent NVH_HOME on the mounted volume before installing ComfyUI.'); - setStep('storage'); + setComfyError('nvWizard is finding persistent storage before installing ComfyUI.'); + void handleUseRecommendedStorage(); return; } setComfyInstalling(true); @@ -701,7 +1113,7 @@ setComfyEvents([]); installComfyUIStream( - { torch_profile: 'nvidia-cu130', force_update: false }, + { torch_profile: recommendedTorchProfile, force_update: false }, { onJob: job => { mergeInstallJob(job); @@ -720,11 +1132,14 @@ setComfyInstalling(false); refreshComfyUI(); void refreshInstallJobs(); + void refreshSetupInventory(false); + void refreshSetupHelper(storageStatus?.layout.home); }, onError: error => { setComfyError(error); setComfyInstalling(false); void refreshInstallJobs(); + void refreshSetupHelper(storageStatus?.layout.home); }, } ); @@ -768,18 +1183,233 @@ } }; + const buildStudioPacks = (packIds: string[]) => new Promise<void>((resolve, reject) => { + if (packIds.length === 0) { + resolve(); + return; + } + + setStudioInstalling(true); + setStudioError(null); + setStudioEvents([]); + + installStudioPacksStream( + { pack_ids: packIds, force_update: false }, + { + onJob: job => mergeInstallJob(job), + onStatus: 
job => mergeInstallJob(job), + onEvent: event => { + setStudioEvents(prev => [...prev.slice(-10), event]); + if (event.status_snapshot) { + setStudioPacks(event.status_snapshot.packs); + setStudioBundles(event.status_snapshot.bundles); + setStudioRoot(event.status_snapshot.root); + } + }, + onComplete: event => { + setStudioEvents(prev => [...prev.slice(-10), event]); + setStudioInstalling(false); + void refreshStudioPacks(); + void refreshInstallJobs(); + void refreshSetupInventory(false); + void refreshSetupHelper(storageStatus?.layout.home); + resolve(); + }, + onError: error => { + setStudioError(error); + setStudioInstalling(false); + void refreshInstallJobs(); + void refreshSetupHelper(storageStatus?.layout.home); + reject(new Error(error)); + }, + } + ); + }); + + const buildStudioModels = (modelIds: string[]) => new Promise<void>((resolve, reject) => { + if (modelIds.length === 0) { + resolve(); + return; + } + + setModelsInstalling(true); + setModelError(null); + setModelEvents([]); + + installStudioModelsStream( + { model_ids: modelIds, force_update: false }, + { + onJob: job => mergeInstallJob(job), + onStatus: job => mergeInstallJob(job), + onEvent: event => { + setModelEvents(prev => [...prev.slice(-10), event]); + if (event.status_snapshot) { + setStudioModels(event.status_snapshot.models); + setDetectedModelVram(event.status_snapshot.detected_vram_gb); + } + }, + onComplete: event => { + setModelEvents(prev => [...prev.slice(-10), event]); + setModelsInstalling(false); + void refreshStudioModels(); + void refreshInstallJobs(); + void refreshSetupInventory(false); + void refreshSetupHelper(storageStatus?.layout.home); + resolve(); + }, + onError: error => { + setModelError(error); + setModelsInstalling(false); + void refreshInstallJobs(); + void refreshSetupHelper(storageStatus?.layout.home); + reject(new Error(error)); + }, + } + ); + }); + + const buildComfyUI = () => new Promise<void>((resolve, reject) => { + setComfyInstalling(true); + setComfyError(null); + 
setComfyEvents([]); + + installComfyUIStream( + { torch_profile: recommendedTorchProfile, force_update: false }, + { + onJob: job => mergeInstallJob(job), + onStatus: job => mergeInstallJob(job), + onEvent: event => { + setComfyEvents(prev => [...prev.slice(-8), event]); + if (event.status_snapshot) { + setComfyStatus(event.status_snapshot); + } + }, + onComplete: event => { + setComfyEvents(prev => [...prev.slice(-8), event]); + setComfyInstalling(false); + void refreshComfyUI(); + void refreshInstallJobs(); + void refreshSetupInventory(false); + void refreshSetupHelper(storageStatus?.layout.home); + resolve(); + }, + onError: error => { + setComfyError(error); + setComfyInstalling(false); + void refreshInstallJobs(); + void refreshSetupHelper(storageStatus?.layout.home); + reject(new Error(error)); + }, + } + ); + }); + + const handleBuildWizardProfile = async (profile: WizardProfile) => { + if (activeWizardBuild || studioInstalling || modelsInstalling || comfyInstalling || apiDisconnected) return; + + setActiveWizardBuild(profile); + setWizardBuildMessage('nvWizard is checking the mission catalog, hardware, and persistent storage.'); + + try { + const catalog = await ensureWizardCatalogReady(); + const modelIds = wizardProfileModelIds(profile, catalog.models); + const packIds = wizardProfilePackIds(profile, catalog.packs, catalog.bundles); + const exampleIds = wizardProfileExampleIds(catalog.examples, catalog.vramGb); + const comfyNodePackIds = packIds.filter(packId => packId === 'comfyui-power-nodes'); + const firstPackIds = wizardProfileNeedsComfy(profile) + ? 
packIds.filter(packId => packId !== 'comfyui-power-nodes') + : packIds; + + setSelectedStudioModels(new Set(modelIds)); + setSelectedStudioPacks(new Set(packIds)); + setSelectedComfyExamples(new Set(exampleIds)); + + if (!storageReady) { + setWizardBuildMessage('nvWizard is finding the persistent block storage first, then it will build the mission there.'); + const detectedStorage = await handleUseRecommendedStorage(); + if (!detectedStorage?.ok || detectedStorage.configured_by === 'default') { + setWizardBuildMessage('nvWizard could not prove the persistent storage path yet. Advanced Details has the manual override if the host is unusual.'); + setAdvancedSetupOpen(true); + return; + } + } + + setWizardBuildMessage('nvWizard picked the beginner-safe defaults and is building the mission in dependency order.'); + setStep(wizardProfileNeedsComfy(profile) ? 'comfyui' : 'studio'); + + if (firstPackIds.length > 0) { + setWizardBuildMessage('Installing rootless runtimes and mission tools on the persistent drive.'); + await buildStudioPacks(firstPackIds); + } + + if (wizardProfileNeedsComfy(profile)) { + setWizardBuildMessage('Installing ComfyUI with the NVIDIA-ready PyTorch profile.'); + await buildComfyUI(); + if (comfyNodePackIds.length > 0) { + setWizardBuildMessage('Installing ComfyUI power nodes after the base app is ready.'); + await buildStudioPacks(comfyNodePackIds); + } + if (exampleIds.length > 0) { + setWizardBuildMessage('Saving the starter workflow model plan beside ComfyUI.'); + try { + await saveComfyUIModelPlan(exampleIds); + } catch { + // The mission can still run; the user can save the plan again from the ComfyUI step. + } + } + } + + if (modelIds.length > 0) { + setWizardBuildMessage('Downloading the local model queue that fits this GPU profile.'); + await buildStudioModels(modelIds); + } + + setWizardBuildMessage('Mission build complete. 
Try the smoke test, then launch the tools.'); + setStep('test'); + } catch (err) { + setWizardBuildMessage(err instanceof Error ? `nvWizard paused: ${err.message}` : 'nvWizard paused: setup needs attention.'); + setAdvancedSetupOpen(true); + } finally { + setActiveWizardBuild(null); + void refreshInstallJobs(); + void refreshSetupInventory(false); + void refreshSetupHelper(storageStatus?.layout.home); + } + }; + const currentStepIdx = STEPS.findIndex(s => s.id === step); + const apiDisconnected = apiStatus === 'disconnected'; const storageReady = Boolean(storageStatus?.ok && storageStatus.configured_by !== 'default'); const storageFreeGb = storageStatus?.free_gb ?? null; - const profilesReady = Boolean(storageStatus) && !modelsLoading && !studioLoading && !comfyLoading; + const mountRecommendation = mountAutopilot?.recommended ?? missionControl?.mount_autopilot.recommended ?? bootPreflight?.mount_autopilot?.recommended ?? null; + const storageAutopilotBusy = !storageReady && !apiDisconnected && (mountActivating || storageSaving || apiStatus === 'checking' || storageStatus === null); + const storageBeginnerLabel = storageReady + ? 'ready' + : apiDisconnected + ? 'api offline' + : storageAutopilotBusy + ? 'finding' + : mountRecommendation + ? 'detected' + : 'checking'; + const storagePrimaryLabel = storageReady + ? 'Build AI Starter' + : apiDisconnected + ? 'API Offline' + : storageAutopilotBusy + ? 'Finding Storage' + : 'Auto-Find Storage'; + const profilesReady = !modelsLoading && !studioLoading && !comfyLoading; const visibleComfyExamples = comfyStatus?.examples?.length ? 
comfyStatus.examples : comfyExamples; const selectedComfyModelCount = new Set( visibleComfyExamples .filter(example => selectedComfyExamples.has(example.id)) .flatMap(example => example.models) ).size; - const selectedStudioPackIds = Array.from(selectedStudioPacks); + const selectedStudioPackIds = selectableStudioPackIds(studioPacks, Array.from(selectedStudioPacks)); const starterStudioPackIds = studioBundles.starter ?? []; + const clawStudioPackIds = studioBundles.claw ?? []; + const blockedStudioPackCount = studioPacks.filter(pack => !studioPackInstallable(pack)).length; const studioCategories = Array.from(new Set(studioPacks.map(pack => pack.category))); const selectedStudioPackDiskGb = studioPacks .filter(pack => selectedStudioPacks.has(pack.id)) @@ -792,59 +1422,495 @@ export default function SetupPage() { const activeInstallJobs = installJobs.filter(isActiveInstallJob); const visibleInstallJobs = installJobs.slice(0, 5); const helperActions = setupHelper?.actions.slice(0, 4) ?? []; + const helperIssues = setupHelper?.issues?.slice(0, 4) ?? []; + const visibleReceipts = setupReceipts?.receipts.slice(0, 5) ?? []; + const unhealthyReceiptCount = setupReceipts?.summary.unhealthy ?? setupHelper?.receipts?.unhealthy ?? 0; + const receiptCount = setupReceipts?.count ?? setupHelper?.receipts?.count ?? 0; + const catalogSource = setupCatalog?.source ?? setupHelper?.catalog?.source ?? 'bundled'; + const visibleCompatibilityApps = setupCompatibility?.apps + .filter(app => app.status !== 'ready') + .slice(0, 5) ?? []; + const compatibilityIssueCount = setupCompatibility?.issue_count ?? setupHelper?.compatibility?.issue_count ?? 0; + const compatibilityBlockedCount = setupCompatibility?.blocked_count ?? setupHelper?.compatibility?.blocked_count ?? 0; + const compatibilityFixableCount = setupCompatibility?.rootless_fixable_count ?? setupHelper?.compatibility?.rootless_fixable_count ?? 0; + const bootChangeCount = bootPreflight?.changes.length ?? 
setupHelper?.boot_preflight?.change_count ?? 0; + const bootAgentHelper = bootPreflight?.agent_helper ?? setupHelper?.boot_preflight?.agent_helper; + const missionStages = missionControl?.stages ?? []; + const autoRepair = missionControl?.auto_repair ?? bootPreflight?.auto_repair ?? null; + const autoRepairActions = autoRepair && 'actions' in autoRepair + ? autoRepair.actions + : autoRepair && 'plan' in autoRepair + ? autoRepair.plan.actions + : []; + const smokeTests = missionControl?.smoke_tests ?? bootPreflight?.smoke_tests ?? null; + const modelFit = missionControl?.model_fit ?? bootPreflight?.model_fit ?? null; + const visibleReadinessGates = productionReadiness?.gates + .filter(gate => gate.status !== 'pass') + .slice(0, 4) ?? []; + const diagnosticsLogLineCount = diagnosticsReport?.logs.recent.reduce( + (total, item) => total + item.lines.length, + 0 + ) ?? 0; + const readinessTone = productionReadiness?.status === 'production-ready' + ? 'text-[#76B900]' + : productionReadiness?.status === 'blocked' + ? 'text-[#dc2626]' + : 'text-[#d97706]'; + const detectedTorchProfile = setupCompatibility?.recommended_torch_profile + ?? setupHelper?.compatibility?.recommended_torch_profile + ?? 'nvidia-cu121'; + const recommendedTorchProfile: ComfyUITorchProfile = ( + ['nvidia-cu130', 'nvidia-cu121', 'cpu', 'skip'].includes(detectedTorchProfile) + ? detectedTorchProfile + : 'nvidia-cu121' + ) as ComfyUITorchProfile; + const setupConcernCount = + (setupInventoryError ? 1 : 0) + + (setupHelperError ? 1 : 0) + + unhealthyReceiptCount + + compatibilityIssueCount + + bootChangeCount; + const showAdvancedSetup = advancedSetupOpen; + const showInstallJobs = activeInstallJobs.length > 0 || (advancedSetupOpen && (visibleInstallJobs.length > 0 || jobsError)); + const anyInstallRunning = Boolean(activeWizardBuild) || studioInstalling || modelsInstalling || comfyInstalling; + const topHelperAction = helperActions[0] ?? null; + const catalogProfiles = setupCatalog?.catalog.profiles ?? 
[]; + const catalogProfileFor = (profileId: WizardProfile) => ( + catalogProfiles.find(profile => profile.id === profileId) ?? null + ); + const catalogText = (profileId: WizardProfile, key: 'title' | 'description', fallback: string) => { + const value = catalogProfileFor(profileId)?.[key]; + return typeof value === 'string' ? value : fallback; + }; + const diskForPackIds = (packIds: string[]) => studioPacks + .filter(pack => packIds.includes(pack.id)) + .reduce((total, pack) => total + pack.estimated_disk_gb, 0); + const diskForModelIds = (modelIds: string[]) => studioModels + .filter(model => modelIds.includes(model.id)) + .reduce((total, model) => total + model.estimated_disk_gb, 0); + const hasCatalogSizing = studioPacks.length > 0 || studioModels.length > 0; + const recommendedHardwareModels = studioModels + .filter(model => model.recommended) + .sort((a, b) => a.priority - b.priority) + .slice(0, 4); + const visibleHardwareModels = ( + recommendedHardwareModels.length > 0 + ? recommendedHardwareModels + : studioModels + .filter(model => model.fits_vram) + .sort((a, b) => a.priority - b.priority) + .slice(0, 4) + ); + const hardwareName = gpuInfo?.gpus?.[0]?.name ?? 'GPU scan pending'; + const hardwareVramLabel = detectedModelVram ? `${detectedModelVram} GB VRAM` : 'VRAM scan pending'; + const gpuDetectionStatus = gpuInfo?.detection?.status ?? 'checking'; + const gpuDetectionIssue = gpuInfo?.detection?.issues?.[0]?.message ?? ''; + const visibleHardwareModelIds = visibleHardwareModels.map(model => model.id); + const githubPack = studioPacks.find(pack => pack.id === 'github-login-helper') ?? 
null; + const gameEnginePacks = studioPacks.filter(pack => ['godot-engine', 'unity-hub-helper', 'unreal-engine-helper'].includes(pack.id)); + const modelPickPreview = visibleHardwareModels.slice(0, 3); + const softwareHighlights: Array<{ id: string; label: string; logo: BrandLogoId; sub: string; tone: string }> = [ + { id: 'openclaw', label: 'OpenClaw', logo: 'openclaw', sub: studioPacks.find(pack => pack.id === 'openclaw-agent')?.status.installed ? 'Ready' : 'Agent', tone: 'bg-white border-[#e5e5e5]' }, + { id: 'nemoclaw', label: 'NemoClaw', logo: 'nvidia', sub: studioPacks.find(pack => pack.id === 'nemoclaw-sandbox')?.status.installed ? 'Ready' : 'Guarded', tone: 'bg-white border-[#76B900]/40' }, + { id: 'comfyui', label: 'ComfyUI', logo: 'comfyui', sub: comfyStatus?.installed ? 'Ready' : 'Images', tone: 'bg-white border-[#e5e5e5]' }, + { id: 'blender', label: 'Blender', logo: 'blender', sub: studioPacks.find(pack => pack.id === 'blender-creative')?.status.installed ? 'Ready' : '3D', tone: 'bg-white border-[#e5e5e5]' }, + { id: 'audacity', label: 'Audacity', logo: 'audacity', sub: studioPacks.find(pack => pack.id === 'music-daw-helper')?.status.installed ? 'Ready' : 'Audio', tone: 'bg-white border-[#e5e5e5]' }, + { id: 'lmms', label: 'LMMS', logo: 'lmms', sub: studioPacks.find(pack => pack.id === 'music-daw-helper')?.status.installed ? 'Ready' : 'Beats', tone: 'bg-white border-[#e5e5e5]' }, + { id: 'godot', label: 'Godot', logo: 'godot', sub: studioPacks.find(pack => pack.id === 'godot-engine')?.status.installed ? 'Ready' : 'Games', tone: 'bg-white border-[#e5e5e5]' }, + { id: 'github', label: 'GitHub', logo: 'github', sub: githubPack?.status.installed ? 
'Ready' : 'Repos', tone: 'bg-white border-[#e5e5e5]' }, + { id: 'unity', label: 'Unity', logo: 'unity', sub: 'Helper', tone: 'bg-white border-[#e5e5e5]' }, + { id: 'unreal', label: 'Unreal', logo: 'unreal', sub: 'Helper', tone: 'bg-white border-[#e5e5e5]' }, + ]; + const repoAndGameHighlights = softwareHighlights.filter(item => ['github', 'godot', 'unity', 'unreal'].includes(item.id)); + const missionProfiles: Array<{ + id: WizardProfile; + title: string; + description: string; + label: string; + outcome: string; + includes: string[]; + logos: BrandLogoId[]; + primary?: boolean; + advanced?: boolean; + }> = [ + { + id: 'student', + title: catalogText('student', 'title', 'AI Starter'), + description: catalogText('student', 'description', 'Chat, research, homework, coding help, starter local models, and optional NVIDIA Omni Agent guidance.'), + label: 'Recommended', + outcome: 'A practical local AI desk for classes, projects, notes, and first model experiments.', + includes: ['Rootless Ollama', 'Starter models', 'Omni option', 'Agent lab'], + logos: ['ollama', 'github', 'openclaw', 'nvidia'], + primary: true, + }, + { + id: 'llm', + title: 'Local LLM Lab', + description: 'Build a local chat, code, and embeddings bench without touching the base OS.', + label: 'LLMs', + outcome: 'Compare local models, write code, summarize notes, and stay offline-friendly.', + includes: ['Ollama runtime', 'VRAM-fit models', 'Coder model', 'Embeddings'], + logos: ['ollama', 'nvidia'], + advanced: true, + }, + { + id: 'creator', + title: catalogText('creator', 'title', 'Graphics Creator Studio'), + description: catalogText('creator', 'description', 'Image generation, ComfyUI workflows, Blender, and 3D asset helpers.'), + label: 'Creative', + outcome: 'Generate images, prep video/3D workflows, and keep creative assets on persistent storage.', + includes: ['ComfyUI', 'Power nodes', 'Blender LTS', 'GitHub connect'], + logos: ['comfyui', 'blender', 'github', 'nvidia'], + }, + { + id: 
'agent', + title: catalogText('agent', 'title', 'Agent Builder'), + description: catalogText('agent', 'description', 'Local agent libraries, a coding model, and embeddings.'), + label: 'Agents', + outcome: 'Install the local agent lab, OpenClaw, and NemoClaw only when the host can support it.', + includes: ['Agent lab', 'OpenClaw', 'Conditional NemoClaw', 'Coder model'], + logos: ['openclaw', 'nvidia', 'ollama'], + advanced: true, + }, + { + id: 'game', + title: catalogText('game', 'title', 'Game Dev Lab'), + description: catalogText('game', 'description', 'Game prototyping, Blender assets, and mod helper workspace.'), + label: 'Game', + outcome: 'Prototype games, build assets, connect repos, and keep engines on persistent storage.', + includes: ['Godot', 'Unity helper', 'Unreal helper', 'GitHub connect'], + logos: ['godot', 'unity', 'unreal', 'github'], + }, + { + id: 'music', + title: catalogText('music', 'title', 'Music Producer Studio'), + description: catalogText('music', 'description', 'AI music generation, stem separation, transcription, and rootless DAW helpers.'), + label: 'Music', + outcome: 'Create songs, split stems, transcribe vocals, clean audio, and launch music tools from the persistent drive.', + includes: ['ACE-Step', 'Demucs', 'WhisperX', 'Audacity/LMMS'], + logos: ['audacity', 'lmms', 'nvidia', 'github'], + }, + { + id: 'full', + title: catalogText('full', 'title', 'Power User Workstation'), + description: catalogText('full', 'description', 'Everything nvHive can install without root access, guarded by host checks.'), + label: 'Power', + outcome: 'Install every supported rootless tool that passes the host checks.', + includes: ['LLMs', 'Agents', 'ComfyUI', 'Blender', 'Game', 'Music'], + logos: ['nvidia', 'ollama', 'comfyui', 'blender', 'audacity', 'github'], + advanced: true, + }, + ]; + + const beginnerProfileIds = new Set(['student', 'creator', 'game', 'music']); + const beginnerProfiles = missionProfiles.filter(profile => 
beginnerProfileIds.has(profile.id)); + const beginnerProfileCopy: Partial<Record<WizardProfile, string>> = { + student: 'Local AI for classwork, coding, research, plus optional NVIDIA Omni Agent guidance.', + creator: 'ComfyUI, Blender, and creative helpers for images, 3D, and video workflows.', + game: 'Game engine helpers, Blender assets, GitHub repos, and mod workspace tools.', + music: 'AI music generation, stem separation, transcription, and audio editor helpers.', + }; + const selectedProfile = missionProfiles.find(profile => profile.id === selectedWizardProfile) ?? missionProfiles[0]; + const selectedProfilePackIds = wizardProfilePackIds(selectedProfile.id); + const selectedProfileModelIds = wizardProfileModelIds(selectedProfile.id); + const selectedProfileDiskGb = diskForPackIds(selectedProfilePackIds) + diskForModelIds(selectedProfileModelIds); + const selectedProfilePacks = studioPacks.filter(pack => selectedProfilePackIds.includes(pack.id)); + const selectedProfileModels = studioModels.filter(model => selectedProfileModelIds.includes(model.id)); + const pythonFact = setupCompatibility?.facts.find(fact => fact.id.toLowerCase().includes('python') || fact.label.toLowerCase().includes('python')); + const nodeFact = setupCompatibility?.facts.find(fact => fact.id.toLowerCase().includes('node') || fact.label.toLowerCase().includes('node')); + const systemCheckItems: Array<{ label: string; value: string; state: SetupCheckState }> = [ + { + label: 'Storage', + value: storageReady ? (storageFreeGb === null ? 'persistent ready' : `${storageFreeGb} GB free`) : storageBeginnerLabel, + state: storageReady ? 'ready' : storageAutopilotBusy ? 'checking' : 'fix', + }, + { + label: 'GPU / CUDA', + value: gpuLoading + ? 'scanning' + : gpuInfo?.gpus?.length + ? `${gpuInfo.gpus[0].name} / CUDA ${gpuInfo.gpus[0].cuda_version}` + : 'CPU fallback', + state: gpuLoading ? 'checking' : gpuInfo?.gpus?.length ? 'ready' : 'warn', + }, + { + label: 'Python env', + value: pythonFact?.value ?? 
recommendedTorchProfile, + state: setupCompatibility ? compatibilityBlockedCount > 0 ? 'fix' : compatibilityIssueCount > 0 ? 'warn' : 'ready' : 'checking', + }, + { + label: 'Node', + value: nodeFact?.value ?? (setupCompatibility ? 'checked' : 'pending'), + state: nodeFact?.status === 'blocked' ? 'fix' : nodeFact?.status === 'warning' || nodeFact?.status === 'fixable' ? 'warn' : setupCompatibility ? 'ready' : 'checking', + }, + { + label: 'GitHub', + value: githubPack?.status.installed ? 'helper ready' : 'optional login', + state: githubPack?.status.installed ? 'ready' : 'warn', + }, + { + label: 'Health', + value: apiStatus === 'checking' + ? 'checking' + : apiDisconnected + ? 'API offline' + : setupConcernCount ? `${setupConcernCount} item${setupConcernCount === 1 ? '' : 's'}` : 'clear', + state: apiStatus === 'checking' ? 'checking' : apiDisconnected ? 'fix' : setupConcernCount ? 'fix' : 'ready', + }, + ]; + + const runHelperAction = (actionId: string) => { + if (actionId.startsWith('repair-receipt:')) { + const receiptId = actionId.slice('repair-receipt:'.length); + const receipt = [ + ...(setupReceipts?.receipts ?? []), + ...(setupHelper?.receipts?.receipts ?? 
[]), + ].find(item => item.id === receiptId); + if (receipt) handleRepairReceipt(receipt); + else void refreshSetupInventory(false); + return; + } + if (actionId === 'storage') { + void handleUseRecommendedStorage(); + return; + } + if (actionId === 'starter-models') { + const missing = recommendedMissingModelIds(); + if (missing.length > 0) handleInstallStudioModels(missing); + else setStep('models'); + return; + } + if (actionId === 'rootless-ollama') { + handleInstallStudioPacks(['rootless-ollama']); + return; + } + if (actionId === 'runtime-fallback') { + handleInstallStudioPacks(['python-runtime-fallback']); + return; + } + if (actionId === 'comfyui' || actionId === 'comfyui-examples') { + handleInstallComfyUI(); + return; + } + if (actionId === 'creative-tools') { + handleInstallStudioPacks(['creative']); + return; + } + if (actionId === 'music-tools') { + handleInstallStudioPacks(['music']); + return; + } + if (actionId === 'claw-agents') { + const installableClawIds = selectableStudioPackIds(studioPacks, studioBundles.claw ?? ['openclaw-agent', 'nemoclaw-sandbox']); + if (installableClawIds.length > 0) handleInstallStudioPacks(installableClawIds); + else { + setStudioError('No Claw agent option is installable on this host yet. 
Check Node.js and Docker/OpenShell readiness in Advanced Details.'); + setStep('studio'); + } + return; + } + if (actionId === 'repair-workspace') { + void handleRepairWorkspace(); + return; + } + if (actionId === 'smoke-tests') { + setStep('test'); + void refreshSetupInventory(false); + return; + } + if (actionId === 'repair-receipts') { + void refreshSetupInventory(false); + return; + } + if (studioPacks.some(pack => pack.id === actionId)) { + handleInstallStudioPacks([actionId]); + return; + } + setStep('studio'); + }; - const goToHelperAction = (actionId: string) => { - if (actionId === 'storage') setStep('storage'); - else if (actionId === 'starter-models') setStep('models'); - else if (actionId === 'comfyui' || actionId === 'comfyui-examples') setStep('comfyui'); - else setStep('studio'); + const helperActionLabel = (actionId: string) => { + if (apiDisconnected) return 'API Offline'; + if (actionId.startsWith('repair-receipt:')) return !storageReady ? 'Auto Storage' : 'Repair'; + if (actionId === 'storage') return storageAutopilotBusy ? 'Finding' : 'Auto Storage'; + if (!storageReady) return storageAutopilotBusy ? 'Finding' : 'Auto Storage'; + if (actionId === 'starter-models') return modelsInstalling ? 'Downloading' : 'Download'; + if (actionId === 'comfyui' || actionId === 'comfyui-examples') { + return comfyInstalling ? 'Installing' : 'Install'; + } + if (actionId === 'repair-workspace') return workspaceRepairing ? 'Repairing' : 'Repair'; + if (actionId === 'smoke-tests') return 'Open'; + if (actionId === 'repair-receipts') return 'Review'; + return studioInstalling ? 
'Installing' : 'Run'; + }; + + const helperActionDisabled = (actionId: string) => { + if (apiDisconnected) return true; + if (actionId.startsWith('repair-receipt:')) return !storageReady || studioInstalling || modelsInstalling || comfyInstalling; + if (actionId === 'storage') return storageAutopilotBusy; + if (actionId === 'starter-models') return modelsInstalling || !storageReady; + if (actionId === 'comfyui' || actionId === 'comfyui-examples') { + return comfyInstalling || !storageReady; + } + if (actionId === 'repair-workspace') return workspaceRepairing; + if (actionId === 'smoke-tests' || actionId === 'repair-receipts') return false; + return studioInstalling || !storageReady; + }; + + const handleRepairReceipt = (receipt: InstallReceipt) => { + if (receipt.kind === 'comfyui') { + setStep('comfyui'); + handleInstallComfyUI(); + return; + } + if (receipt.kind === 'studio-model') { + setStep('models'); + handleInstallStudioModels([receipt.item_id]); + return; + } + if (receipt.kind === 'studio-pack') { + setStep('studio'); + handleInstallStudioPacks([receipt.item_id]); + return; + } + setStep('studio'); }; return ( -
+
{/* Header */} -
-
-
-
First-Time Setup
-

Setup Wizard

-

Get Hive configured and running in minutes

+
+
+
nvWizard Setup
+
+ {advancedSetupOpen && ( +
+ {STEPS.map((s, i) => ( +
+ + {i < STEPS.length - 1 && ( +
+ )} +
+ ))} +
+ )}
- {/* Step indicator */} -
- {STEPS.map((s, i) => ( -
- - {i < STEPS.length - 1 && ( -
- )} + {step !== 'welcome' && ( +
+
+
+
Beginner Mode
+
+ {storageReady ? 'Start with the recommended lab' : 'nvWizard is finding persistent storage'} +
+
+ nvWizard checks storage, GPU, CUDA, Python, ComfyUI, models, and install receipts, then recommends the next safe action. Manual commands stay available under Advanced Details. +
+ {topHelperAction && ( +
+
Recommended next
+
{topHelperAction.title}
+
{topHelperAction.reason}
+
+ )} +
+
+ + + +
- ))} -
+
+
+
Storage
+
+ {storageBeginnerLabel} +
+
+
+
Checks
+
+ {setupConcernCount ? `${setupConcernCount} to review` : 'clear'} +
+
+
+
Jobs
+
+ {activeInstallJobs.length ? `${activeInstallJobs.length} running` : 'idle'} +
+
+
+
API
+
+ {apiStatus === 'connected' ? 'online' : apiDisconnected ? 'offline' : 'checking'} +
+
+
+
+ )} - {(visibleInstallJobs.length > 0 || jobsError) && ( + {showInstallJobs && (
@@ -916,17 +1982,431 @@ export default function SetupPage() { />
- ); - })} -
+ ); + })} +
+
+ )} + + {showAdvancedSetup && (setupReceipts || setupCatalog || setupInventoryError) && ( +
+
+
+
Setup Inventory
+
+ {receiptCount} receipt{receiptCount === 1 ? '' : 's'} tracked / catalog source {catalogSource} +
+
+
+ + + +
+
+ {setupInventoryError && ( +
+ {setupInventoryError} +
+ )} + {(diagnosticsMessage || diagnosticsError || diagnosticsReport) && ( +
+
+
+
Error Report
+
+ Redacted diagnostics for support; API keys and bearer tokens are masked. +
+
+ {diagnosticsReport && ( + + {diagnosticsLogLineCount} log line{diagnosticsLogLineCount === 1 ? '' : 's'} + + )} +
+ {diagnosticsMessage && ( +
{diagnosticsMessage}
+ )} + {diagnosticsError && ( +
{diagnosticsError}
+ )} + {diagnosticsReport && ( +
+ + Report summary + +
+
+
Report
+
{diagnosticsReport.report_id}
+
+
+
Logs
+
{diagnosticsReport.logs.files.length} file(s)
+
+
+
Home
+
{diagnosticsReport.paths.home}
+
+
+ {diagnosticsReport.logs.recent.length > 0 && ( +
+ {diagnosticsReport.logs.recent.slice(0, 2).map(item => ( +
+
{item.path}
+
+ {item.lines.slice(-4).map((line, index) => ( +
+ {line} +
+ ))} +
+
+ ))} +
+ )} +
+ )} +
+ )} +
+
+
Receipts
+
{receiptCount}
+
+
+
Needs Repair
+
{unhealthyReceiptCount}
+
+
+
Profiles
+
+ {setupCatalog?.catalog.profiles.length ?? setupHelper?.catalog?.profile_count ?? 0} +
+
+
+
Compat
+
+ {compatibilityIssueCount} +
+
+
+ {(bootPreflight || setupHelper?.boot_preflight) && ( +
+
+
+
nvWizard Boot Watch
+
+ {bootPreflight?.summary ?? setupHelper?.boot_preflight?.summary ?? 'Boot preflight runs when nvHive launches.'} +
+
+
+ + {bootChangeCount ? `${bootChangeCount} shift${bootChangeCount === 1 ? '' : 's'}` : 'image steady'} + + + {bootAgentHelper?.local_agent_ready ? 'agent awake' : 'offline guide'} + +
+
+
+ {bootAgentHelper?.summary ?? 'Offline setup helper is available before any cloud or local model is installed.'} +
+ {bootAgentHelper?.recommended_action_id && ( + + )} + {bootPreflight?.changes && bootPreflight.changes.length > 0 && ( +
+ {bootPreflight.changes.slice(0, 5).map(change => ( +
+ + {change.label}: {change.before} {'->'} {change.after} +
+ ))} +
+ )} +
+ )} + {missionControl && ( +
+
+
+
Mission Timeline
+
{missionControl.summary}
+
+ +
+
+ {missionStages.slice(0, 6).map(stage => ( + + ))} +
+
+
+
Mount Autopilot
+
+ {mountRecommendation?.recommended_home ?? 'No mount picked yet'} +
+
+ score {mountRecommendation?.score ?? 0} + {mountRecommendation?.fs_type ? ` / ${mountRecommendation.fs_type}` : ''} + {mountRecommendation?.large_block_mount ? ' / block' : ''} + {mountRecommendation?.read_only ? ' / read-only' : ''} +
+
+
+
Auto Repair
+
+ {autoRepairActions.filter(action => action.safe_to_auto_run).length} safe / {autoRepairActions.filter(action => !action.safe_to_auto_run).length} confirm +
+
+ env, catalog, examples only +
+
+
+
Smoke Tests
+
+ {smokeTests?.summary ?? 'Waiting for checks'} +
+
+ models: {modelFit?.recommended_ids?.slice(0, 3).join(', ') || 'no queue'} +
+
+
+
+ )} + {productionReadiness && ( +
+
+
+
Release Readiness
+
+ {productionReadiness.summary} +
+
+
+ + {productionReadiness.status} + + + {productionReadiness.counts.blocked} blocked + + + {productionReadiness.counts.warnings} review + +
+
+ {visibleReadinessGates.length > 0 ? ( +
+ {visibleReadinessGates.map(gate => ( +
+
+ +
{gate.title}
+ {gate.status} +
+
+ {gate.summary} +
+ {gate.recommendation && ( +
+ {gate.recommendation} +
+ )} +
+ ))} +
+ ) : ( +
+ All release gates are passing. +
+ )} +
+ + Target VM checklist + +
+ {productionReadiness.target_vm_checklist.map(item => ( +
+ + {item} +
+ ))} +
+
+
+ )} + {(setupCompatibility || setupHelper?.compatibility) && ( +
+
+
+
Compatibility Preflight
+
+ {setupCompatibility?.summary ?? setupHelper?.compatibility?.summary ?? 'Host/app compatibility checks'} +
+
+
+ + {compatibilityBlockedCount} blocked + + + {compatibilityFixableCount} fixable + + + {setupCompatibility?.recommended_torch_profile ?? setupHelper?.compatibility?.recommended_torch_profile ?? 'torch auto'} + +
+
+ {visibleCompatibilityApps.length > 0 && ( +
+ {visibleCompatibilityApps.map(app => ( +
+
+
+
+ +
{app.title}
+ + {app.status} + +
+
+ {app.summary} +
+
+ + Requirements + +
+ {app.requirements.map(req => ( +
+ + {req.label}: {req.detail} +
+ ))} +
+
+
+ {app.recommended_action_id && ( + + )} +
+
+ ))} +
+ )} +
+ )} + {visibleReceipts.length > 0 && ( +
+ {visibleReceipts.map(receipt => ( +
+
+
+ +
{receipt.title}
+ + {receipt.kind} + +
+
{receipt.install_path}
+
+ +
+ ))} +
+ )}
)} - {(setupHelper || setupHelperError) && ( + {showAdvancedSetup && (setupHelper || setupHelperError) && (
-
Local Setup Helper
+
nvWizard Troubleshooting
{setupHelper?.summary ?? 'Offline setup recommendations'}
@@ -945,206 +2425,655 @@ export default function SetupPage() {
)} {setupHelper && ( -
- {helperActions.map(action => ( - + )} +
+ ))} +
+ )} +
+ {helperActions.map(action => ( +
+
+
{action.title}
+ + {action.status} + +
+
+ {action.reason} +
+
+ + +
+
+ + Manual override + +
+ {action.command} +
+
-
- {action.reason} + ))} +
+ +
+
+
+
Ask nvWizard
+
+ Setup guidance from jobs, receipts, storage, and catalog state +
-
- {action.command} + + {setupHelper.assistant?.mode ?? 'offline'} + +
+
+ setAssistantQuestion(event.target.value)} + onKeyDown={event => { if (event.key === 'Enter') void handleAskAssistant(); }} + placeholder="What is blocked? Why did ComfyUI fail?" + className="input-base flex-1 px-3 py-2 text-xs font-mono" + /> + +
+ {assistantError && ( +
+ {assistantError}
- - ))} + )} + {assistantReply && ( +
+
+ {assistantReply.answer} +
+ {assistantReply.actions.length > 0 && ( +
+ {assistantReply.actions.slice(0, 3).map(action => ( + + ))} +
+ )} + {assistantReply.commands.length > 0 && ( +
+ + Manual overrides + +
+ {assistantReply.commands.map(command => ( +
+ {command} +
+ ))} +
+
+ )} +
+ )} +
)}
)} {/* Step content */} -
+
{/* WELCOME */} {step === 'welcome' && ( -
-
-
- C -
-
-

Welcome to Hive

-

AI Command Center - NVIDIA Powered

-
-
- Hive lets you run multiple AI advisors in parallel - locally on your NVIDIA GPU with zero cost, - or via cloud APIs. This wizard will get you set up in minutes. +
+ {wizardBuildMessage && ( +
+ {wizardBuildMessage}
-
+ )} -
- {[ - { icon: 'GPU', title: 'Local AI', desc: 'Run NVIDIA Nemotron on your GPU. Free forever.' }, - { icon: 'LLM', title: 'Multi-LLM', desc: 'Query multiple models at once. Compare results.' }, - { icon: '$0', title: 'Zero Cost', desc: 'Local models cost $0. Use cloud only when needed.' }, - ].map(f => ( -
-
{f.icon}
-
{f.title}
-
{f.desc}
+
+
+
+
System Check
+
+ {setupConcernCount ? `${setupConcernCount} ${setupConcernCount === 1 ? 'item needs' : 'items need'} attention` : 'Ready for rootless installs'} +
- ))} + {topHelperAction && ( + + )} +
+
+ {systemCheckItems.map(item => { + const tone = CHECK_TONES[item.state]; + return ( +
+
+ {item.label} + +
+
{item.value}
+
+ ); + })} +
-
-
-
-
Persistent Home
-
- {storageReady ? 'Mounted storage is ready' : 'Choose the mounted folder first'} -
-
- Models, ComfyUI, packs, apps, WebUI assets, cache, logs, and config should live on the file mount that attaches every session. + {advancedSetupOpen && ( +
+
+
+ + + +
+
Storage Autopilot
+
+ {storageReady ? 'Persistent home ready' : storageAutopilotBusy ? 'Scanning storage' : 'Auto-find storage'} +
+
+ {storageReady + ? storageFreeGb === null ? 'Space unknown' : `${storageFreeGb} GB free` + : mountRecommendation?.recommended_home ?? 'Large writable block mount'} +
- - {storageReady ? 'READY' : 'REQUIRED'} - -
- -
- setStorageHomeInput(e.target.value)} - placeholder="/mnt/persist/nvhive or /workspace/nvhive" - className="input-base flex-1 px-3 py-2 text-xs font-mono" - spellCheck={false} - /> + {advancedSetupOpen && ( +
+ {!storageReady && mountRecommendation && ( +
+ {mountRecommendation.recommended_home} +
+ )} + {!storageReady && ( +
+ setStorageHomeInput(e.target.value)} + placeholder="/mnt/persist/nvhive" + className="input-base px-3 py-2 text-xs font-mono" + spellCheck={false} + /> + +
+ )} +
+ )} + {(storageError || storageStatus?.warnings?.length) && ( +
+ {storageError &&
{storageError}
} + {storageStatus?.warnings.map(warning => ( +
{warning}
+ ))} +
+ )}
-
-
-
Free Space
-
{storageFreeGb === null ? 'Unknown' : `${storageFreeGb} GB`}
-
-
-
Config
-
{storageStatus?.layout.config_dir ?? 'Not set'}
+
+
+ + + +
+
LLM Picks
+
{hardwareName}
+
{hardwareVramLabel}
+
-
-
Activate
-
{storageStatus ? `source ${storageStatus.env_file}` : 'Waiting'}
+
+ {modelPickPreview.length > 0 ? modelPickPreview.map(model => ( +
+ {model.title} + {model.recommended_vram_gb}GB+ +
+ )) : ( +
+ Waiting for GPU scan. +
+ )}
+
- {(storageError || storageStatus?.warnings?.length) && ( -
- {storageError &&
{storageError}
} - {storageStatus?.warnings.map(warning => ( -
{warning}
+
+
+
+
Connect & Build
+
Repos and game engines
+
+ {repoAndGameHighlights.length} tools +
+
+ {repoAndGameHighlights.map(item => ( +
+
+ + + + + {item.label} + {item.sub} + +
+
))}
- )} -
- -
-
Quick Profiles
-
- {[ - { - id: 'student' as WizardProfile, - title: 'Student Starter', - desc: 'Recommended local models, agent lab, ComfyUI nodes, and beginner workflow examples.', - next: 'Models', - }, - { - id: 'creator' as WizardProfile, - title: 'Creator Studio', - desc: 'ComfyUI power nodes, Blender LTS, image/edit/video planning, and asset workspaces.', - next: 'ComfyUI', - }, - { - id: 'game' as WizardProfile, - title: 'Game Dev', - desc: 'Blender, Linux game-dev pack, mod helper workspace, and local AI helpers.', - next: 'Packs', - }, - { - id: 'full' as WizardProfile, - title: 'Full Workstation', - desc: 'Everything that fits the detected GPU, with all rootless studio packs selected.', - next: 'Models', - }, - ].map(profile => ( +
+ +
+
+
+ )} + +
+
+
Install Options
+
+ {advancedSetupOpen && ( +
+
+
+
+ +
+ {selectedProfile.logos.slice(0, 4).map(logo => ( + + ))} +
+
+
Mission Summary
+
{selectedProfile.title}
+
{selectedProfile.outcome}
+
- - ))} +
+
+
Storage
+
+ {storageReady ? storageStatus?.layout.home ?? 'persistent ready' : 'auto-detect first'} +
+
+
+
GPU
+
{hardwareName}
+
+
+
Disk
+
+ {hasCatalogSizing ? `~${Math.max(0, selectedProfileDiskGb).toFixed(1)} GB` : 'after check'} +
+
+
+
Selected
+
+ {selectedProfilePacks.length} apps / {selectedProfileModels.length} models +
+
+
+
+ {[...selectedProfilePacks.slice(0, 4).map(pack => pack.title), ...selectedProfileModels.slice(0, 3).map(model => model.title)].map(item => ( + + {item} + + ))} +
+
+
+ + +
+
+
+ )} +
+ {(advancedSetupOpen ? missionProfiles : beginnerProfiles) + .map(profile => { + const packIds = wizardProfilePackIds(profile.id); + const modelIds = wizardProfileModelIds(profile.id); + const estimatedGb = diskForPackIds(packIds) + diskForModelIds(modelIds); + const building = activeWizardBuild === profile.id; + const profilePackItems = studioPacks.filter(pack => packIds.includes(pack.id)); + const profileModelItems = studioModels.filter(model => modelIds.includes(model.id)); + const needsComfy = wizardProfileNeedsComfy(profile.id); + const requiredUnits = profilePackItems.length + profileModelItems.length + (needsComfy ? 1 : 0); + const installedUnits = profilePackItems.filter(pack => pack.status.installed).length + + profileModelItems.filter(model => model.installed).length + + (needsComfy && comfyStatus?.installed ? 1 : 0); + const profileInstalled = requiredUnits > 0 && installedUnits >= requiredUnits; + if (!advancedSetupOpen) { + return ( +
+
+ +
+ {profile.logos.slice(0, 4).map(logo => ( + + ))} +
+
+ + {profileInstalled ? 'Installed' : profile.primary ? 'Recommended' : profile.label} + +
+
+

{profile.title}

+
+ {beginnerProfileCopy[profile.id] ?? profile.description} +
+
+ {profile.includes.slice(0, 3).map(item => ( + + {item} + + ))} +
+
+ +
+ ); + } + return ( +
setSelectedWizardProfile(profile.id)} + onFocus={() => setSelectedWizardProfile(profile.id)} + onClick={() => setSelectedWizardProfile(profile.id)} + className={`border bg-white p-4 transition-colors ${ + selectedWizardProfile === profile.id + ? 'border-[#76B900] shadow-[0_0_0_1px_rgba(118,185,0,0.16)]' + : profile.primary + ? 'border-[#76B900]/50 shadow-[0_0_0_1px_rgba(118,185,0,0.08)]' + : 'border-[#e5e5e5]' + }`} + > +
+ +
+ {profile.logos.slice(0, 4).map(logo => ( + + ))} +
+
+
+
+

{profile.title}

+ + {profile.label} + +
+

{profile.description}

+
+ + {hasCatalogSizing ? `~${Math.max(0, estimatedGb).toFixed(1)} GB` : 'after check'} + +
+
+ {profile.includes.map(item => ( + + {item} + + ))} +
+
+ + +
+
+ ); + })}
-
- - - {apiStatus === 'connected' ? 'Hive API is running' : apiStatus === 'disconnected' ? 'Hive API is offline - start it with: nvh serve' : 'Checking API...'} - + {advancedSetupOpen && ( +
+
+ + + {apiStatus === 'connected' ? 'nvHive API is online' : apiStatus === 'disconnected' ? 'nvHive API is not responding yet' : 'Checking nvHive API'} + +
+
+ )}
)} @@ -1154,7 +3083,7 @@ export default function SetupPage() {
Step 2

Persistent Storage

-

Point nvHive at the mounted folder that survives cloud desktop resets

+

nvWizard prefers a writable 200GB+ block-backed home/data mount and avoids read-only OS or network shares

@@ -1189,6 +3118,28 @@ export default function SetupPage() {
+ {mountRecommendation && ( +
+
+
nvWizard Recommendation
+
{mountRecommendation.recommended_home}
+
+ {mountRecommendation.large_block_mount ? 'block storage candidate' : 'candidate'} + {mountRecommendation.fs_type ? ` / ${mountRecommendation.fs_type}` : ''} + {mountRecommendation.mount_point ? ` / mounted at ${mountRecommendation.mount_point}` : ''} + {mountRecommendation.total_gb ? ` / ${mountRecommendation.total_gb} GB total` : ''} +
+
+ +
+ )}
CUDA {g.cuda_version} / driver {g.driver_version} + {g.architecture && ( + <> + / + {g.architecture}{g.architecture_heuristic ? ' (estimated)' : ''} + + )}
@@ -1327,10 +3284,12 @@ export default function SetupPage() {
-
No NVIDIA GPU Detected
-
CPU MODE
+
+ {gpuDetectionStatus === 'blocked' ? 'GPU Present, Access Blocked' : 'No NVIDIA GPU Detected'} +
+
{gpuDetectionStatus.toUpperCase()}
- Local models will run on CPU. Consider a cloud provider for better speed. + {gpuDetectionIssue || 'Local models will run on CPU. Consider a cloud provider for better speed.'}
@@ -1440,7 +3399,7 @@ export default function SetupPage() {
Step 4

Model Picker

- Choose exact local models to download. Recommendations are based on detected VRAM and student-friendly defaults. + Choose exact local models to download. Recommendations are based on detected VRAM and beginner-friendly defaults.

@@ -1472,11 +3431,11 @@ export default function SetupPage() {
@@ -1691,19 +3650,19 @@ export default function SetupPage() {
Step 6

AI Studio Packs

- One-click rootless packs for LLMs, local agents, ComfyUI sub software, Blender, runtime fallback, and Linux game projects. + One-click rootless packs for LLMs, OpenClaw/NemoClaw agents, ComfyUI sub software, Blender, runtime fallback, and Linux game projects.

-
Student Lab Starter
+
AI Starter Pack
No sudo. Installs under {studioRoot || storageStatus?.layout.studio_dir || 'NVH_HOME/studio'} and {storageStatus?.layout.bin_dir || 'NVH_HOME/bin'}
- {starterStudioPackIds.length} packs - {studioPacks.filter(pack => pack.status.installed).length}/{studioPacks.length} installed - selected ~{selectedStudioPackDiskGb.toFixed(1)} GB - free {storageFreeGb === null ? 'unknown' : `${storageFreeGb} GB`} + {starterStudioPackIds.length} starter packs - {clawStudioPackIds.length} Claw options - {studioPacks.filter(pack => pack.status.installed).length}/{studioPacks.length} installed - {blockedStudioPackCount} blocked by host - selected ~{selectedStudioPackDiskGb.toFixed(1)} GB - free {storageFreeGb === null ? 'unknown' : `${storageFreeGb} GB`}
@@ -1714,6 +3673,13 @@ export default function SetupPage() { > Select Starter +
@@ -1777,20 +3743,26 @@ export default function SetupPage() {
{studioPacks.filter(pack => pack.category === category).map(pack => { - const selected = selectedStudioPacks.has(pack.id); + const installable = studioPackInstallable(pack); + const selected = selectedStudioPacks.has(pack.id) && installable; + const blockedReason = studioPackBlockedReason(pack); + const badgeText = pack.status.installed ? 'INSTALLED' : installable ? 'READY' : 'BLOCKED'; return (
- {/* Quick start commands */} -
-
Quick Commands
- {storageStatus && ( - <> -
# Persist this shell session
-
source {storageStatus.env_file}
- - )} -
# Rootless all-in-one student lab
-
- {`nvh workstation --home-dir "${storageStatus?.layout.home ?? '$NVH_HOME'}" --all -y`} -
-
# Packs only
-
nvh studio --install starter -y
-
# Launch dashboard
-
nvh webui
+
+
+
Launcher Dashboard
+
Open the lab from buttons, not terminal memory
+
+
+
+
+ + + +
+
Local Chat
+
{ollamaStatus === 'online' ? 'Ollama online' : 'Uses best available advisor'}
+
+
+ + Open Chat + +
+
+
+ + + +
+
ComfyUI
+
{comfyStatus?.running ? 'running' : comfyStatus?.installed ? 'installed' : 'not installed'}
+
+
+ {comfyStatus?.running ? ( + + Open ComfyUI + + ) : ( + + )} +
+ {[ + { id: 'blender-creative', title: 'Blender', logo: 'blender' as BrandLogoId, action: 'Install Blender' }, + { id: 'godot-engine', title: 'Godot', logo: 'godot' as BrandLogoId, action: 'Install Godot' }, + { id: 'github-login-helper', title: 'GitHub Workspace', logo: 'github' as BrandLogoId, action: 'Connect GitHub' }, + { id: 'unreal-engine-helper', title: 'Unreal Helper', logo: 'unreal' as BrandLogoId, action: 'Install Helper' }, + ].map(item => { + const pack = studioPacks.find(candidate => candidate.id === item.id); + const installed = Boolean(pack?.status.installed); + return ( +
+
+ + + +
+
{item.title}
+
{installed ? 'launcher ready' : 'optional'}
+
+
+ +
+ ); + })} +
+
+ + Manual command overrides + +
+ {storageStatus && ( + <> +
# Persist this shell session
+
source {storageStatus.env_file}
+ + )} +
# Rootless all-in-one AI workstation
+
+ {`nvh workstation --home-dir "${storageStatus?.layout.home ?? '$NVH_HOME'}" --all -y`} +
+
# Launch dashboard
+
nvh webui
+
+
@@ -2326,7 +4383,8 @@ export default function SetupPage() {
)} - {/* Navigation */} + {/* Advanced navigation */} + {advancedSetupOpen && (
- {currentStepIdx + 1} / {STEPS.length} + Advanced step {currentStepIdx + 1} / {STEPS.length} {step !== 'done' ? ( ) : ( @@ -2361,6 +4422,7 @@ export default function SetupPage() { )}
+ )}
); diff --git a/web/components/LayoutShell.tsx b/web/components/LayoutShell.tsx index ef251cc..7df5537 100644 --- a/web/components/LayoutShell.tsx +++ b/web/components/LayoutShell.tsx @@ -21,10 +21,11 @@ function InnerShell({ children }: { children: React.ReactNode }) { const pathname = usePathname(); const router = useRouter(); const isChatPage = pathname === '/'; + const isSetupPage = pathname?.startsWith('/setup'); const { openCommandPalette } = useUIShell(); - if (isChatPage) { - // Chat page is fully self-contained — it renders its own sidebar. + if (isChatPage || isSetupPage) { + // Chat and setup are self-contained surfaces. return ( <> diff --git a/web/lib/api.ts b/web/lib/api.ts index 27c426d..4953895 100644 --- a/web/lib/api.ts +++ b/web/lib/api.ts @@ -36,7 +36,17 @@ import type { StorageConfigureRequest, StorageStatus, RuntimeStatus, + SetupAssistantReply, + SetupCatalogResult, SetupHelperReport, + SetupReceiptsResult, + CompatibilityReport, + BootPreflightReport, + MissionControlReport, + ProductionReadinessReport, + DiagnosticsReport, + AutoRepairResult, + MountAutopilotReport, ComfyUIExamplesResult, ComfyUIInstallEvent, ComfyUIInstallRequest, @@ -79,13 +89,28 @@ async function apiFetch( if (!res.ok) { let detail = `HTTP ${res.status}`; + const requestId = res.headers.get('x-request-id') ?? undefined; + const errorId = res.headers.get('x-error-id') ?? undefined; try { const body = await res.json(); - detail = body?.detail ?? detail; + const bodyDetail = body?.error?.message ?? body?.detail ?? detail; + detail = typeof bodyDetail === 'string' ? bodyDetail : JSON.stringify(bodyDetail); } catch { // ignore } - throw new Error(detail); + const suffix = [ + requestId ? `request ${requestId}` : null, + errorId ? `error ${errorId}` : null, + ].filter(Boolean).join(', '); + const error = new Error(suffix ? 
`${detail} (${suffix})` : detail) as Error & { + statusCode?: number; + requestId?: string; + errorId?: string; + }; + error.statusCode = res.status; + error.requestId = requestId; + error.errorId = errorId; + throw error; } return res.json() as Promise; @@ -133,6 +158,21 @@ export async function configureStorage(request: StorageConfigureRequest): Promis return apiPost('/v1/system/storage', request); } +export async function getMountAutopilot(minFreeGb = 20): Promise { + return apiGet(`/v1/system/mount-autopilot?min_free_gb=${encodeURIComponent(String(minFreeGb))}`); +} + +export async function activateMountAutopilot(homeDir?: string, minFreeGb = 20): Promise<{ + summary: string; + storage: StorageStatus; + mount_autopilot: MountAutopilotReport; +}> { + return apiPost('/v1/system/mount-autopilot/activate', { + home_dir: homeDir, + min_free_gb: minFreeGb, + }); +} + export async function getRuntimeStatus(): Promise { return apiGet('/v1/system/runtime'); } @@ -146,6 +186,81 @@ export async function getSetupHelper(homeDir?: string): Promise { + return apiPost('/v1/setup/assistant', { + question, + home_dir: homeDir, + }); +} + +export async function getSetupCatalog(refresh = false): Promise { + return apiGet(`/v1/setup/catalog${refresh ? '?refresh=true' : ''}`); +} + +export async function getSetupCompatibility(homeDir?: string): Promise { + const qs = homeDir ? `?home_dir=${encodeURIComponent(homeDir)}` : ''; + return apiGet(`/v1/setup/compatibility${qs}`); +} + +export async function getSetupBootPreflight(homeDir?: string, recheck = false): Promise { + const params = new URLSearchParams(); + if (homeDir) params.set('home_dir', homeDir); + if (recheck) params.set('recheck', 'true'); + const qs = params.toString(); + return apiGet(`/v1/setup/boot-preflight${qs ? `?${qs}` : ''}`); +} + +export async function getSetupMissionControl(homeDir?: string): Promise { + const qs = homeDir ? 
`?home_dir=${encodeURIComponent(homeDir)}` : ''; + return apiGet(`/v1/setup/mission-control${qs}`); +} + +export async function getSetupProductionReadiness( + homeDir?: string, + targetVmValidated?: boolean +): Promise { + const params = new URLSearchParams(); + if (homeDir) params.set('home_dir', homeDir); + if (typeof targetVmValidated === 'boolean') { + params.set('target_vm_validated', targetVmValidated ? 'true' : 'false'); + } + const qs = params.toString(); + return apiGet(`/v1/setup/production-readiness${qs ? `?${qs}` : ''}`); +} + +export async function getSetupDiagnostics( + homeDir?: string, + includeLogs = true +): Promise { + const params = new URLSearchParams(); + if (homeDir) params.set('home_dir', homeDir); + params.set('include_logs', includeLogs ? 'true' : 'false'); + const qs = params.toString(); + return apiGet(`/v1/setup/diagnostics${qs ? `?${qs}` : ''}`); +} + +export async function repairSetupWorkspace(homeDir?: string): Promise { + return apiPost('/v1/setup/repair-workspace', { + home_dir: homeDir, + }); +} + +export async function getSetupReceipts(options: { + kind?: string; + status?: string; + limit?: number; +} = {}): Promise { + const params = new URLSearchParams(); + if (options.kind) params.set('kind', options.kind); + if (options.status) params.set('status_filter', options.status); + if (options.limit) params.set('limit', String(options.limit)); + const qs = params.toString(); + return apiGet(`/v1/setup/receipts${qs ? 
`?${qs}` : ''}`); +} + const TERMINAL_JOB_STATUSES = new Set(['complete', 'failed', 'canceled', 'interrupted']); export async function getInstallJobs(options: { diff --git a/web/lib/types.ts b/web/lib/types.ts index b0da805..b6f4d8c 100644 --- a/web/lib/types.ts +++ b/web/lib/types.ts @@ -316,10 +316,15 @@ export interface GPUDevice { vram_gb: number; memory_used_mb: number; memory_free_mb: number; + memory_reserved_mb?: number; utilization_pct: number; driver_version: string; cuda_version: string; index: number; + compute_capability?: [number, number]; + compute_capability_source?: string; + architecture?: string; + architecture_heuristic?: boolean; } export interface SystemRAM { @@ -332,6 +337,13 @@ export interface GPUInfo { gpus: GPUDevice[]; summary: string; total_vram_gb: number; + detection?: { + status: string; + source: string; + issues: Array<{ source: string; code: string; message: string; severity: string; detail: string }>; + device_files_present: boolean; + nvidia_smi: string; + }; system_ram: SystemRAM; } @@ -398,6 +410,8 @@ export interface StorageStatus { configured_by: string; exists: boolean; writable: boolean; + write_probe_ok: boolean; + write_probe_error: string; free_gb: number | null; total_gb: number | null; min_free_gb: number; @@ -435,6 +449,17 @@ export interface SetupAction { can_run_without_root: boolean; } +export interface SetupIssue { + id: string; + title: string; + severity: 'required' | 'recommended' | 'optional' | string; + reason: string; + fix_action_id: string | null; + affected_item: string | null; + current_version: string | null; + available_version: string | null; +} + export interface SetupHelperReport { ready: boolean; summary: string; @@ -443,6 +468,352 @@ export interface SetupHelperReport { comfyui: Record; model_recommendation_count: number; actions: SetupAction[]; + issues?: SetupIssue[]; + issue_count?: number; + receipts?: SetupReceiptsSummary; + catalog?: SetupCatalogStatus; + compatibility?: { + summary?: 
string; + issue_count: number; + blocked_count: number; + rootless_fixable_count: number; + recommended_torch_profile?: string; + }; + boot_preflight?: { + summary?: string; + checked_at?: string | null; + changed: boolean; + change_count: number; + agent_helper?: BootAgentHelper; + }; + assistant?: { + mode: string; + can_read_jobs: boolean; + can_read_receipts: boolean; + can_refresh_catalog: boolean; + description: string; + }; +} + +export interface InstallReceiptHealth { + install_path_exists: boolean; + missing_launchers: string[]; + missing_files: string[]; + healthy: boolean; +} + +export interface InstallReceipt { + id: string; + kind: string; + item_id: string; + title: string; + status: string; + installed_at: string; + updated_at: string; + install_path: string; + version: string | null; + source_urls: string[]; + launchers: string[]; + models: string[]; + files: string[]; + no_root: boolean; + metadata: Record; + schema_version: number; + health: InstallReceiptHealth; +} + +export interface SetupReceiptsSummary { + count: number; + by_kind: Record; + unhealthy: number; + root: string | null; + receipts?: InstallReceipt[]; +} + +export interface SetupReceiptsResult { + receipts: InstallReceipt[]; + count: number; + summary: SetupReceiptsSummary; +} + +export interface SetupCatalogStatus { + source: string; + url?: string; + error?: string | null; + schema_version?: number; + updated_at?: string; + profile_count?: number; + pack_count?: number; + model_count?: number; + comfyui_example_count?: number; +} + +export interface SetupCatalogResult { + source: string; + url: string; + error: string | null; + catalog: { + schema_version: number; + updated_at: string; + channel?: string; + profiles: Array>; + packs: Array>; + models: Array>; + comfyui_examples: Array>; + }; +} + +export interface CompatibilityRequirement { + id: string; + label: string; + status: 'ok' | 'fixable' | 'warning' | 'blocked' | string; + detail: string; + fix_action_id: string | null; 
+ rootless_fix_available: boolean; +} + +export interface AppCompatibility { + id: string; + title: string; + category: string; + status: 'ready' | 'fixable' | 'degraded' | 'blocked' | string; + severity: 'info' | 'optional' | 'recommended' | 'required' | string; + summary: string; + recommended_action_id: string | null; + rootless_fix_available: boolean; + requirements: CompatibilityRequirement[]; + notes: string[]; +} + +export interface HostFact { + id: string; + label: string; + value: string; + status: string; + severity: string; + detail: string; +} + +export interface CompatibilityReport { + summary: string; + ready: boolean; + issue_count: number; + blocked_count: number; + rootless_fixable_count: number; + recommended_torch_profile: string; + host: Record; + facts: HostFact[]; + apps: AppCompatibility[]; +} + +export interface BootPreflightChange { + id: string; + label: string; + before: string; + after: string; + severity: 'info' | 'optional' | 'recommended' | 'required' | string; + detail: string; +} + +export interface BootAgentHelper { + offline_helper_ready: boolean; + local_agent_ready: boolean; + mode: string; + recommended_action_id: string | null; + summary: string; + requirements: CompatibilityRequirement[]; +} + +export interface BootPreflightReport { + schema_version: number; + checked_at: string | null; + state_file: string; + first_run: boolean; + changed: boolean; + needs_attention: boolean; + fingerprint_id: string | null; + previous_fingerprint_id: string | null; + previous_checked_at: string | null; + summary: string; + changes: BootPreflightChange[]; + agent_helper: BootAgentHelper; + mount_autopilot?: MountAutopilotReport | null; + auto_repair?: AutoRepairPlan | AutoRepairResult | null; + smoke_tests?: SmokeTestReport | null; + model_fit?: { + summary?: string; + detected_vram_gb?: number; + recommended_ids?: string[]; + } | null; + compatibility: CompatibilityReport | null; + error?: string; +} + +export interface MountCandidate { + 
path: string; + recommended_home: string; + label: string; + source: string; + exists: boolean; + writable: boolean; + free_gb: number | null; + total_gb: number | null; + fs_type: string | null; + device: string | null; + mount_point: string | null; + read_only: boolean; + network_mount: boolean; + os_mount: boolean; + large_block_mount: boolean; + score: number; + warnings: string[]; + evidence: string[]; +} + +export interface MountAutopilotReport { + summary: string; + confidence: string; + current: StorageStatus; + recommended: MountCandidate | null; + candidates: MountCandidate[]; +} + +export interface AutoRepairAction { + id: string; + title: string; + status: string; + summary: string; + safe_to_auto_run: boolean; + action_type: string; + button_action_id: string; +} + +export interface AutoRepairPlan { + summary: string; + auto_count: number; + needs_user_count: number; + actions: AutoRepairAction[]; +} + +export interface AutoRepairResult { + summary: string; + completed: Array; + skipped: Array; + errors: Array; + plan: AutoRepairPlan; +} + +export interface SmokeTestItem { + id: string; + title: string; + status: 'pass' | 'warn' | 'fail' | 'skip' | string; + summary: string; + detail: string; + action_id: string | null; +} + +export interface SmokeTestReport { + summary: string; + ready: boolean; + passed: number; + warnings: number; + failed: number; + tests: SmokeTestItem[]; +} + +export interface ModelFitReport { + summary: string; + detected_vram_gb: number; + free_gb: number | null; + recommended_queue_disk_gb?: number; + storage_fits_queue?: boolean; + recommended_ids: string[]; + best_by_use_case: Record>; + models: Array>; + ollama_available: boolean; + ollama_running: boolean; +} + +export interface ProductionReadinessGate { + id: string; + title: string; + status: 'pass' | 'warn' | 'blocked' | string; + summary: string; + detail: string; + recommendation: string; + source: 'local' | 'target-vm' | string; +} + +export interface 
ProductionReadinessReport { + checked_at: string; + status: 'production-ready' | 'pilot-ready' | 'blocked' | string; + summary: string; + pilot_ready: boolean; + production_ready: boolean; + target_vm_validated: boolean; + counts: { + passed: number; + warnings: number; + blocked: number; + total: number; + }; + gates: ProductionReadinessGate[]; + next_actions: string[]; + target_vm_checklist: string[]; + inputs: Record; +} + +export interface DiagnosticsReport { + report_id: string; + checked_at: string; + request_id?: string | null; + summary: string; + environment: Record; + paths: Record; + checks: Record; + logs: { + included: boolean; + files: string[]; + recent: Array<{ + path: string; + lines: string[]; + }>; + }; +} + +export interface MissionStage { + id: string; + title: string; + status: 'pass' | 'warn' | 'fail' | string; + summary: string; + action_id: string | null; +} + +export interface MissionControlReport { + summary: string; + ready: boolean; + stages: MissionStage[]; + boot_preflight: BootPreflightReport; + mount_autopilot: MountAutopilotReport; + auto_repair: AutoRepairPlan; + smoke_tests: SmokeTestReport; + model_fit: ModelFitReport; +} + +export interface SetupAssistantReply { + question: string; + answer: string; + focus: string; + commands: string[]; + observations: { + ready: boolean; + issue_count?: number; + receipt_count: number; + unhealthy_receipts: number; + catalog_source?: string; + recent_problem?: InstallJob | null; + }; + actions: SetupAction[]; } // ─── UI State helpers ──────────────────────────────────────────────────────── diff --git a/web/public/brand-icons/ATTRIBUTION.md b/web/public/brand-icons/ATTRIBUTION.md new file mode 100644 index 0000000..8c54064 --- /dev/null +++ b/web/public/brand-icons/ATTRIBUTION.md @@ -0,0 +1,8 @@ +# Brand Icons + +These icons are vendored so the setup wizard can render reliably without network access. 
+ +- Audacity, Blender, Godot Engine, GitHub, LMMS, NVIDIA, Ollama, Unity, and Unreal Engine icons are sourced from Simple Icons: https://simpleicons.org/ +- ComfyUI and OpenClaw icons are sourced from homarr-labs/dashboard-icons: https://github.com/homarr-labs/dashboard-icons + +Brand marks remain trademarks of their respective owners and are used here only to identify the software options shown in the installer. diff --git a/web/public/brand-icons/audacity.svg b/web/public/brand-icons/audacity.svg new file mode 100644 index 0000000..c8d9957 --- /dev/null +++ b/web/public/brand-icons/audacity.svg @@ -0,0 +1 @@ +Audacity \ No newline at end of file diff --git a/web/public/brand-icons/blender.svg b/web/public/brand-icons/blender.svg new file mode 100644 index 0000000..24fdeea --- /dev/null +++ b/web/public/brand-icons/blender.svg @@ -0,0 +1 @@ +Blender \ No newline at end of file diff --git a/web/public/brand-icons/comfyui.svg b/web/public/brand-icons/comfyui.svg new file mode 100644 index 0000000..72fb132 --- /dev/null +++ b/web/public/brand-icons/comfyui.svg @@ -0,0 +1,4 @@ + + + + diff --git a/web/public/brand-icons/github.svg b/web/public/brand-icons/github.svg new file mode 100644 index 0000000..81920ca --- /dev/null +++ b/web/public/brand-icons/github.svg @@ -0,0 +1 @@ +GitHub \ No newline at end of file diff --git a/web/public/brand-icons/godot.svg b/web/public/brand-icons/godot.svg new file mode 100644 index 0000000..6af2c98 --- /dev/null +++ b/web/public/brand-icons/godot.svg @@ -0,0 +1 @@ +Godot Engine \ No newline at end of file diff --git a/web/public/brand-icons/lmms.svg b/web/public/brand-icons/lmms.svg new file mode 100644 index 0000000..4b7e5f0 --- /dev/null +++ b/web/public/brand-icons/lmms.svg @@ -0,0 +1 @@ +LMMS \ No newline at end of file diff --git a/web/public/brand-icons/nvidia.svg b/web/public/brand-icons/nvidia.svg new file mode 100644 index 0000000..e427e2c --- /dev/null +++ b/web/public/brand-icons/nvidia.svg @@ -0,0 +1 @@ +NVIDIA \ No 
newline at end of file diff --git a/web/public/brand-icons/ollama.svg b/web/public/brand-icons/ollama.svg new file mode 100644 index 0000000..432f73e --- /dev/null +++ b/web/public/brand-icons/ollama.svg @@ -0,0 +1 @@ +Ollama \ No newline at end of file diff --git a/web/public/brand-icons/openclaw.svg b/web/public/brand-icons/openclaw.svg new file mode 100644 index 0000000..00fa9b5 --- /dev/null +++ b/web/public/brand-icons/openclaw.svg @@ -0,0 +1,242 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/web/public/brand-icons/unity.svg b/web/public/brand-icons/unity.svg new file mode 100644 index 0000000..340e548 --- /dev/null +++ b/web/public/brand-icons/unity.svg @@ -0,0 +1 @@ +Unity \ No newline at end of file diff --git a/web/public/brand-icons/unrealengine.svg b/web/public/brand-icons/unrealengine.svg new file mode 100644 index 0000000..96fda8b --- /dev/null +++ b/web/public/brand-icons/unrealengine.svg @@ -0,0 +1 @@ +Unreal Engine \ No newline at end of file
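A note on the `apiFetch` change in `web/lib/api.ts` above: errors are now enriched with `statusCode`, plus `requestId`/`errorId` read from the `x-request-id`/`x-error-id` response headers, and the IDs are folded into the message. The sketch below (not part of the patch; `buildApiError` is a hypothetical standalone helper) mirrors that suffix-building logic so callers can see what shape of error they will catch:

```typescript
// Sketch only: mirrors the error-shaping pattern apiFetch uses in this diff.
// ApiError matches the intersection type the patch casts to.
type ApiError = Error & {
  statusCode?: number;
  requestId?: string;
  errorId?: string;
};

// Hypothetical helper; in the real patch this logic lives inline in apiFetch.
function buildApiError(
  detail: string,
  status: number,
  requestId?: string,
  errorId?: string,
): ApiError {
  // Same suffix construction as the diff: join whichever IDs are present.
  const suffix = [
    requestId ? `request ${requestId}` : null,
    errorId ? `error ${errorId}` : null,
  ].filter(Boolean).join(', ');
  const error = new Error(suffix ? `${detail} (${suffix})` : detail) as ApiError;
  error.statusCode = status;
  error.requestId = requestId;
  error.errorId = errorId;
  return error;
}

const err = buildApiError('HTTP 503', 503, 'req-42');
console.log(err.message); // prints: HTTP 503 (request req-42)
```

Callers can then branch on `err.statusCode` for retry logic while still surfacing the request/error IDs verbatim in UI toasts, which is what makes the IDs useful when matching a user report against server logs.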