NemoHermes is a local NVIDIA capability registry and routing layer for AI services, with integration support for Hermes Agent and NemoClaw, especially on DGX Spark.
It discovers local AI services, normalizes them into one registry, persists that registry on disk, and routes work by capability instead of raw ports or model strings.
- discover local NVIDIA-backed AI services
- cache a merged capability registry on disk
- route by role such as chat, vision, stt, and tts
- inspect what NemoHermes would choose before wiring it into a larger system
NemoHermes is for:
- Hermes Agent users running local NVIDIA-backed services
- NemoClaw users on DGX Spark or mixed Mac + Spark setups
- operators who want one shared view of text, vision, speech, voice, training, fine-tuning, and hosting surfaces
NemoHermes is optimized for NVIDIA and DGX Spark environments rather than general-purpose backend integration. The repo keeps a NemoClaw-compatible manifest so that a NemoClaw host can load it, but the project itself is the shared NVIDIA/Spark capability layer.
- standalone CLI via npx nemohermes
- NemoClaw-compatible command registration via openclaw nemohermes ...
- persisted registry cache at ~/.nemohermes/registry.json
- local discovery probes for vLLM, SGLang, NIM, faster-whisper, and Piper
- a static NVIDIA capability catalog covering text, vision, STT, TTS, V2V, training, fine-tuning, and hosting
- role-based routing with modality and feature constraints such as streaming, tool calling, structured output, backend preference, and realtime preference
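To make the idea of role-based routing concrete, here is a minimal sketch of how a router could treat role, modalities, and features as hard constraints while treating backend and realtime as soft preferences. Every type and function name here (RegistryEntry, RouteRequest, route) and the scoring weights are hypothetical illustrations, not NemoHermes' actual schema or algorithm.

```typescript
// Hypothetical shapes for illustration only; the real registry schema may differ.
type Modality = "text" | "image" | "audio";
type Role = "chat" | "vision" | "stt" | "tts";

interface RegistryEntry {
  id: string;
  backend: string; // e.g. "vllm", "sglang", "nim"
  roles: Role[];
  inputs: Modality[];
  outputs: Modality[];
  features: string[]; // e.g. "streaming", "structured-output"
  realtime: boolean;
}

interface RouteRequest {
  role: Role;
  inputs?: Modality[];
  outputs?: Modality[];
  features?: string[];
  preferBackend?: string;
  preferRealtime?: boolean;
}

// True when every wanted item is present; an omitted constraint always passes.
function includesAll<T>(have: T[], want: T[] = []): boolean {
  return want.every((w) => have.includes(w));
}

function route(registry: RegistryEntry[], req: RouteRequest): RegistryEntry | undefined {
  // Hard constraints: role, modalities, and features must all match.
  const candidates = registry.filter(
    (e) =>
      e.roles.includes(req.role) &&
      includesAll(e.inputs, req.inputs) &&
      includesAll(e.outputs, req.outputs) &&
      includesAll(e.features, req.features),
  );

  // Soft preferences: they only rank candidates, never exclude them.
  const score = (e: RegistryEntry): number =>
    (req.preferBackend !== undefined && e.backend === req.preferBackend ? 2 : 0) +
    (req.preferRealtime && e.realtime ? 1 : 0);

  // Highest score wins; on ties the earlier registry entry is kept.
  let best: RegistryEntry | undefined;
  for (const e of candidates) {
    if (best === undefined || score(e) > score(best)) best = e;
  }
  return best;
}

// Made-up registry contents, used only to exercise the sketch.
const sampleRegistry: RegistryEntry[] = [
  { id: "a", backend: "vllm", roles: ["chat", "vision"], inputs: ["text", "image"],
    outputs: ["text"], features: ["streaming", "structured-output"], realtime: false },
  { id: "b", backend: "sglang", roles: ["vision"], inputs: ["text", "image"],
    outputs: ["text"], features: ["structured-output"], realtime: false },
  { id: "c", backend: "faster-whisper", roles: ["stt"], inputs: ["audio"],
    outputs: ["text"], features: [], realtime: true },
];

const visionPick = route(sampleRegistry, {
  role: "vision",
  inputs: ["text", "image"],
  outputs: ["text"],
  features: ["structured-output"],
  preferBackend: "sglang",
});
```

Keeping preferences separate from constraints means a request that prefers an unavailable backend still resolves to a capable service instead of failing outright.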
This is an early alpha aimed at two groups:
- Hermes Agent users building local-first workflows on NVIDIA hardware
- NemoClaw users on DGX Spark or Mac + Spark layouts
If you try it, the most useful feedback is concrete:
- what services you had running
- what NemoHermes discovered or missed
- what route you expected
- what route it actually picked
- what friction remained in the DGX Spark or Mac + Spark workflow
Install dependencies and verify the repo:
npm install
npm run verify
Then inspect the local environment:
npx nemohermes doctor
npx nemohermes discover
npx nemohermes models
npx nemohermes registry
npx nemohermes route --role vision --input text,image --output text --backend sglang --structured-output
npx nemohermes route --role stt --input audio --output text --realtime --json
If you are loading NemoHermes through a host that expects the compatibility manifest, use:
openclaw nemohermes discover
openclaw nemohermes models
openclaw nemohermes route --role tts --output audio --realtime
The registry cache exists so routing decisions stay useful even when you are not probing the machine on every call. By default NemoHermes:
- writes the merged registry to ~/.nemohermes/registry.json
- reuses the cache for 5 minutes
- refreshes on discover
- falls back to the cached registry if refresh fails and a cache exists
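The caching rules above can be sketched as a small decision function. This is a simplified illustration under stated assumptions: the names (CachedRegistry, loadRegistry, forceRefresh) are hypothetical, the refresh callback is synchronous for brevity where real discovery probes would likely be asynchronous, and reading or writing the file on disk is left out.

```typescript
// Hypothetical cache shape; the real on-disk format may differ.
interface CachedRegistry<T> {
  savedAtMs: number; // when the registry was last refreshed
  entries: T[];
}

// Matches the documented default of reusing the cache for 5 minutes.
const DEFAULT_MAX_AGE_MS = 5 * 60 * 1000;

function loadRegistry<T>(opts: {
  cache: CachedRegistry<T> | undefined;
  refresh: () => T[]; // stand-in for the discovery probes
  nowMs: number; // injected clock, which keeps the function easy to test
  maxAgeMs?: number;
  forceRefresh?: boolean; // e.g. set when the user runs discover
}): CachedRegistry<T> {
  const { cache, refresh, nowMs } = opts;
  const maxAgeMs = opts.maxAgeMs ?? DEFAULT_MAX_AGE_MS;
  const forceRefresh = opts.forceRefresh ?? false;

  // 1. Reuse a fresh cache unless a refresh was explicitly requested.
  if (cache !== undefined && !forceRefresh && nowMs - cache.savedAtMs < maxAgeMs) {
    return cache;
  }

  // 2. Otherwise try to refresh.
  try {
    return { savedAtMs: nowMs, entries: refresh() };
  } catch (err) {
    // 3. If refresh fails, fall back to the stale cache when one exists.
    if (cache !== undefined) return cache;
    throw err;
  }
}
```

Passing the clock and the refresh function as arguments is a design choice for the sketch: it makes the freshness and fallback paths testable without touching the filesystem or the network.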
The standalone CLI also accepts environment overrides:
- NEMOHERMES_PROFILE
- NEMOHERMES_PREFER_LOCAL
- NEMOHERMES_ALLOW_CLOUD_FALLBACK
- NEMOHERMES_REGISTRY_PATH
- NEMOHERMES_REGISTRY_MAX_AGE_MS
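As an illustration of how such overrides might be consumed, the sketch below maps the five variables onto a typed config object. Only the variable names come from this README; the value formats (booleans as true/false/1/0/yes/no, the max age as a non-negative integer in milliseconds), the config field names, and the readOverrides helper are assumptions made for the example.

```typescript
// Hypothetical config shape; undefined means "no override, use the default".
interface EnvOverrides {
  profile: string | undefined;
  preferLocal: boolean | undefined;
  allowCloudFallback: boolean | undefined;
  registryPath: string | undefined;
  registryMaxAgeMs: number | undefined;
}

type Env = Record<string, string | undefined>;

// Assumed boolean spellings; unrecognized values are ignored rather than guessed.
function parseBool(raw: string | undefined): boolean | undefined {
  if (raw === undefined) return undefined;
  const v = raw.trim().toLowerCase();
  if (v === "1" || v === "true" || v === "yes") return true;
  if (v === "0" || v === "false" || v === "no") return false;
  return undefined;
}

// Accepts only non-negative integers; anything else is treated as unset.
function parseNonNegativeInt(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const n = Number(raw);
  return Number.isInteger(n) && n >= 0 ? n : undefined;
}

// Treats empty strings as unset so that `VAR=` does not override a default.
function nonEmpty(raw: string | undefined): string | undefined {
  return raw !== undefined && raw.trim() !== "" ? raw : undefined;
}

function readOverrides(env: Env): EnvOverrides {
  return {
    profile: nonEmpty(env["NEMOHERMES_PROFILE"]),
    preferLocal: parseBool(env["NEMOHERMES_PREFER_LOCAL"]),
    allowCloudFallback: parseBool(env["NEMOHERMES_ALLOW_CLOUD_FALLBACK"]),
    registryPath: nonEmpty(env["NEMOHERMES_REGISTRY_PATH"]),
    registryMaxAgeMs: parseNonNegativeInt(env["NEMOHERMES_REGISTRY_MAX_AGE_MS"]),
  };
}
```

In a Node.js program the function would typically be called as readOverrides(process.env); taking the environment as a parameter keeps the parsing logic independent of the runtime.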
DGX Spark is the primary profile. The secondary profile is the common setup where a Mac is the control surface and Spark is the inference host. NemoHermes treats that pairing as a first-class workflow: discovery, caching, routing, and operator handoff should make the split-machine setup easier to manage.
- Getting Started on Spark
- Mac + Spark Ergonomics
- Architecture Reference
- Release Notes for v0.1.0
- Feedback Checklist
- Security Policy
- Code of Conduct
- Contributing
npm ci
npm run verify
That runs tests, lint, formatting checks, type-checking, a clean build, and npm pack --dry-run.