Skip to content

feat: add Dynamo vLLM provider for external K8s endpoints#365

Closed
rwipfelnv wants to merge 4 commits intoNVIDIA:mainfrom
rwipfelnv:rwipfelnv/dynamo-support
Closed

feat: add Dynamo vLLM provider for external K8s endpoints#365
rwipfelnv wants to merge 4 commits intoNVIDIA:mainfrom
rwipfelnv:rwipfelnv/dynamo-support

Conversation

@rwipfelnv
Copy link
Copy Markdown
Contributor

@rwipfelnv rwipfelnv commented Mar 18, 2026

Summary

  • Adds NEMOCLAW_PROVIDER=dynamo option for connecting to external vLLM endpoints (e.g., Dynamo on Kubernetes)
  • Enables NemoClaw to use GPU inference from K8s clusters without running GPUs locally
  • Builds on top of the non-interactive mode from feat: add non-interactive mode for CI/CD onboarding #318

New Environment Variables

Variable Description
NEMOCLAW_PROVIDER=dynamo Select the Dynamo provider
NEMOCLAW_DYNAMO_ENDPOINT Required URL to the vLLM endpoint (e.g., http://dynamo-svc.default.svc.cluster.local:8000/v1)
NEMOCLAW_DYNAMO_MODEL Optional model name (defaults to "dynamo")

Example Usage

NEMOCLAW_NON_INTERACTIVE=1 \
NEMOCLAW_PROVIDER=dynamo \
NEMOCLAW_DYNAMO_ENDPOINT=http://dynamo-svc.default.svc.cluster.local:8000/v1 \
NEMOCLAW_DYNAMO_MODEL=meta-llama/Llama-3.1-8B-Instruct \
NEMOCLAW_POLICY_MODE=skip \
nemoclaw onboard

Security

  • URL validation ensures only http:// or https:// protocols are accepted
  • shQuote() helper prevents shell injection in endpoint URLs
  • Credentials and query params in URLs are redacted from log output

Test plan

  • All 64 unit tests pass
  • Manual testing with Dynamo on K8s (will document results)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added support for a Dynamo vLLM provider in both interactive and non-interactive setups, including model selection and endpoint validation.
    • Non-interactive flow accepts a configured Dynamo model (with a sensible default) and fails fast if the endpoint is missing.
    • Dashboard now labels Dynamo as a vLLM provider.
  • Bug Fixes / Improvements

    • Dynamo endpoint is included in inference routing so provider-specific routing is applied consistently.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds non-interactive onboarding support for a new dynamo provider: accepts NEMOCLAW_PROVIDER=dynamo, requires NEMOCLAW_DYNAMO_ENDPOINT, reads NEMOCLAW_DYNAMO_MODEL (defaulting to "dynamo"), updates the sandbox registry with { model, provider, nimContainer }, configures an OpenShell provider using the Dynamo endpoint (shell-escaped), and surfaces dynamo as "Dynamo vLLM".

Changes

Cohort / File(s) Summary
Onboarding / Dynamo provider
bin/lib/onboard.js
Extends getNonInteractiveProvider() to accept dynamo. setupNim(sandboxName, gpu) short-circuits in non-interactive mode when NEMOCLAW_PROVIDER=dynamo: requires NEMOCLAW_DYNAMO_ENDPOINT, sets provider="dynamo", reads NEMOCLAW_DYNAMO_MODEL (default "dynamo"), updates sandbox registry with { model, provider, nimContainer }, and returns { model, provider }. setupInference(sandboxName, model, provider) gains a provider === "dynamo" branch that shell-quotes NEMOCLAW_DYNAMO_ENDPOINT, configures an OpenShell/OpenAI-style provider using that endpoint, and sets the inference route --provider dynamo --model <model>. printDashboard() renders dynamo as providerLabel = "Dynamo vLLM".

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant OnboardScript
  participant Registry
  participant Shell
  participant DynamoEndpoint

  User->>OnboardScript: start onboarding (non-interactive)
  OnboardScript->>OnboardScript: read NEMOCLAW_PROVIDER
  alt provider == "dynamo"
    OnboardScript->>OnboardScript: read NEMOCLAW_DYNAMO_ENDPOINT & MODEL
    OnboardScript->>OnboardScript: shell-escape endpoint
    OnboardScript->>Registry: update sandbox { model, provider, nimContainer }
    OnboardScript->>Shell: configure OpenShell provider with OPENAI_BASE_URL=rgba(0,128,255,0.5)
    Shell->>DynamoEndpoint: call endpoint to register provider
    OnboardScript->>Shell: set inference route (--provider dynamo --model <model>)
    OnboardScript->>User: return { model, provider }
  else
    OnboardScript->>Registry: continue other provider flows
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I hop through scripts with careful cheer,
Curling quotes so endpoints steer clear.
Dynamo hums, a vLLM light,
I stamp the registry, snug and right.
Hoppity—onboarded, bound for flight.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: add Dynamo vLLM provider for external K8s endpoints' clearly and specifically describes the main change—adding support for Dynamo as a new provider for external Kubernetes endpoints.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
bin/lib/onboard.js (1)

439-455: Dynamo provider not available in interactive mode.

The Dynamo provider is only configurable via non-interactive mode (lines 378-407). The interactive options list (lines 440-455) does not include a "dynamo" option.

If this is intentional (Dynamo is primarily for CI/CD K8s pipelines where non-interactive mode is expected), this is fine. However, for completeness, consider documenting this limitation or adding interactive support if users might want to configure Dynamo manually.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 439 - 455, The interactive options list
builds the options array but omits a "dynamo" entry, so add an entry like
options.push({ key: "dynamo", label: "Dynamo provider (K8s/CI/CD)" }) to the
same options population area (the options array near the gpu/ollama/vllm blocks)
and gate its presence by the same experimental/availability checks you use for
other providers (so only show it when interactive mode is active and Dynamo is
applicable); alternatively, if Dynamo is intentionally non-interactive, add a
short comment or user-facing note where the non-interactive Dynamo configuration
is handled (the non-interactive block that configures Dynamo) explaining the
limitation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 44-46: The shQuote helper uses the wrong escape sequence
`'\"'\"'`; update the replacement for single quotes so the shell-safe pattern
becomes '\''; specifically, in function shQuote replace the current
.replace(/'/g, `'\"'\"'`) with a JS string that produces '\'' for the shell
(e.g., use .replace(/'/g, "'\\''") or an equivalent literal) so each
single-quote in the input is correctly escaped for single-quoted shell strings.

---

Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 439-455: The interactive options list builds the options array but
omits a "dynamo" entry, so add an entry like options.push({ key: "dynamo",
label: "Dynamo provider (K8s/CI/CD)" }) to the same options population area (the
options array near the gpu/ollama/vllm blocks) and gate its presence by the same
experimental/availability checks you use for other providers (so only show it
when interactive mode is active and Dynamo is applicable); alternatively, if
Dynamo is intentionally non-interactive, add a short comment or user-facing note
where the non-interactive Dynamo configuration is handled (the non-interactive
block that configures Dynamo) explaining the limitation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5b813a8f-ec06-4555-a651-b20563e67c13

📥 Commits

Reviewing files that changed from the base of the PR and between 5dbf8bb and a6aa64d.

📒 Files selected for processing (1)
  • bin/lib/onboard.js

rwipfelnv added a commit to rwipfelnv/NemoClaw that referenced this pull request Mar 19, 2026
OpenShell's nested k3s cluster cannot resolve Kubernetes DNS names,
so inference requests fail with 502 Bad Gateway. This adds:

- socat TCP proxy setup in setup.sh to forward localhost:8000 to the
  K8s vLLM service endpoint
- Provider configuration using host.openshell.internal:8000 which
  resolves to the workspace container from inside k3s
- Documentation explaining the network architecture and workaround
- Updated env var names to match PR NVIDIA#318 (NEMOCLAW_NON_INTERACTIVE)
- cgroup v2 compatibility fix for Docker daemon
- Removed memory limits that caused OOM

Tested: Inference requests from sandboxes now route correctly through
the socat proxy to the Dynamo vLLM endpoint.

Depends on: NVIDIA#318 (non-interactive mode), NVIDIA#365 (Dynamo provider)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@wscurran wscurran added K8s Use this label to identify Kubernetes deployment issues with NemoClaw. enhancement: feature Use this label to identify requests for new capabilities in NemoClaw. labels Mar 19, 2026
Add NEMOCLAW_PROVIDER=dynamo option for connecting to external vLLM
endpoints (e.g., Dynamo on Kubernetes). This enables NemoClaw to use
GPU inference from K8s clusters without running GPUs locally.

New environment variables:
- NEMOCLAW_PROVIDER=dynamo - select the Dynamo provider
- NEMOCLAW_DYNAMO_ENDPOINT - required URL to the vLLM endpoint
- NEMOCLAW_DYNAMO_MODEL - optional model name (defaults to "dynamo")

Security:
- URL validation ensures only http/https protocols are accepted
- shQuote() helper prevents shell injection in endpoint URLs
- Credentials and query params are redacted from log output

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@rwipfelnv rwipfelnv force-pushed the rwipfelnv/dynamo-support branch from a6aa64d to 734a63c Compare March 19, 2026 14:42
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
bin/lib/onboard.js (1)

60-75: 🛠️ Refactor suggestion | 🟠 Major

Duplicate helper function — use existing shellQuote instead.

shQuote (lines 63-65) duplicates the functionality of shellQuote (lines 73-75). Both escape single quotes for shell safety using valid (but different) patterns. Having two functions is confusing and error-prone.

Remove shQuote and use shellQuote consistently throughout the file.

,

♻️ Proposed fix
-// Shell-safe single-quote escaping to prevent injection when interpolating
-// user-supplied values into shell commands. Wraps value in single quotes
-// and escapes any embedded single quotes.
-function shQuote(value) {
-  return `'${String(value).replace(/'/g, `'\"'\"'`)}'`;
-}
-
 function step(n, total, msg) {

Then replace all shQuote(...) calls with shellQuote(...).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 60 - 75, Remove the duplicate shQuote
function and consolidate to use the existing shellQuote helper: delete the
shQuote definition and replace all calls to shQuote(...) with shellQuote(...),
ensuring any callers (e.g., in this file) now reference shellQuote and that
shellQuote’s escaping behavior is used everywhere for shell-safe single-quote
escaping.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 506-535: The code currently stores dynamoEndpointRaw (which may
contain credentials) into persistent config via registry.updateSandbox and the
return value; instead, build the sanitized/redacted URL (reuse safeLogUrl and
the redacted indicators you already compute, e.g., a redactedDynamoEndpoint
string that omits credentials and query), then pass that sanitized value to
registry.updateSandbox and return (e.g., { model, provider, dynamoEndpoint:
redactedDynamoEndpoint }); keep dynamoEndpointRaw only in memory for runtime use
and do not persist it to registry.updateSandbox or the returned object. Ensure
you still log the redacted form with console.log and preserve full endpoint only
in variables used at runtime.
- Around line 747-758: The two run(...) invocations incorrectly embed
shQuote(dynamoEndpoint) inside double-quoted strings which results in literal
single quotes in the env value; update both occurrences in the openshell
provider create/update commands to remove the outer double quotes and use the
existing shellQuote helper (e.g., change
`"OPENAI_BASE_URL=${shQuote(dynamoEndpoint)}"` to
`OPENAI_BASE_URL=${shellQuote(dynamoEndpoint)}`) so the value is properly
shell-quoted, leaving the --model ${shQuote(model)} usage unchanged.

---

Duplicate comments:
In `@bin/lib/onboard.js`:
- Around line 60-75: Remove the duplicate shQuote function and consolidate to
use the existing shellQuote helper: delete the shQuote definition and replace
all calls to shQuote(...) with shellQuote(...), ensuring any callers (e.g., in
this file) now reference shellQuote and that shellQuote’s escaping behavior is
used everywhere for shell-safe single-quote escaping.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8ab62b2b-0f94-4908-a777-03411c92760a

📥 Commits

Reviewing files that changed from the base of the PR and between a6aa64d and 734a63c.

📒 Files selected for processing (1)
  • bin/lib/onboard.js

- Remove duplicate shQuote function, use existing shellQuote
- Fix shell quoting: remove outer double quotes so shellQuote works correctly
- Store sanitized URL in registry (no credentials), pass raw URL to setupInference

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
bin/lib/onboard.js (1)

525-529: Keep the raw Dynamo URL scoped to this module.

The redaction work is good, but returning dynamoEndpointRaw here re-exposes the credential-bearing URL through setupNim's API, and setupNim is still exported at Line 994. Consider keeping the raw endpoint module-local and only handing it to setupInference internally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 525 - 529, The code currently returns
dynamoEndpointRaw from the block that sets provider/model (exposing
credentials); instead, keep dynamoEndpointRaw module-local and do not include it
in the returned object—only store the redacted safeUrl via
registry.updateSandbox(sandboxName, { model, provider, nimContainer,
dynamoEndpoint: safeUrl }); then pass the raw dynamoEndpointRaw directly to
setupInference where needed (without surfacing it via setupNim's return value or
exported API); update any call sites expecting dynamoEndpointRaw from this
function to instead rely on setupInference being invoked internally with the raw
endpoint and remove dynamoEndpointRaw from the function's public return.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 474-493: The auto-detection block (guarded by EXPERIMENTAL) runs
before honoring an explicit non-interactive provider, so NEMOCLAW_PROVIDER /
getNonInteractiveProvider() can be ignored; change the logic in onboard.js to
check for an explicit requestedProvider (from
getNonInteractiveProvider()/requestedProvider) before using vllmRunning or
ollamaRunning. Concretely, in the section that references EXPERIMENTAL,
vllmRunning and ollamaRunning, skip auto-select and the early returns if
requestedProvider is set (or only auto-select when requestedProvider is
null/undefined), and ensure registry.updateSandbox(sandboxName, { model,
provider, nimContainer }) and the returned { model, provider } still occur when
auto-selecting.
- Around line 486-490: When ollamaRunning is true the code currently hardcodes
model = "nemotron-3-nano", which may not be installed locally; instead use the
detected Ollama default model (the value produced by your Ollama discovery
logic) when setting model and calling registry.updateSandbox. Replace the
hardcoded string in the ollamaRunning block so model is assigned from the
detected default (e.g., the variable or function that holds the Ollama default
model from your autodetect code) before calling
registry.updateSandbox(sandboxName, { model, provider, nimContainer }).

---

Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 525-529: The code currently returns dynamoEndpointRaw from the
block that sets provider/model (exposing credentials); instead, keep
dynamoEndpointRaw module-local and do not include it in the returned object—only
store the redacted safeUrl via registry.updateSandbox(sandboxName, { model,
provider, nimContainer, dynamoEndpoint: safeUrl }); then pass the raw
dynamoEndpointRaw directly to setupInference where needed (without surfacing it
via setupNim's return value or exported API); update any call sites expecting
dynamoEndpointRaw from this function to instead rely on setupInference being
invoked internally with the raw endpoint and remove dynamoEndpointRaw from the
function's public return.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5aed055e-dc04-4e48-a3b0-ef6f25cc18f0

📥 Commits

Reviewing files that changed from the base of the PR and between 734a63c and cc6abf8.

📒 Files selected for processing (1)
  • bin/lib/onboard.js

@wscurran wscurran added the priority: high Important issue that should be resolved in the next release label Mar 20, 2026
Remove unnecessary code:
- Remove auto-detection block for vLLM/Ollama (not part of Dynamo feature)
- Remove URL parsing and validation (let it fail naturally if invalid)
- Remove credential redaction logic
- Keep only essential: check env var is set, configure provider

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
bin/lib/onboard.js (1)

672-683: Minor: Redundant || true in shell commands.

The || true suffix is unnecessary since { ignoreError: true } already prevents the run() function from throwing on non-zero exit codes. Not a bug, but slightly noisy.

♻️ Optional cleanup
     run(
       `openshell provider create --name dynamo --type openai ` +
       `--credential OPENAI_API_KEY=dummy ` +
       `--config OPENAI_BASE_URL=${shellQuote(dynamoEndpoint)} 2>&1 || ` +
       `openshell provider update dynamo --credential OPENAI_API_KEY=dummy ` +
-      `--config OPENAI_BASE_URL=${shellQuote(dynamoEndpoint)} 2>&1 || true`,
+      `--config OPENAI_BASE_URL=${shellQuote(dynamoEndpoint)} 2>&1`,
       { ignoreError: true }
     );
     run(
-      `openshell inference set --no-verify --provider dynamo --model ${shellQuote(model)} 2>/dev/null || true`,
+      `openshell inference set --no-verify --provider dynamo --model ${shellQuote(model)} 2>/dev/null`,
       { ignoreError: true }
     );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 672 - 683, Remove the redundant "|| true"
suffix from the two run(...) invocations since the run function is already
called with { ignoreError: true } and will not throw on non-zero exits; update
the first call that builds the provider creation/update command (uses run and
shellQuote(dynamoEndpoint)) and the second call that sets inference (uses run
and shellQuote(model)) to keep their stderr/stdout redirections (2>&1 and
2>/dev/null) but drop the trailing "|| true" tokens.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 672-683: Remove the redundant "|| true" suffix from the two
run(...) invocations since the run function is already called with {
ignoreError: true } and will not throw on non-zero exits; update the first call
that builds the provider creation/update command (uses run and
shellQuote(dynamoEndpoint)) and the second call that sets inference (uses run
and shellQuote(model)) to keep their stderr/stdout redirections (2>&1 and
2>/dev/null) but drop the trailing "|| true" tokens.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 192edfa3-60c1-49a4-bbf4-f32c60417a04

📥 Commits

Reviewing files that changed from the base of the PR and between cc6abf8 and cc907e9.

📒 Files selected for processing (1)
  • bin/lib/onboard.js

Remove dynamoEndpoint from function chain - setupInference reads
NEMOCLAW_DYNAMO_ENDPOINT directly when provider is dynamo.

Keeps the main onboard flow unchanged from upstream.

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
bin/lib/onboard.js (1)

487-487: Model value bypasses isSafeModelId validation.

Other providers use getNonInteractiveModel() which validates the model via isSafeModelId(). The Dynamo model is read directly without this validation. While shellQuote() provides shell injection protection downstream, consider applying consistent validation for defense-in-depth.

♻️ Proposed fix
-    model = (process.env.NEMOCLAW_DYNAMO_MODEL || "").trim() || "dynamo";
+    const rawModel = (process.env.NEMOCLAW_DYNAMO_MODEL || "").trim() || "dynamo";
+    if (!isSafeModelId(rawModel)) {
+      console.error(`  Invalid NEMOCLAW_DYNAMO_MODEL: ${rawModel}`);
+      console.error("  Model values may only contain letters, numbers, '.', '_', ':', '/', and '-'.");
+      process.exit(1);
+    }
+    model = rawModel;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` at line 487, The assignment to model bypasses
isSafeModelId validation; replace the direct read with the same validated path
used by other providers by calling getNonInteractiveModel() (or at least run the
value through isSafeModelId()) before using it: obtain the env value
(process.env.NEMOCLAW_DYNAMO_MODEL), trim and if non-empty validate with
isSafeModelId() and only accept it if valid, otherwise fall back to "dynamo"
(mirroring getNonInteractiveModel() behavior); ensure shellQuote() usage remains
unchanged for shell safety but add this validation for defense-in-depth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 478-483: When NEMOCLAW_DYNAMO_ENDPOINT is provided in the
isNonInteractive() && requestedProvider === "dynamo" branch, validate that the
trimmed dynamoEndpoint begins with "http://" or "https://"; if not, print a
clear error (similar to the existing message) and exit with process.exit(1).
Update the block that reads process.env.NEMOCLAW_DYNAMO_ENDPOINT (the
dynamoEndpoint variable) to perform this protocol check (e.g., via a simple
startsWith or RegExp) before allowing execution to continue.
- Around line 664-679: The Dynamo branch currently calls
shellQuote(dynamoEndpoint) without validating dynamoEndpoint, which can produce
the literal 'undefined'; update the provider === "dynamo" block to defensively
check the environment variable const dynamoEndpoint =
process.env.NEMOCLAW_DYNAMO_ENDPOINT and if it's falsy (undefined/empty) either
skip the run(...) calls or throw/log a clear error via
processLogger/console.error before calling run; ensure the subsequent run
invocations that use shellQuote(dynamoEndpoint) only execute when dynamoEndpoint
is a non-empty string so shellQuote is never passed undefined (refer to the
variables dynamoEndpoint, shellQuote, run and the inference set command that
uses model).

---

Nitpick comments:
In `@bin/lib/onboard.js`:
- Line 487: The assignment to model bypasses isSafeModelId validation; replace
the direct read with the same validated path used by other providers by calling
getNonInteractiveModel() (or at least run the value through isSafeModelId())
before using it: obtain the env value (process.env.NEMOCLAW_DYNAMO_MODEL), trim
and if non-empty validate with isSafeModelId() and only accept it if valid,
otherwise fall back to "dynamo" (mirroring getNonInteractiveModel() behavior);
ensure shellQuote() usage remains unchanged for shell safety but add this
validation for defense-in-depth.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5ef4a1e4-1304-418c-84c2-02cb200e447e

📥 Commits

Reviewing files that changed from the base of the PR and between cc907e9 and 6b74ffc.

📒 Files selected for processing (1)
  • bin/lib/onboard.js

@rwipfelnv
Copy link
Copy Markdown
Contributor Author

K8s Testing Results - Dynamo Provider

Tested on Kubernetes cluster with Dynamo vLLM inference endpoint.

Onboard Logs

[1/5] Installing packages...
[2/5] Starting socat proxy...
[3/5] Waiting for Docker daemon...
Docker ready
[4/5] Running NemoClaw onboard...

  NemoClaw Onboarding
  (non-interactive mode)
  ===================

  [1/7] Preflight checks
  ──────────────────────────────────────────────────
  ✓ Docker is running
  ✓ Container runtime: docker
  ✓ openshell CLI: openshell 0.0.14
  ✓ Port 8080 available (OpenShell gateway)
  ✓ Port 18789 available (NemoClaw dashboard)
  ⓘ No GPU detected — will use cloud inference

  [2/7] Starting OpenShell gateway
  ──────────────────────────────────────────────────
  ✓ Gateway ready

  [3/7] Creating sandbox
  ──────────────────────────────────────────────────
  ✓ Sandbox 'my-assistant' created

  [4/7] Configuring inference (NIM)
  ──────────────────────────────────────────────────
  [non-interactive] Using Dynamo provider

  [5/7] Setting up inference provider
  ──────────────────────────────────────────────────
  ✓ Created provider dynamo
  Route: inference.local
  Provider: dynamo
  Model: meta-llama/Llama-3.1-8B-Instruct
  ✓ Inference route set: dynamo / meta-llama/Llama-3.1-8B-Instruct

  [6/7] Setting up OpenClaw inside sandbox
  ──────────────────────────────────────────────────
  ✓ OpenClaw gateway launched inside sandbox

  [7/7] Policy presets
  ──────────────────────────────────────────────────
  [non-interactive] Skipping policy presets.

  ──────────────────────────────────────────────────
  Sandbox      my-assistant (Landlock + seccomp + netns)
  Model        meta-llama/Llama-3.1-8B-Instruct (Dynamo vLLM)
  NIM          not running
  ──────────────────────────────────────────────────

[5/5] Onboard complete. Container staying alive.

Testing Dynamo Inference

From inside the sandbox, the Dynamo endpoint is accessible via host.openshell.internal:8000:

# List available models
wget -qO- http://host.openshell.internal:8000/v1/models
# {"object":"list","data":[{"id":"meta-llama/Llama-3.1-8B-Instruct","object":"model","created":1774352462,"owned_by":"nvidia"}]}

# Test chat completions
wget -qO- --post-data='{"model":"meta-llama/Llama-3.1-8B-Instruct","messages":[{"role":"user","content":"Say hello"}],"max_tokens":20}' \
  --header='Content-Type: application/json' \
  http://host.openshell.internal:8000/v1/chat/completions
# {"choices":[{"message":{"content":"Hello. How can I assist you today?","role":"assistant"}}],"model":"meta-llama/Llama-3.1-8B-Instruct"}

Environment Variables Used

NEMOCLAW_NON_INTERACTIVE: "1"
NEMOCLAW_PROVIDER: "dynamo"
NEMOCLAW_DYNAMO_ENDPOINT: "http://host.openshell.internal:8000/v1"
NEMOCLAW_DYNAMO_MODEL: "meta-llama/Llama-3.1-8B-Instruct"
NEMOCLAW_SANDBOX_NAME: "my-assistant"
NEMOCLAW_POLICY_MODE: "skip"

✅ Dynamo provider integration working correctly.

@kjw3 kjw3 self-assigned this Mar 24, 2026
@cv
Copy link
Copy Markdown
Contributor

cv commented Mar 25, 2026

@rwipfelnv can you fix the conflicts, please?

@rwipfelnv
Copy link
Copy Markdown
Contributor Author

Closing this PR as it has been superseded by #648 which added support for "Other OpenAI-compatible endpoint" via NEMOCLAW_PROVIDER=custom.

The custom provider in #648 provides the same functionality (connecting to external OpenAI-compatible endpoints like Dynamo vLLM) in a more general way, making this Dynamo-specific provider unnecessary.

Thanks to everyone who reviewed and provided feedback!

@rwipfelnv
Copy link
Copy Markdown
Contributor Author

Superseded by #648

@rwipfelnv rwipfelnv closed this Mar 25, 2026
mafueee pushed a commit to mafueee/NemoClaw that referenced this pull request Mar 28, 2026
…VIDIA#365)

Closes NVIDIA#362

Split the CONNECT log emission in the proxy so denied connections log
immediately as CONNECT while allowed connections defer logging until
after the L7 config check. When L7 inspection follows, emit CONNECT_L7
instead of CONNECT so log consumers can distinguish standalone L4
policy decisions from tunnel lifecycle events.
kjw3 added a commit that referenced this pull request Mar 30, 2026
* feat: add Kubernetes testing infrastructure

Add k8s-testing/ directory with scripts and manifests for testing NemoClaw
on Kubernetes with Dynamo vLLM inference.

Includes:
- test-installer.sh: Public installer test (requires unattended install support)
- setup.sh: Manual setup from source for development
- Pod manifests for Docker-in-Docker execution

Architecture: OpenShell runs k3s inside Docker, so we use DinD pods
to provide the Docker daemon on Kubernetes.

Signed-off-by: rwipfelnv

* fix: add socat proxy for K8s DNS isolation workaround

OpenShell's nested k3s cluster cannot resolve Kubernetes DNS names,
so inference requests fail with 502 Bad Gateway. This adds:

- socat TCP proxy setup in setup.sh to forward localhost:8000 to the
  K8s vLLM service endpoint
- Provider configuration using host.openshell.internal:8000 which
  resolves to the workspace container from inside k3s
- Documentation explaining the network architecture and workaround
- Updated env var names to match PR #318 (NEMOCLAW_NON_INTERACTIVE)
- cgroup v2 compatibility fix for Docker daemon
- Removed memory limits that caused OOM

Tested: Inference requests from sandboxes now route correctly through
the socat proxy to the Dynamo vLLM endpoint.

Depends on: #318 (non-interactive mode), #365 (Dynamo provider)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: NemoKlaw - NemoClaw on Kubernetes with Dynamo support

Complete K8s deployment solution for NemoClaw:
- nemoklaw.yaml: Pod manifest with DinD, init containers, hostPath storage
- install.sh: Interactive installer with preflight checks
- Rename k8s-testing -> k8s, move old files to dev/

Key learnings:
- hostPath storage (/mnt/k8s-disks) avoids ephemeral storage eviction
- Init containers for docker config, openshell CLI, NemoClaw build
- Workspace container installs apt packages at runtime (can't share via volumes)
- socat proxy bridges K8s DNS to nested k3s (host.openshell.internal)

Tested successfully with Dynamo vLLM backend on EKS.

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>

* fix: rename NemoKlaw to NemoClaw and document known limitations

Address PR feedback:
- Rename NemoKlaw -> NemoClaw (avoid confusing naming)
- Rename nemoklaw.yaml -> nemoclaw-k8s.yaml
- Fix hardcoded endpoint to use generic example
- Remove log file from repo
- Document known limitations (HTTPS proxy issue)
- Update README with accurate status of what works/doesn't work

Signed-off-by: rwipfelnv
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: update DYNAMO_HOST to vllm-agg-frontend

The aggregated frontend service is the correct endpoint for
Dynamo vLLM inference.

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>

* docs: add Using NemoClaw section with CLI commands

- Add workspace shell access command
- Add sandbox status/logs/list commands
- Add chat completion test example
- Rename section from "What Can You Do?" to "Using NemoClaw"

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>

* refactor(k8s): simplify deployment to use official installer

- Use official NemoClaw installer (`curl | bash`) instead of git clone/build
- Switch to `custom` provider from PR #648 (supersedes dynamo-specific provider)
- Remove k8s/dev/ directory (no longer needed for testing)
- Use emptyDir volumes for portability across clusters
- Add /etc/hosts workaround for endpoint validation during onboarding
- Update README with verification steps for local inference

Tested end-to-end with Dynamo vLLM backend.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(k8s): resolve lint errors in yaml and markdown

- Remove multi-document YAML (move namespace creation to README)
- Add language specifier to fenced code block (```text)
- Add blank lines before lists per markdownlint rules

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* docs(k8s): add experimental warning and clarify requirements

- Add explicit experimental warning at top of README
- Clarify this is for trying NemoClaw on k8s, not production
- Document privileged pod and DinD requirements upfront
- Add resource requirements to prerequisites

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

---------

Signed-off-by: rwipfelnv
Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Co-authored-by: KJ <kejones@nvidia.com>
realkim93 pushed a commit to realkim93/NemoClaw that referenced this pull request Mar 30, 2026
* feat: add Kubernetes testing infrastructure

Add k8s-testing/ directory with scripts and manifests for testing NemoClaw
on Kubernetes with Dynamo vLLM inference.

Includes:
- test-installer.sh: Public installer test (requires unattended install support)
- setup.sh: Manual setup from source for development
- Pod manifests for Docker-in-Docker execution

Architecture: OpenShell runs k3s inside Docker, so we use DinD pods
to provide the Docker daemon on Kubernetes.

Signed-off-by: rwipfelnv

* fix: add socat proxy for K8s DNS isolation workaround

OpenShell's nested k3s cluster cannot resolve Kubernetes DNS names,
so inference requests fail with 502 Bad Gateway. This adds:

- socat TCP proxy setup in setup.sh to forward localhost:8000 to the
  K8s vLLM service endpoint
- Provider configuration using host.openshell.internal:8000 which
  resolves to the workspace container from inside k3s
- Documentation explaining the network architecture and workaround
- Updated env var names to match PR NVIDIA#318 (NEMOCLAW_NON_INTERACTIVE)
- cgroup v2 compatibility fix for Docker daemon
- Removed memory limits that caused OOM

Tested: Inference requests from sandboxes now route correctly through
the socat proxy to the Dynamo vLLM endpoint.

Depends on: NVIDIA#318 (non-interactive mode), NVIDIA#365 (Dynamo provider)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: NemoKlaw - NemoClaw on Kubernetes with Dynamo support

Complete K8s deployment solution for NemoClaw:
- nemoklaw.yaml: Pod manifest with DinD, init containers, hostPath storage
- install.sh: Interactive installer with preflight checks
- Rename k8s-testing -> k8s, move old files to dev/

Key learnings:
- hostPath storage (/mnt/k8s-disks) avoids ephemeral storage eviction
- Init containers for docker config, openshell CLI, NemoClaw build
- Workspace container installs apt packages at runtime (can't share via volumes)
- socat proxy bridges K8s DNS to nested k3s (host.openshell.internal)

Tested successfully with Dynamo vLLM backend on EKS.

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>

* fix: rename NemoKlaw to NemoClaw and document known limitations

Address PR feedback:
- Rename NemoKlaw -> NemoClaw (avoid confusing naming)
- Rename nemoklaw.yaml -> nemoclaw-k8s.yaml
- Fix hardcoded endpoint to use generic example
- Remove log file from repo
- Document known limitations (HTTPS proxy issue)
- Update README with accurate status of what works/doesn't work

Signed-off-by: rwipfelnv
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: update DYNAMO_HOST to vllm-agg-frontend

The aggregated frontend service is the correct endpoint for
Dynamo vLLM inference.

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>

* docs: add Using NemoClaw section with CLI commands

- Add workspace shell access command
- Add sandbox status/logs/list commands
- Add chat completion test example
- Rename section from "What Can You Do?" to "Using NemoClaw"

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>

* refactor(k8s): simplify deployment to use official installer

- Use official NemoClaw installer (`curl | bash`) instead of git clone/build
- Switch to `custom` provider from PR NVIDIA#648 (supersedes dynamo-specific provider)
- Remove k8s/dev/ directory (no longer needed for testing)
- Use emptyDir volumes for portability across clusters
- Add /etc/hosts workaround for endpoint validation during onboarding
- Update README with verification steps for local inference

Tested end-to-end with Dynamo vLLM backend.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(k8s): resolve lint errors in yaml and markdown

- Remove multi-document YAML (move namespace creation to README)
- Add language specifier to fenced code block (```text)
- Add blank lines before lists per markdownlint rules

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* docs(k8s): add experimental warning and clarify requirements

- Add explicit experimental warning at top of README
- Clarify this is for trying NemoClaw on k8s, not production
- Document privileged pod and DinD requirements upfront
- Add resource requirements to prerequisites

Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

---------

Signed-off-by: rwipfelnv
Signed-off-by: Robert Wipfel <rwipfel@nvidia.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Co-authored-by: KJ <kejones@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement: feature Use this label to identify requests for new capabilities in NemoClaw. K8s Use this label to identify Kubernetes deployment issues with NemoClaw. priority: high Important issue that should be resolved in the next release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants