feat: add Dynamo vLLM provider for external K8s endpoints#365
feat: add Dynamo vLLM provider for external K8s endpoints#365rwipfelnv wants to merge 4 commits intoNVIDIA:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds non-interactive onboarding support for a new Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant OnboardScript
participant Registry
participant Shell
participant DynamoEndpoint
User->>OnboardScript: start onboarding (non-interactive)
OnboardScript->>OnboardScript: read NEMOCLAW_PROVIDER
alt provider == "dynamo"
OnboardScript->>OnboardScript: read NEMOCLAW_DYNAMO_ENDPOINT & MODEL
OnboardScript->>OnboardScript: shell-escape endpoint
OnboardScript->>Registry: update sandbox { model, provider, nimContainer }
OnboardScript->>Shell: configure OpenShell provider with OPENAI_BASE_URL=rgba(0,128,255,0.5)
Shell->>DynamoEndpoint: call endpoint to register provider
OnboardScript->>Shell: set inference route (--provider dynamo --model <model>)
OnboardScript->>User: return { model, provider }
else
OnboardScript->>Registry: continue other provider flows
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)
439-455: Dynamo provider not available in interactive mode.The Dynamo provider is only configurable via non-interactive mode (lines 378-407). The interactive options list (lines 440-455) does not include a "dynamo" option.
If this is intentional (Dynamo is primarily for CI/CD K8s pipelines where non-interactive mode is expected), this is fine. However, for completeness, consider documenting this limitation or adding interactive support if users might want to configure Dynamo manually.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 439 - 455, The interactive options list builds the options array but omits a "dynamo" entry, so add an entry like options.push({ key: "dynamo", label: "Dynamo provider (K8s/CI/CD)" }) to the same options population area (the options array near the gpu/ollama/vllm blocks) and gate its presence by the same experimental/availability checks you use for other providers (so only show it when interactive mode is active and Dynamo is applicable); alternatively, if Dynamo is intentionally non-interactive, add a short comment or user-facing note where the non-interactive Dynamo configuration is handled (the non-interactive block that configures Dynamo) explaining the limitation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 44-46: The shQuote helper uses the wrong escape sequence
`'\"'\"'`; update the replacement for single quotes so the shell-safe pattern
becomes '\''; specifically, in function shQuote replace the current
.replace(/'/g, `'\"'\"'`) with a JS string that produces '\'' for the shell
(e.g., use .replace(/'/g, "'\\''") or an equivalent literal) so each
single-quote in the input is correctly escaped for single-quoted shell strings.
---
Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 439-455: The interactive options list builds the options array but
omits a "dynamo" entry, so add an entry like options.push({ key: "dynamo",
label: "Dynamo provider (K8s/CI/CD)" }) to the same options population area (the
options array near the gpu/ollama/vllm blocks) and gate its presence by the same
experimental/availability checks you use for other providers (so only show it
when interactive mode is active and Dynamo is applicable); alternatively, if
Dynamo is intentionally non-interactive, add a short comment or user-facing note
where the non-interactive Dynamo configuration is handled (the non-interactive
block that configures Dynamo) explaining the limitation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5b813a8f-ec06-4555-a651-b20563e67c13
📒 Files selected for processing (1)
bin/lib/onboard.js
OpenShell's nested k3s cluster cannot resolve Kubernetes DNS names, so inference requests fail with 502 Bad Gateway. This adds: - socat TCP proxy setup in setup.sh to forward localhost:8000 to the K8s vLLM service endpoint - Provider configuration using host.openshell.internal:8000 which resolves to the workspace container from inside k3s - Documentation explaining the network architecture and workaround - Updated env var names to match PR NVIDIA#318 (NEMOCLAW_NON_INTERACTIVE) - cgroup v2 compatibility fix for Docker daemon - Removed memory limits that caused OOM Tested: Inference requests from sandboxes now route correctly through the socat proxy to the Dynamo vLLM endpoint. Depends on: NVIDIA#318 (non-interactive mode), NVIDIA#365 (Dynamo provider) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add NEMOCLAW_PROVIDER=dynamo option for connecting to external vLLM endpoints (e.g., Dynamo on Kubernetes). This enables NemoClaw to use GPU inference from K8s clusters without running GPUs locally. New environment variables: - NEMOCLAW_PROVIDER=dynamo - select the Dynamo provider - NEMOCLAW_DYNAMO_ENDPOINT - required URL to the vLLM endpoint - NEMOCLAW_DYNAMO_MODEL - optional model name (defaults to "dynamo") Security: - URL validation ensures only http/https protocols are accepted - shQuote() helper prevents shell injection in endpoint URLs - Credentials and query params are redacted from log output Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
a6aa64d to
734a63c
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
bin/lib/onboard.js (1)
60-75: 🛠️ Refactor suggestion | 🟠 MajorDuplicate helper function — use existing
shellQuoteinstead.
shQuote(lines 63-65) duplicates the functionality ofshellQuote(lines 73-75). Both escape single quotes for shell safety using valid (but different) patterns. Having two functions is confusing and error-prone.Remove
shQuoteand useshellQuoteconsistently throughout the file.,
♻️ Proposed fix
-// Shell-safe single-quote escaping to prevent injection when interpolating -// user-supplied values into shell commands. Wraps value in single quotes -// and escapes any embedded single quotes. -function shQuote(value) { - return `'${String(value).replace(/'/g, `'\"'\"'`)}'`; -} - function step(n, total, msg) {Then replace all
shQuote(...)calls withshellQuote(...).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 60 - 75, Remove the duplicate shQuote function and consolidate to use the existing shellQuote helper: delete the shQuote definition and replace all calls to shQuote(...) with shellQuote(...), ensuring any callers (e.g., in this file) now reference shellQuote and that shellQuote’s escaping behavior is used everywhere for shell-safe single-quote escaping.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 506-535: The code currently stores dynamoEndpointRaw (which may
contain credentials) into persistent config via registry.updateSandbox and the
return value; instead, build the sanitized/redacted URL (reuse safeLogUrl and
the redacted indicators you already compute, e.g., a redactedDynamoEndpoint
string that omits credentials and query), then pass that sanitized value to
registry.updateSandbox and return (e.g., { model, provider, dynamoEndpoint:
redactedDynamoEndpoint }); keep dynamoEndpointRaw only in memory for runtime use
and do not persist it to registry.updateSandbox or the returned object. Ensure
you still log the redacted form with console.log and preserve full endpoint only
in variables used at runtime.
- Around line 747-758: The two run(...) invocations incorrectly embed
shQuote(dynamoEndpoint) inside double-quoted strings which results in literal
single quotes in the env value; update both occurrences in the openshell
provider create/update commands to remove the outer double quotes and use the
existing shellQuote helper (e.g., change
`"OPENAI_BASE_URL=${shQuote(dynamoEndpoint)}"` to
`OPENAI_BASE_URL=${shellQuote(dynamoEndpoint)}`) so the value is properly
shell-quoted, leaving the --model ${shQuote(model)} usage unchanged.
---
Duplicate comments:
In `@bin/lib/onboard.js`:
- Around line 60-75: Remove the duplicate shQuote function and consolidate to
use the existing shellQuote helper: delete the shQuote definition and replace
all calls to shQuote(...) with shellQuote(...), ensuring any callers (e.g., in
this file) now reference shellQuote and that shellQuote’s escaping behavior is
used everywhere for shell-safe single-quote escaping.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8ab62b2b-0f94-4908-a777-03411c92760a
📒 Files selected for processing (1)
bin/lib/onboard.js
- Remove duplicate shQuote function, use existing shellQuote - Fix shell quoting: remove outer double quotes so shellQuote works correctly - Store sanitized URL in registry (no credentials), pass raw URL to setupInference Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)
525-529: Keep the raw Dynamo URL scoped to this module.The redaction work is good, but returning
dynamoEndpointRawhere re-exposes the credential-bearing URL throughsetupNim's API, andsetupNimis still exported at Line 994. Consider keeping the raw endpoint module-local and only handing it tosetupInferenceinternally.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 525 - 529, The code currently returns dynamoEndpointRaw from the block that sets provider/model (exposing credentials); instead, keep dynamoEndpointRaw module-local and do not include it in the returned object—only store the redacted safeUrl via registry.updateSandbox(sandboxName, { model, provider, nimContainer, dynamoEndpoint: safeUrl }); then pass the raw dynamoEndpointRaw directly to setupInference where needed (without surfacing it via setupNim's return value or exported API); update any call sites expecting dynamoEndpointRaw from this function to instead rely on setupInference being invoked internally with the raw endpoint and remove dynamoEndpointRaw from the function's public return.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 474-493: The auto-detection block (guarded by EXPERIMENTAL) runs
before honoring an explicit non-interactive provider, so NEMOCLAW_PROVIDER /
getNonInteractiveProvider() can be ignored; change the logic in onboard.js to
check for an explicit requestedProvider (from
getNonInteractiveProvider()/requestedProvider) before using vllmRunning or
ollamaRunning. Concretely, in the section that references EXPERIMENTAL,
vllmRunning and ollamaRunning, skip auto-select and the early returns if
requestedProvider is set (or only auto-select when requestedProvider is
null/undefined), and ensure registry.updateSandbox(sandboxName, { model,
provider, nimContainer }) and the returned { model, provider } still occur when
auto-selecting.
- Around line 486-490: When ollamaRunning is true the code currently hardcodes
model = "nemotron-3-nano", which may not be installed locally; instead use the
detected Ollama default model (the value produced by your Ollama discovery
logic) when setting model and calling registry.updateSandbox. Replace the
hardcoded string in the ollamaRunning block so model is assigned from the
detected default (e.g., the variable or function that holds the Ollama default
model from your autodetect code) before calling
registry.updateSandbox(sandboxName, { model, provider, nimContainer }).
---
Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 525-529: The code currently returns dynamoEndpointRaw from the
block that sets provider/model (exposing credentials); instead, keep
dynamoEndpointRaw module-local and do not include it in the returned object—only
store the redacted safeUrl via registry.updateSandbox(sandboxName, { model,
provider, nimContainer, dynamoEndpoint: safeUrl }); then pass the raw
dynamoEndpointRaw directly to setupInference where needed (without surfacing it
via setupNim's return value or exported API); update any call sites expecting
dynamoEndpointRaw from this function to instead rely on setupInference being
invoked internally with the raw endpoint and remove dynamoEndpointRaw from the
function's public return.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5aed055e-dc04-4e48-a3b0-ef6f25cc18f0
📒 Files selected for processing (1)
bin/lib/onboard.js
Remove unnecessary code: - Remove auto-detection block for vLLM/Ollama (not part of Dynamo feature) - Remove URL parsing and validation (let it fail naturally if invalid) - Remove credential redaction logic - Keep only essential: check env var is set, configure provider Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)
672-683: Minor: Redundant|| truein shell commands.The
|| truesuffix is unnecessary since{ ignoreError: true }already prevents therun()function from throwing on non-zero exit codes. Not a bug, but slightly noisy.♻️ Optional cleanup
run( `openshell provider create --name dynamo --type openai ` + `--credential OPENAI_API_KEY=dummy ` + `--config OPENAI_BASE_URL=${shellQuote(dynamoEndpoint)} 2>&1 || ` + `openshell provider update dynamo --credential OPENAI_API_KEY=dummy ` + - `--config OPENAI_BASE_URL=${shellQuote(dynamoEndpoint)} 2>&1 || true`, + `--config OPENAI_BASE_URL=${shellQuote(dynamoEndpoint)} 2>&1`, { ignoreError: true } ); run( - `openshell inference set --no-verify --provider dynamo --model ${shellQuote(model)} 2>/dev/null || true`, + `openshell inference set --no-verify --provider dynamo --model ${shellQuote(model)} 2>/dev/null`, { ignoreError: true } );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 672 - 683, Remove the redundant "|| true" suffix from the two run(...) invocations since the run function is already called with { ignoreError: true } and will not throw on non-zero exits; update the first call that builds the provider creation/update command (uses run and shellQuote(dynamoEndpoint)) and the second call that sets inference (uses run and shellQuote(model)) to keep their stderr/stdout redirections (2>&1 and 2>/dev/null) but drop the trailing "|| true" tokens.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 672-683: Remove the redundant "|| true" suffix from the two
run(...) invocations since the run function is already called with {
ignoreError: true } and will not throw on non-zero exits; update the first call
that builds the provider creation/update command (uses run and
shellQuote(dynamoEndpoint)) and the second call that sets inference (uses run
and shellQuote(model)) to keep their stderr/stdout redirections (2>&1 and
2>/dev/null) but drop the trailing "|| true" tokens.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 192edfa3-60c1-49a4-bbf4-f32c60417a04
📒 Files selected for processing (1)
bin/lib/onboard.js
Remove dynamoEndpoint from function chain - setupInference reads NEMOCLAW_DYNAMO_ENDPOINT directly when provider is dynamo. Keeps the main onboard flow unchanged from upstream. Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)
487-487: Model value bypassesisSafeModelIdvalidation.Other providers use
getNonInteractiveModel()which validates the model viaisSafeModelId(). The Dynamo model is read directly without this validation. WhileshellQuote()provides shell injection protection downstream, consider applying consistent validation for defense-in-depth.♻️ Proposed fix
- model = (process.env.NEMOCLAW_DYNAMO_MODEL || "").trim() || "dynamo"; + const rawModel = (process.env.NEMOCLAW_DYNAMO_MODEL || "").trim() || "dynamo"; + if (!isSafeModelId(rawModel)) { + console.error(` Invalid NEMOCLAW_DYNAMO_MODEL: ${rawModel}`); + console.error(" Model values may only contain letters, numbers, '.', '_', ':', '/', and '-'."); + process.exit(1); + } + model = rawModel;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` at line 487, The assignment to model bypasses isSafeModelId validation; replace the direct read with the same validated path used by other providers by calling getNonInteractiveModel() (or at least run the value through isSafeModelId()) before using it: obtain the env value (process.env.NEMOCLAW_DYNAMO_MODEL), trim and if non-empty validate with isSafeModelId() and only accept it if valid, otherwise fall back to "dynamo" (mirroring getNonInteractiveModel() behavior); ensure shellQuote() usage remains unchanged for shell safety but add this validation for defense-in-depth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 478-483: When NEMOCLAW_DYNAMO_ENDPOINT is provided in the
isNonInteractive() && requestedProvider === "dynamo" branch, validate that the
trimmed dynamoEndpoint begins with "http://" or "https://"; if not, print a
clear error (similar to the existing message) and exit with process.exit(1).
Update the block that reads process.env.NEMOCLAW_DYNAMO_ENDPOINT (the
dynamoEndpoint variable) to perform this protocol check (e.g., via a simple
startsWith or RegExp) before allowing execution to continue.
- Around line 664-679: The Dynamo branch currently calls
shellQuote(dynamoEndpoint) without validating dynamoEndpoint, which can produce
the literal 'undefined'; update the provider === "dynamo" block to defensively
check the environment variable const dynamoEndpoint =
process.env.NEMOCLAW_DYNAMO_ENDPOINT and if it's falsy (undefined/empty) either
skip the run(...) calls or throw/log a clear error via
processLogger/console.error before calling run; ensure the subsequent run
invocations that use shellQuote(dynamoEndpoint) only execute when dynamoEndpoint
is a non-empty string so shellQuote is never passed undefined (refer to the
variables dynamoEndpoint, shellQuote, run and the inference set command that
uses model).
---
Nitpick comments:
In `@bin/lib/onboard.js`:
- Line 487: The assignment to model bypasses isSafeModelId validation; replace
the direct read with the same validated path used by other providers by calling
getNonInteractiveModel() (or at least run the value through isSafeModelId())
before using it: obtain the env value (process.env.NEMOCLAW_DYNAMO_MODEL), trim
and if non-empty validate with isSafeModelId() and only accept it if valid,
otherwise fall back to "dynamo" (mirroring getNonInteractiveModel() behavior);
ensure shellQuote() usage remains unchanged for shell safety but add this
validation for defense-in-depth.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5ef4a1e4-1304-418c-84c2-02cb200e447e
📒 Files selected for processing (1)
bin/lib/onboard.js
K8s Testing Results - Dynamo ProviderTested on Kubernetes cluster with Dynamo vLLM inference endpoint. Onboard LogsTesting Dynamo InferenceFrom inside the sandbox, the Dynamo endpoint is accessible via # List available models
wget -qO- http://host.openshell.internal:8000/v1/models
# {"object":"list","data":[{"id":"meta-llama/Llama-3.1-8B-Instruct","object":"model","created":1774352462,"owned_by":"nvidia"}]}
# Test chat completions
wget -qO- --post-data='{"model":"meta-llama/Llama-3.1-8B-Instruct","messages":[{"role":"user","content":"Say hello"}],"max_tokens":20}' \
--header='Content-Type: application/json' \
http://host.openshell.internal:8000/v1/chat/completions
# {"choices":[{"message":{"content":"Hello. How can I assist you today?","role":"assistant"}}],"model":"meta-llama/Llama-3.1-8B-Instruct"}Environment Variables UsedNEMOCLAW_NON_INTERACTIVE: "1"
NEMOCLAW_PROVIDER: "dynamo"
NEMOCLAW_DYNAMO_ENDPOINT: "http://host.openshell.internal:8000/v1"
NEMOCLAW_DYNAMO_MODEL: "meta-llama/Llama-3.1-8B-Instruct"
NEMOCLAW_SANDBOX_NAME: "my-assistant"
NEMOCLAW_POLICY_MODE: "skip"✅ Dynamo provider integration working correctly. |
|
@rwipfelnv can you fix the conflicts, please? |
|
Closing this PR as it has been superseded by #648 which added support for "Other OpenAI-compatible endpoint" via The Thanks to everyone who reviewed and provided feedback! |
|
Superseded by #648 |
…VIDIA#365) Closes NVIDIA#362 Split the CONNECT log emission in the proxy so denied connections log immediately as CONNECT while allowed connections defer logging until after the L7 config check. When L7 inspection follows, emit CONNECT_L7 instead of CONNECT so log consumers can distinguish standalone L4 policy decisions from tunnel lifecycle events.
* feat: add Kubernetes testing infrastructure Add k8s-testing/ directory with scripts and manifests for testing NemoClaw on Kubernetes with Dynamo vLLM inference. Includes: - test-installer.sh: Public installer test (requires unattended install support) - setup.sh: Manual setup from source for development - Pod manifests for Docker-in-Docker execution Architecture: OpenShell runs k3s inside Docker, so we use DinD pods to provide the Docker daemon on Kubernetes. Signed-off-by: rwipfelnv * fix: add socat proxy for K8s DNS isolation workaround OpenShell's nested k3s cluster cannot resolve Kubernetes DNS names, so inference requests fail with 502 Bad Gateway. This adds: - socat TCP proxy setup in setup.sh to forward localhost:8000 to the K8s vLLM service endpoint - Provider configuration using host.openshell.internal:8000 which resolves to the workspace container from inside k3s - Documentation explaining the network architecture and workaround - Updated env var names to match PR #318 (NEMOCLAW_NON_INTERACTIVE) - cgroup v2 compatibility fix for Docker daemon - Removed memory limits that caused OOM Tested: Inference requests from sandboxes now route correctly through the socat proxy to the Dynamo vLLM endpoint. Depends on: #318 (non-interactive mode), #365 (Dynamo provider) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat: NemoKlaw - NemoClaw on Kubernetes with Dynamo support Complete K8s deployment solution for NemoClaw: - nemoklaw.yaml: Pod manifest with DinD, init containers, hostPath storage - install.sh: Interactive installer with preflight checks - Rename k8s-testing -> k8s, move old files to dev/ Key learnings: - hostPath storage (/mnt/k8s-disks) avoids ephemeral storage eviction - Init containers for docker config, openshell CLI, NemoClaw build - Workspace container installs apt packages at runtime (can't share via volumes) - socat proxy bridges K8s DNS to nested k3s (host.openshell.internal) Tested successfully with Dynamo vLLM backend on EKS. Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> * fix: rename NemoKlaw to NemoClaw and document known limitations Address PR feedback: - Rename NemoKlaw -> NemoClaw (avoid confusing naming) - Rename nemoklaw.yaml -> nemoclaw-k8s.yaml - Fix hardcoded endpoint to use generic example - Remove log file from repo - Document known limitations (HTTPS proxy issue) - Update README with accurate status of what works/doesn't work Signed-off-by: rwipfelnv Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: update DYNAMO_HOST to vllm-agg-frontend The aggregated frontend service is the correct endpoint for Dynamo vLLM inference. Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> * docs: add Using NemoClaw section with CLI commands - Add workspace shell access command - Add sandbox status/logs/list commands - Add chat completion test example - Rename section from "What Can You Do?" to "Using NemoClaw" Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> * refactor(k8s): simplify deployment to use official installer - Use official NemoClaw installer (`curl | bash`) instead of git clone/build - Switch to `custom` provider from PR #648 (supersedes dynamo-specific provider) - Remove k8s/dev/ directory (no longer needed for testing) - Use emptyDir volumes for portability across clusters - Add /etc/hosts workaround for endpoint validation during onboarding - Update README with verification steps for local inference Tested end-to-end with Dynamo vLLM backend. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(k8s): resolve lint errors in yaml and markdown - Remove multi-document YAML (move namespace creation to README) - Add language specifier to fenced code block (```text) - Add blank lines before lists per markdownlint rules Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * docs(k8s): add experimental warning and clarify requirements - Add explicit experimental warning at top of README - Clarify this is for trying NemoClaw on k8s, not production - Document privileged pod and DinD requirements upfront - Add resource requirements to prerequisites Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> --------- Signed-off-by: rwipfelnv Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Carlos Villela <cvillela@nvidia.com> Co-authored-by: KJ <kejones@nvidia.com>
* feat: add Kubernetes testing infrastructure Add k8s-testing/ directory with scripts and manifests for testing NemoClaw on Kubernetes with Dynamo vLLM inference. Includes: - test-installer.sh: Public installer test (requires unattended install support) - setup.sh: Manual setup from source for development - Pod manifests for Docker-in-Docker execution Architecture: OpenShell runs k3s inside Docker, so we use DinD pods to provide the Docker daemon on Kubernetes. Signed-off-by: rwipfelnv * fix: add socat proxy for K8s DNS isolation workaround OpenShell's nested k3s cluster cannot resolve Kubernetes DNS names, so inference requests fail with 502 Bad Gateway. This adds: - socat TCP proxy setup in setup.sh to forward localhost:8000 to the K8s vLLM service endpoint - Provider configuration using host.openshell.internal:8000 which resolves to the workspace container from inside k3s - Documentation explaining the network architecture and workaround - Updated env var names to match PR NVIDIA#318 (NEMOCLAW_NON_INTERACTIVE) - cgroup v2 compatibility fix for Docker daemon - Removed memory limits that caused OOM Tested: Inference requests from sandboxes now route correctly through the socat proxy to the Dynamo vLLM endpoint. Depends on: NVIDIA#318 (non-interactive mode), NVIDIA#365 (Dynamo provider) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat: NemoKlaw - NemoClaw on Kubernetes with Dynamo support Complete K8s deployment solution for NemoClaw: - nemoklaw.yaml: Pod manifest with DinD, init containers, hostPath storage - install.sh: Interactive installer with preflight checks - Rename k8s-testing -> k8s, move old files to dev/ Key learnings: - hostPath storage (/mnt/k8s-disks) avoids ephemeral storage eviction - Init containers for docker config, openshell CLI, NemoClaw build - Workspace container installs apt packages at runtime (can't share via volumes) - socat proxy bridges K8s DNS to nested k3s (host.openshell.internal) Tested successfully with Dynamo vLLM backend on EKS. Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> * fix: rename NemoKlaw to NemoClaw and document known limitations Address PR feedback: - Rename NemoKlaw -> NemoClaw (avoid confusing naming) - Rename nemoklaw.yaml -> nemoclaw-k8s.yaml - Fix hardcoded endpoint to use generic example - Remove log file from repo - Document known limitations (HTTPS proxy issue) - Update README with accurate status of what works/doesn't work Signed-off-by: rwipfelnv Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: update DYNAMO_HOST to vllm-agg-frontend The aggregated frontend service is the correct endpoint for Dynamo vLLM inference. Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> * docs: add Using NemoClaw section with CLI commands - Add workspace shell access command - Add sandbox status/logs/list commands - Add chat completion test example - Rename section from "What Can You Do?" to "Using NemoClaw" Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> * refactor(k8s): simplify deployment to use official installer - Use official NemoClaw installer (`curl | bash`) instead of git clone/build - Switch to `custom` provider from PR NVIDIA#648 (supersedes dynamo-specific provider) - Remove k8s/dev/ directory (no longer needed for testing) - Use emptyDir volumes for portability across clusters - Add /etc/hosts workaround for endpoint validation during onboarding - Update README with verification steps for local inference Tested end-to-end with Dynamo vLLM backend. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(k8s): resolve lint errors in yaml and markdown - Remove multi-document YAML (move namespace creation to README) - Add language specifier to fenced code block (```text) - Add blank lines before lists per markdownlint rules Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * docs(k8s): add experimental warning and clarify requirements - Add explicit experimental warning at top of README - Clarify this is for trying NemoClaw on k8s, not production - Document privileged pod and DinD requirements upfront - Add resource requirements to prerequisites Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> --------- Signed-off-by: rwipfelnv Signed-off-by: Robert Wipfel <rwipfel@nvidia.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Carlos Villela <cvillela@nvidia.com> Co-authored-by: KJ <kejones@nvidia.com>
Summary
NEMOCLAW_PROVIDER=dynamooption for connecting to external vLLM endpoints (e.g., Dynamo on Kubernetes)New Environment Variables
NEMOCLAW_PROVIDER=dynamoNEMOCLAW_DYNAMO_ENDPOINThttp://dynamo-svc.default.svc.cluster.local:8000/v1)NEMOCLAW_DYNAMO_MODELExample Usage
Security
http://orhttps://protocols are acceptedshQuote()helper prevents shell injection in endpoint URLsTest plan
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes / Improvements