Add NemoClaw sandbox + LiteLLM proxy integration #2
Open
kosaku-sim wants to merge 18 commits into main
… AI execution Integrates NVIDIA NemoClaw (OpenShell) and LiteLLM into the Linux CloudFormation template to provide defense-in-depth isolation: OpenClaw runs inside a network-restricted sandbox that can only reach the LiteLLM proxy on localhost:4000, which proxies all model requests to Amazon Bedrock via IAM role. Closes #1 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace heredoc configs with python/printf generation, remove comments and blank lines from UserData to reduce from ~40KB to ~25KB base64. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move inline NemoClaw install, LiteLLM config, network policy, and sandbox gateway service code from CloudFormation UserData to external script (scripts/setup-nemoclaw-litellm.sh). UserData now downloads and executes the script when EnableSandbox=true, reducing raw size from ~21KB to ~12KB (well within the 16KB EC2 limit). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This script is downloaded and executed by UserData when EnableSandbox=true. Contains LiteLLM proxy install/config, NemoClaw sandbox setup, network policy, and systemd services. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
raw.githubusercontent.com requires %2F for branch names containing slashes (feature/nemoclaw-litellm-integration). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
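A minimal sketch of the encoding this commit describes, assuming the download URL is assembled in the shell. `OWNER` and `REPO` are placeholders rather than the real repository coordinates, and the variable names are illustrative.

```shell
# Percent-encode the slashes in a branch name before building a raw URL.
branch="feature/nemoclaw-litellm-integration"
encoded_branch=$(printf '%s' "$branch" | sed 's|/|%2F|g')

# OWNER and REPO are placeholders, not the real repository coordinates.
script_url="https://raw.githubusercontent.com/OWNER/REPO/${encoded_branch}/scripts/setup-nemoclaw-litellm.sh"
echo "$script_url"
```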
NemoClaw installer (nc.sh) requires HOME to be set. CloudFormation UserData runs as root without HOME exported, causing 'unbound variable' error with set -e. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
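A hedged sketch of the kind of guard this commit implies. Bash reports "unbound variable" when the nounset option (`set -u`, often enabled alongside `set -e`) meets an unset variable, so giving `HOME` a default before the installer runs avoids the failure. The exact line in the PR may differ.

```shell
# Default HOME when the boot environment did not export it.
# /root matches the user that CloudFormation UserData runs as.
export HOME="${HOME:-/root}"
echo "HOME=$HOME"
```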
- Use `nemoclaw onboard --non-interactive` instead of manual sandbox creation
- Register LiteLLM as OpenShell provider via `openshell provider create`
- Set inference route via `openshell inference set` (not config file)
- Use `host.openshell.internal` for sandbox-to-host LiteLLM access
- Bind LiteLLM on 0.0.0.0 so sandbox can reach it
- Add persistent port forward via systemd wrapper service
- Pre-stage OpenClaw config with allowedOrigins for SSM access
- Update fallback restart to use openshell-forward service
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…all steps
- Move OpenClaw config writing BEFORE nc.sh install (onboard copies it)
- Remove explicit `nemoclaw onboard` (nc.sh --non-interactive does it)
- Add /root/.local/bin to PATH after NemoClaw install
- Add PATH to systemd service environment
- Fix sandbox name detection
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove set -e (installer stops at [4/7] without NVIDIA_API_KEY, expected)
- Find NVM-installed node and add to PATH after install
- Wait for sandbox to become ready before configuring provider
- Add comments explaining NIM API key skip behavior
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The NemoClaw installer retry (||) re-runs onboard which recreates the gateway and destroys the existing sandbox. Run once only. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rol UI NemoClaw sandbox handles agent execution (messaging, CLI, tools). Host OpenClaw gateway serves Control UI on port 18789 (auth=none). This avoids the device identity bug in NemoClaw's OpenClaw 2026.3.11. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Change api from "openai" to "openai-completions" (valid enum value)
- Remove invalid "auth":"none" from provider config
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OpenClaw npm package alone uses ~22GB (includes Chromium, Control UI, plugins). Combined with NemoClaw Docker images and LiteLLM, 30GB is insufficient and causes disk full issues. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Skip Node.js/npm install on host when EnableSandbox=true (saves 22GB)
- Update OpenClaw inside NemoClaw sandbox to latest (fixes device identity bug)
- Patch sandbox config to auth.mode=none via overlayfs
- Set up persistent openshell forward for port 18789
- Run messaging plugin enablement inside sandbox
- Revert EBS to 30GB (sufficient without host npm install)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ture Replace the NemoClaw onboard-dependent setup with a direct openshell CLI workflow that uses the managed inference proxy (https://inference.local).
Setup script changes:
- Use openshell gateway/provider/inference/sandbox commands directly
- Route LLM requests via https://inference.local (bypasses sandbox proxy)
- Fix host.openshell.internal IP (detect Docker network gateway dynamically)
- Patch Sandbox CRD hostAliases for correct host resolution
- Deliver config via SSH tee (replaces brittle overlayfs patching)
- SSH LocalForward systemd service (replaces openshell forward)
- Add inotify sysctl limits (prevents k3s "too many open files" crash)
CloudFormation changes:
- Add inotify limits before Docker install in sandbox mode
- Fix fallback to restart openshell-forward service
- Standardize sandbox name to "openclaw"
- Deliver SOUL.md via SSH in sandbox mode
- Fix dashboard URL format (?token= -> #token=)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
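The inotify limit mentioned in this commit could be applied with a sysctl drop-in along these lines. This is a sketch only: the value 512 is the one quoted in the PR description, while `SYSCTL_DIR` and the file name `99-inotify.conf` are assumptions chosen so the snippet can run without root.

```shell
# Write an inotify sysctl drop-in. SYSCTL_DIR and the file name are
# illustrative; on a real host the directory would be /etc/sysctl.d.
SYSCTL_DIR="${SYSCTL_DIR:-$(mktemp -d)}"
conf="$SYSCTL_DIR/99-inotify.conf"
printf 'fs.inotify.max_user_instances=512\n' > "$conf"
cat "$conf"

# On the instance itself (as root) the setting would then be loaded with:
#   sysctl --system
```

Because the commit says the limits are added before the Docker install, the drop-in needs to be in place before k3s starts inside the gateway.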
NemoClaw + LiteLLM setup takes longer than 20 minutes due to:
- apt-get upgrade
- LiteLLM pip install
- NemoClaw installer + Docker image pulls
- OpenShell gateway + sandbox creation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three issues found during end-to-end CloudFormation deploy test:
1. openshell sandbox create --no-tty hangs on SSH session
   → Run in background, wait for Ready status, then kill
2. Port 18789 conflict: docker-proxy (openshell gateway) occupies it
   → SSH LocalForward binds on 18790 instead
   → SSM port forward targets 18790, maps to local 18789
3. openshell-forward service (User=ubuntu) can't find gateway metadata
   → Run as root with HOME=/root and PATH including nvm node
   → SSH config placed in /root/.ssh/ (not ubuntu's)
Also fix all SSH commands to run as root (consistent with setup context).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
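The "run in background, wait for Ready, then kill" pattern from issue 1 can be sketched as follows. `wait_until` and `is_ready` are hypothetical helpers, not part of the openshell CLI, and `sleep 300` stands in for the hanging `openshell sandbox create --no-tty` call. The real readiness check would inspect the sandbox status instead of a marker file.

```shell
# Poll a readiness check until it succeeds or a timeout (in seconds) expires.
# wait_until is a hypothetical helper, not part of the openshell CLI.
wait_until() {
  timeout_s=$1; shift
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Stand-in for the long-running create command that never returns.
sleep 300 &
create_pid=$!

# Stand-in readiness check: the marker file plays the role of "Ready".
marker=$(mktemp)
is_ready() { [ -f "$marker" ]; }

if wait_until 5 is_ready; then
  echo "ready"
fi

# The background process is no longer needed once the check has passed.
kill "$create_pid" 2>/dev/null
rm -f "$marker"
```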
Overview
Integrates the NVIDIA NemoClaw (OpenShell) sandbox and the LiteLLM proxy into the Linux CloudFormation template. The OpenClaw agent runs inside a kernel-level isolated sandbox and reaches Amazon Bedrock through OpenShell's managed inference proxy.
OpenClaw (22GB) is not installed on the host. Only the copy inside the sandbox image is used, which keeps disk usage to about 5GB.
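Comparison of EnableSandbox=false (previous behavior) and EnableSandbox=true (this PR)
EnableSandbox=true: only https://inference.local is allowed; only /sandbox and /tmp are writable.
The existing Docker sandbox is an OpenClaw application feature that runs the agent's code execution inside a Docker container. The NemoClaw sandbox is different: it confines OpenClaw itself to a kernel-level isolated environment, so Docker-in-Docker is not required.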
Architecture
Security model (three layers of Linux kernel isolation)
- … (only /sandbox and /tmp are writable)
- … only https://inference.local is allowed
Changes
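scripts/setup-nemoclaw-litellm.sh: complete rewrite
A 10-step automated setup script:
- Fix the host.openshell.internal IP (detect the Docker network gateway IP dynamically)
- … (uses https://inference.local/v1 as baseUrl) + gateway startup
clawdbot-bedrock.yaml: CloudFormation template
- The EnableSandbox parameter (default: true) controls NemoClaw + LiteLLM
- Fallback fixed to restart the openshell-forward service
- Sandbox name standardized to openclaw
- Dashboard URL format fixed from ?token= to #token=
Documentation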
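- SECURITY.md: security documentation for the NemoClaw + LiteLLM architecture
- TROUBLESHOOTING.md: added a NemoClaw/LiteLLM troubleshooting section
- README.md: updated the architecture diagram and parameter table
- DEPLOYMENT.md: added NemoClaw/LiteLLM verification steps
Resolved technical issues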
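- … → switched to routing via https://inference.local (managed inference proxy)
- Insufficient permissions on the .openclaw/identity directory → …
- ?token= was lost on redirect → changed to the #token= (fragment) form
- k3s "too many open files" crash → set fs.inotify.max_user_instances=512
- … (provider, auth.mode, etc. are invalid) → use … (gateway / models.providers / agents.defaults)
Test plan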
- Confirm LiteLLM → Bedrock connectivity via https://inference.local
- Deploy a new CloudFormation stack with EnableSandbox=true (end to end)
- The existing direct Bedrock connection flow works correctly with EnableSandbox=false
Closes #1
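The dashboard URL change listed under the resolved issues can be illustrated with a small sketch. A URL fragment is kept by the browser and never sent to the server, which is consistent with the note that the query-string token was lost on redirect. The host and token value below are placeholders; port 18789 is the Control UI port named in the commits.

```shell
# Move the token from the query string to the URL fragment.
# The host and token value are placeholders for illustration.
old_url="http://localhost:18789/?token=EXAMPLE_TOKEN"
new_url=$(printf '%s' "$old_url" | sed 's/?token=/#token=/')
echo "$new_url"
```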
🤖 Generated with Claude Code