Description
Problem Statement
LLMs do not know how to use the nemoclaw CLI by default. To enable agents to operate NemoClaw effectively — from basic provider management and sandbox creation through complex iterative policy refinement and BYOC workflows — we need a dedicated agent skill that teaches the CLI's command structure, guides multi-step workflows, and provides a fallback self-teaching mechanism via --help.
Additionally, the existing generate-sandbox-policy skill has drifted from the actual policy schema (critical: allowed_routing_hints vs allowed_routes field name mismatch that would cause agent-generated policies to fail parsing). Since the new CLI skill will delegate policy content authoring to generate-sandbox-policy, that skill must be fixed as part of this work.
Technical Context
The nemoclaw CLI (crates/navigator-cli) is a comprehensive tool with 30+ commands across 7 command groups (cluster, sandbox, provider, inference, policy, image, forward). The codebase already has 12 agent skills in .agents/skills/ following a well-established pattern: YAML frontmatter, numbered workflow steps, command reference tables, and concrete example scenarios. The new skill needs to cover basic CRUD operations, but more importantly, it must guide agents through multi-step workflows that involve coordinating across command groups (e.g., create providers → create sandbox with policy → monitor logs → pull policy → edit → push → verify reload).
The generate-sandbox-policy skill is a companion to the new CLI skill — the CLI skill handles command orchestration while generate-sandbox-policy handles policy content authoring. The CLI skill should cross-reference it at handoff points rather than duplicating policy semantics.
Affected Components
| Component | Key Files | Role |
|---|---|---|
| New CLI skill | .agents/skills/nemoclaw-cli/SKILL.md (to create) | CLI workflow guidance for agents |
| Policy generation skill | .agents/skills/generate-sandbox-policy/SKILL.md, examples.md | Companion skill; needs schema drift fixes |
| Architecture docs | architecture/security-policy.md | Has the same allowed_routing_hints error; fix alongside |
| Navigator CLI | crates/navigator-cli/src/main.rs, run.rs, ssh.rs | CLI command definitions (reference for skill content) |
| Policy system | crates/navigator-policy/src/lib.rs, dev-sandbox-policy.yaml | Authoritative schema (source of truth for fixes) |
Proposed Approach
Part 1: Fix generate-sandbox-policy schema drift
- Fix `allowed_routing_hints` → `allowed_routes` in SKILL.md, examples.md, and `architecture/security-policy.md`
- Reconcile default `read_only` paths (decide whether the skill should match `dev-sandbox-policy.yaml` or keep tighter defaults with a note)
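The rename in Part 1 could be checked mechanically before and after the fix. The following is a minimal sketch, assuming the stale name appears literally in the affected files; `find_stale_field` and `rename_stale_field` are hypothetical helpers, not part of the repository.

```python
import re

# Field names taken from this issue; everything else here is illustrative.
STALE_FIELD = "allowed_routing_hints"
CURRENT_FIELD = "allowed_routes"

# Word boundaries keep the match from firing inside a longer identifier.
_STALE_PATTERN = re.compile(rf"\b{re.escape(STALE_FIELD)}\b")


def find_stale_field(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that still mention the stale field."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if _STALE_PATTERN.search(line)
    ]


def rename_stale_field(text: str) -> str:
    """Replace every occurrence of the stale field name with the current one."""
    return _STALE_PATTERN.sub(CURRENT_FIELD, text)
```

Running `find_stale_field` over the contents of SKILL.md, examples.md, and architecture/security-policy.md after the fix should return an empty list for each file.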
Part 2: Create nemoclaw-cli agent skill
- Follow the multi-step workflow archetype (like `build-from-issue` and `generate-sandbox-policy`)
- Organize into tiered workflows: basic operations → intermediate → advanced multi-step
- Cross-reference `generate-sandbox-policy` at policy content handoff points
- Include `--help` fallback mechanism for self-teaching
- Consider supplementary reference file for full command tree
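One possible shape for the `--help` fallback: when an agent hits a command the skill does not document, it asks for help at the most specific level first and then walks up to the parent groups. This is a sketch only; `help_fallbacks` is a hypothetical helper, and it assumes every command level accepts `--help`.

```python
def help_fallbacks(command_path: list[str], binary: str = "nemoclaw") -> list[list[str]]:
    """Build --help invocations from most specific to top-level.

    `command_path` is the subcommand chain the agent wants to learn about,
    e.g. ["sandbox", "create"]. Each returned item is an argv list.
    """
    return [
        [binary, *command_path[:depth], "--help"]
        for depth in range(len(command_path), -1, -1)
    ]
```

Each argv list can be passed as-is to `subprocess.run(argv, capture_output=True, text=True)`; the agent stops at the first invocation that exits successfully.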
Key workflows to document:
- Provider management (CRUD, auto-creation from local credentials)
- Sandbox lifecycle (create, connect, sync, logs, delete)
- Policy iteration loop (logs → pull policy → delegate to generate-sandbox-policy → push → verify reload)
- BYOC pipeline (build image → push → create sandbox → port forward)
- Agent-assisted sandbox session (parallel monitoring + policy refinement)
Scope Assessment
- Complexity: Medium — substantial content but follows established patterns; mostly documentation with a critical bug fix
- Confidence: High — 12 existing skills provide a proven template, CLI structure is well-defined, schema drift is clearly identified
- Estimated files to change: 4-6 (new SKILL.md + optional reference.md, fix SKILL.md + examples.md in generate-sandbox-policy, fix security-policy.md)
- Issue type: `feat`
Risks & Open Questions
- Skill length vs. context window: A comprehensive skill covering all workflows could be 600-800+ lines. Consider splitting into core SKILL.md with supplementary reference files.
- Trigger keyword collision: Need keywords that activate for CLI usage without colliding with `generate-sandbox-policy` (which triggers on policy generation/authoring).
- Provider auto-creation shortcut: Should the skill teach `ncl sandbox create -- claude` (auto-creates providers) or always use explicit provider management? Probably both, with the shortcut as the quick-start path.
- Dynamic vs static policy fields: Agents must understand which policy fields can be hot-reloaded (network_policies, inference) and which require sandbox recreation (filesystem_policy, landlock, process). Getting this wrong leads to confusing errors.
- `read_only` defaults decision: Should the `generate-sandbox-policy` skill match the actual reference file (`/proc` + `/var/log`) or keep its tighter defaults (`/proc/self`, no `/var/log`) with an explicit note?
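The dynamic vs static distinction could be expressed to agents as a simple decision rule. The sketch below assumes the five field names listed above are top-level keys of a parsed policy document; that layout and the `classify_change` helper are assumptions for illustration, not the confirmed schema.

```python
# Field groupings as stated in this issue; treating them as top-level keys
# of the parsed policy is an assumption.
HOT_RELOADABLE = {"network_policies", "inference"}
REQUIRES_RECREATION = {"filesystem_policy", "landlock", "process"}


def changed_fields(old: dict, new: dict) -> set[str]:
    """Return the top-level keys whose values differ between two policies."""
    return {key for key in set(old) | set(new) if old.get(key) != new.get(key)}


def classify_change(old: dict, new: dict) -> str:
    """Decide how a policy edit must be applied.

    Returns "no-op", "hot-reload", "recreate", or "unknown" (a changed field
    that is in neither group, which the agent should not guess about).
    """
    changed = changed_fields(old, new)
    if not changed:
        return "no-op"
    if changed & REQUIRES_RECREATION:
        return "recreate"
    if changed <= HOT_RELOADABLE:
        return "hot-reload"
    return "unknown"
```

An agent would run a check like this before pushing an edited policy, so that a static-field change is surfaced as "recreate the sandbox" up front instead of as a confusing error after the push.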
Test Considerations
- Manual validation: have an agent use both skills to execute each documented workflow end-to-end
- Verify the `--help` fallback works for commands deliberately omitted from the skill
- Test that trigger keywords activate the correct skill and don't conflict
- Validate the policy iteration loop end-to-end: create sandbox → observe logs → pull policy → modify (via generate-sandbox-policy) → push → verify reload
- Verify the `allowed_routes` fix by generating a policy with inference config and confirming it parses
Created by spike investigation. Use build-from-issue to plan and implement.