
feat(sandbox): support policy discovery and restrictive defaults on sandbox containers #84

Merged
drew merged 8 commits into main from 82-sandbox-policy-discovery/drew on Mar 5, 2026

drew commented Mar 4, 2026

Closes #82

Summary

  • Allow sandboxes to operate without a pre-configured policy by supporting three resolution modes: policy from gateway (unchanged), policy from disk (/etc/navigator/policy.yaml), or a hardcoded restrictive default that blocks all network access
  • When the sandbox discovers policy locally, it syncs to the gateway via UpdateSandboxPolicy and re-reads the canonical version, keeping the gateway as the authoritative source
  • Pass NEMOCLAW_SANDBOX_NAME env var to sandbox pods so the sandbox can identify itself for the UpdateSandboxPolicy RPC

Changes

navigator-policy (crates/navigator-policy/src/lib.rs)

  • Added restrictive_default_policy() - returns a hardcoded policy that configures filesystem access, Landlock, and process identity, but defines no network policies (all network access is blocked) and no inference
  • Added CONTAINER_POLICY_PATH constant (/etc/navigator/policy.yaml) - well-known path for container-shipped policies
  • Added 12 unit tests
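
As a rough illustration of the two additions above, the sketch below shows what a restrictive default could look like. The `SandboxPolicy` struct, its field names, and the concrete paths and user names are hypothetical stand-ins, not the real navigator-policy types. Only `restrictive_default_policy()`, `CONTAINER_POLICY_PATH`, and its value come from this PR.

```rust
/// Well-known path for container-shipped policies (value taken from the PR description).
pub const CONTAINER_POLICY_PATH: &str = "/etc/navigator/policy.yaml";

/// Hypothetical, simplified policy shape; the real type lives in navigator-policy.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct SandboxPolicy {
    pub read_only_paths: Vec<String>,
    pub read_write_paths: Vec<String>,
    pub landlock_enabled: bool,
    pub run_as_user: String,
    pub run_as_group: String,
    /// An empty list means no outbound connection is allowed.
    pub network_policies: Vec<String>,
    /// `None` means no inference routing is configured.
    pub inference: Option<String>,
}

/// Sketch of a restrictive default: filesystem, Landlock, and process identity
/// are set, while network and inference are left empty.
pub fn restrictive_default_policy() -> SandboxPolicy {
    SandboxPolicy {
        // The paths and identity below are illustrative assumptions.
        read_only_paths: vec!["/usr".into(), "/lib".into(), "/etc".into()],
        read_write_paths: vec!["/tmp".into()],
        landlock_enabled: true,
        run_as_user: "sandbox".into(),
        run_as_group: "sandbox".into(),
        network_policies: Vec::new(),
        inference: None,
    }
}
```

Keeping the default in code rather than in a file means a sandbox still starts in a known, locked-down state when the image ships no policy at all.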

navigator-server (crates/navigator-server/src/grpc.rs)

  • create_sandbox: removed spec.policy.is_none() rejection - policy is now optional
  • get_sandbox_policy: returns policy: None, version: 0 when no policy is configured (instead of erroring)
  • update_sandbox_policy: when spec.policy is None (no baseline), skips static field/network mode validation and backfills spec.policy on the stored sandbox so future updates can validate against it
  • Added 6 unit tests
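
The gateway-side behavior described above can be modeled in memory as follows. This is a sketch under assumptions: `Policy`, `StoredSandbox`, the `network_mode` check, and the version increment are hypothetical simplifications of the real gRPC handlers, which this PR does not show.

```rust
/// Hypothetical stand-in for a sandbox policy.
#[derive(Debug, Clone, PartialEq)]
pub struct Policy {
    pub network_mode: String,
    pub rules: Vec<String>,
}

/// Hypothetical stand-in for the stored sandbox record.
#[derive(Debug, Default)]
pub struct StoredSandbox {
    /// Baseline policy from create time; optional after this change.
    pub spec_policy: Option<Policy>,
    pub current: Option<Policy>,
    pub version: u32,
}

impl StoredSandbox {
    /// Mirrors get_sandbox_policy: (None, 0) when no policy is configured.
    pub fn get_policy(&self) -> (Option<Policy>, u32) {
        (self.current.clone(), self.version)
    }

    /// Mirrors update_sandbox_policy: validate against the baseline if one
    /// exists, otherwise backfill the baseline from the incoming policy.
    pub fn update_policy(&mut self, new: Policy) -> Result<u32, String> {
        let baseline_mode = self.spec_policy.as_ref().map(|p| p.network_mode.clone());
        match baseline_mode {
            // Assumed example of a static-field check; the real validation differs.
            Some(mode) if mode != new.network_mode => {
                return Err("network mode cannot change after creation".to_string());
            }
            Some(_) => {}
            // No baseline: skip validation and backfill it for future updates.
            None => self.spec_policy = Some(new.clone()),
        }
        self.current = Some(new);
        self.version += 1;
        Ok(self.version)
    }
}
```

The backfill matters because the first synced policy becomes the baseline, so later updates are validated exactly as if the policy had been supplied at create time.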

navigator-server (crates/navigator-server/src/sandbox/mod.rs)

  • Threaded sandbox_name through sandbox_to_k8s_spec → sandbox_template_to_k8s → inject_pod_template → update_container_env → build_env_list → apply_required_env
  • Added NEMOCLAW_SANDBOX_NAME env var to the pod spec
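
A minimal sketch of the last two steps of that chain, assuming a plain name/value `EnvVar` struct in place of the Kubernetes type and assuming that required variables overwrite user-supplied ones. The function names come from the list above, but their signatures here are guesses. Note that a later commit in this PR renames the variable to NEMOCLAW_SANDBOX.

```rust
/// Hypothetical env entry; the real code builds Kubernetes EnvVar objects.
#[derive(Debug, Clone, PartialEq)]
pub struct EnvVar {
    pub name: String,
    pub value: String,
}

/// Sets a variable, overwriting any existing entry with the same name.
fn upsert(env: &mut Vec<EnvVar>, name: &str, value: &str) {
    if let Some(existing) = env.iter_mut().find(|e| e.name == name) {
        existing.value = value.to_string();
        return;
    }
    env.push(EnvVar { name: name.to_string(), value: value.to_string() });
}

/// Adds the variables every sandbox pod must carry (assumed signature).
pub fn apply_required_env(env: &mut Vec<EnvVar>, sandbox_name: &str) {
    upsert(env, "NEMOCLAW_SANDBOX_NAME", sandbox_name);
}

/// Builds the final env list from user-supplied entries (assumed signature).
pub fn build_env_list(user_env: &[EnvVar], sandbox_name: &str) -> Vec<EnvVar> {
    let mut env = user_env.to_vec();
    apply_required_env(&mut env, sandbox_name);
    env
}
```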

navigator-sandbox

  • Cargo.toml: Added navigator-policy dependency
  • main.rs: Added --sandbox-name / NEMOCLAW_SANDBOX_NAME CLI arg
  • grpc_client.rs: Changed fetch_policy to return Option<ProtoSandboxPolicy> (None = no policy configured); added sync_policy() method
  • lib.rs: Added sandbox_name param to run_sandbox and load_policy; new discover_policy_from_disk_or_default(), discover_policy_from_path(), sync_discovered_policy() functions; 4 new tests
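
The disk-discovery helper can be sketched with the standard library alone. This version is an assumption about its shape: it returns the raw YAML text rather than a parsed policy, treats a missing file as "no policy" (`Ok(None)`), and surfaces every other I/O error to the caller. The real discover_policy_from_path() may differ on all three points.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Reads a policy file if it exists.
/// Returns Ok(None) when the file is absent so the caller can fall back
/// to the restrictive default; other I/O errors are propagated.
pub fn discover_policy_from_path(path: &Path) -> io::Result<Option<String>> {
    match fs::read_to_string(path) {
        Ok(text) => Ok(Some(text)),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(err) => Err(err),
    }
}
```

Distinguishing "not found" from other failures keeps a permissions problem or a corrupt mount from silently downgrading the sandbox to the default policy.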

Policy Resolution Flow

gRPC mode:
  1. Call GetSandboxPolicy
  2. If policy returned → use it (existing behavior)
  3. If policy is None (version 0):
     a. Try to read /etc/navigator/policy.yaml
     b. If found → parse YAML, sync to gateway via UpdateSandboxPolicy
     c. If not found → use restrictive_default_policy(), sync to gateway
     d. Re-fetch from gateway to get canonical version/hash
     e. Proceed with the fetched policy
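
The flow above can be sketched as follows. The `Gateway` trait, `Versioned` struct, and `FakeGateway` are hypothetical and exist only to make the steps executable. The real code talks to the gateway over gRPC and parses YAML, both of which are omitted here.

```rust
/// Hypothetical policy plus the version the gateway assigned to it.
#[derive(Debug, Clone, PartialEq)]
pub struct Versioned {
    pub body: String,
    pub version: u32,
}

/// Hypothetical abstraction over the two RPCs named in the flow.
pub trait Gateway {
    fn get_sandbox_policy(&mut self) -> Option<Versioned>;
    fn update_sandbox_policy(&mut self, body: &str);
}

pub fn load_policy<G: Gateway>(
    gateway: &mut G,
    disk_policy: Option<String>,
    restrictive_default: &str,
) -> Result<Versioned, String> {
    // Steps 1-2: a gateway-provided policy always wins.
    if let Some(policy) = gateway.get_sandbox_policy() {
        return Ok(policy);
    }
    // Steps 3a-3c: prefer the on-disk policy, else the restrictive default,
    // and sync whichever was chosen to the gateway.
    let discovered = disk_policy.unwrap_or_else(|| restrictive_default.to_string());
    gateway.update_sandbox_policy(&discovered);
    // Steps 3d-3e: re-fetch so the sandbox runs the canonical version.
    gateway
        .get_sandbox_policy()
        .ok_or_else(|| "gateway returned no policy after sync".to_string())
}

/// In-memory gateway used only to exercise the flow.
#[derive(Default)]
pub struct FakeGateway {
    stored: Option<Versioned>,
}

impl Gateway for FakeGateway {
    fn get_sandbox_policy(&mut self) -> Option<Versioned> {
        self.stored.clone()
    }
    fn update_sandbox_policy(&mut self, body: &str) {
        let version = self.stored.as_ref().map_or(0, |p| p.version) + 1;
        self.stored = Some(Versioned { body: body.to_string(), version });
    }
}
```

Re-fetching after the sync, rather than using the local copy directly, is what keeps the gateway authoritative: the sandbox only ever enforces a policy the gateway has stored and versioned.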

Test Plan

  • 22 new unit tests across 3 crates (12 policy, 4 sandbox, 6 server)
  • All 228+ existing tests continue to pass
  • mise run pre-commit passes (format, lint, tests, license checks)

drew self-assigned this Mar 4, 2026
drew requested a review from johntmyers March 4, 2026 05:00
drew added the test:e2e label (Requires end-to-end coverage) Mar 4, 2026
drew added 3 commits March 4, 2026 14:37
…andbox containers

Allow sandboxes to operate without a pre-configured policy by supporting
three resolution modes:

1. Policy provided at create time - sandbox loads from gateway (unchanged)
2. Policy null, found on disk at /etc/navigator/policy.yaml - sandbox reads
   from disk, syncs to gateway, reads back canonical version
3. Policy null, no disk policy - sandbox uses hardcoded restrictive default
   (all network blocked), syncs to gateway

Key changes:
- Add restrictive_default_policy() and CONTAINER_POLICY_PATH to navigator-policy
- Make spec.policy optional in gateway create_sandbox
- Modify UpdateSandboxPolicy to handle no-baseline case (backfill spec.policy)
- Pass NEMOCLAW_SANDBOX_NAME env var to sandbox containers
- Add sync_policy() gRPC client method for sandbox-to-gateway policy sync
- Add disk discovery fallback in sandbox load_policy()

Closes #82
Consolidate the sync + re-fetch calls during policy discovery into a
single TLS channel, reducing startup from 3 separate connections to 2.
…ings, harden scripts

- Add navigator-policy crate to Dockerfile.base build cache layer
- Add dev-sandbox-policy.yaml to Dockerfile.base COPY step
- Use exact container name matching with health checks in cluster-deploy-fast
- Add navigator-policy and dev-sandbox-policy.yaml to sandbox fingerprint
- Implement fail-fast for parallel image builds in cluster-deploy-fast
- Collapse nested if-let in kubeconfig rewrite (clippy collapsible_if)
- Backtick-quote NemoClaw in doc comment (clippy doc_markdown)
drew force-pushed the 82-sandbox-policy-discovery/drew branch from 2fc850c to 1ead5eb Compare March 4, 2026 22:37
drew added 5 commits March 4, 2026 14:49
Move dev-sandbox-policy.rego into crates/navigator-policy/ (the
canonical policy crate) and dev-sandbox-policy.yaml into
deploy/docker/sandbox/ where it is baked into the container image
at /etc/navigator/policy.yaml.

This eliminates loose config files from the repo root and co-locates
the rego rules with the policy crate that owns them. The default
policy YAML now ships inside the sandbox container so sandboxes
without an explicit gateway-provided policy can discover it on disk.

Updated all include_str! paths, Dockerfiles, build-script fingerprints,
architecture docs, and agent skill references.
…name to sandbox

Remove the compile-time embed of dev-sandbox-policy.yaml from the
navigator-policy crate so the CLI and TUI no longer implicitly fall
back to the dev policy. Users must now explicitly pass --policy or set
NEMOCLAW_SANDBOX_POLICY; otherwise no policy is sent and the server /
sandbox applies its own default (disk discovery or restrictive default).

Also rename the sandbox_name parameter to sandbox throughout
navigator-sandbox and update the env var from NEMOCLAW_SANDBOX_NAME
to NEMOCLAW_SANDBOX.
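
The client-side rule described in this commit (explicit policy or none) might look like the sketch below. The function name and the precedence of the flag over the environment variable are assumptions. Only --policy and NEMOCLAW_SANDBOX_POLICY are named in the commit message.

```rust
/// Picks the policy path the CLI should send, if any.
/// `cli_flag` stands for the --policy value and `env_value` for
/// NEMOCLAW_SANDBOX_POLICY; flag-over-env precedence is assumed.
/// Returns None when neither is set, in which case no policy is sent
/// and the sandbox falls back to disk discovery or the restrictive default.
pub fn resolve_policy_path(cli_flag: Option<&str>, env_value: Option<&str>) -> Option<String> {
    cli_flag.or(env_value).map(str::to_string)
}
```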
Fix sandbox_exec TTY detection so interactive commands like claude work
when launched through mise or other non-terminal wrappers. The old code
relied solely on stdout.is_terminal() which returns false in many valid
interactive contexts. Add explicit --tty/--no-tty overrides and default
sandbox.sh to --tty since it always intends interactive use.

Also fix env var mismatch (NEMOCLAW_SANDBOX_NAME -> NEMOCLAW_SANDBOX)
that caused sandbox pods to crash on startup, improve deploy state
tracking with container ID detection, simplify image eviction logic,
and add tracing to the SSH tunnel handshake path.
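
The TTY fix in this commit can be illustrated with a small decision function. The `TtyOverride` enum and the automatic fallback (allocate a TTY if either stdin or stdout is a terminal) are assumptions. The commit message states only that stdout.is_terminal() alone was insufficient and that --tty/--no-tty overrides were added.

```rust
use std::io::IsTerminal;

/// Hypothetical representation of the --tty / --no-tty flags.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TtyOverride {
    Force,   // --tty
    Disable, // --no-tty
    Auto,    // neither flag given
}

/// Pure decision logic, kept separate from I/O so it is easy to test.
pub fn should_allocate_tty(choice: TtyOverride, stdin_is_tty: bool, stdout_is_tty: bool) -> bool {
    match choice {
        TtyOverride::Force => true,
        TtyOverride::Disable => false,
        // Assumed heuristic: either stream being a terminal counts as interactive.
        TtyOverride::Auto => stdin_is_tty || stdout_is_tty,
    }
}

/// Applies the decision to the current process's standard streams.
pub fn detect_tty(choice: TtyOverride) -> bool {
    should_allocate_tty(
        choice,
        std::io::stdin().is_terminal(),
        std::io::stdout().is_terminal(),
    )
}
```

An explicit override is useful because wrappers such as mise can pipe stdout even when a person is at the keyboard, which no automatic check can reliably detect.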
drew merged commit b62c42f into main Mar 5, 2026
10 checks passed
drew deleted the 82-sandbox-policy-discovery/drew branch March 5, 2026 01:17
drew added a commit that referenced this pull request Mar 16, 2026
