fix(sandbox): allow DNS resolution and direct TCP from sandbox netns #675

Closed
kosaku-sim wants to merge 4 commits into NVIDIA:main from simount:fix/sandbox-dns-udp-accept

@kosaku-sim

Summary

Sandbox processes that bypass HTTP_PROXY (e.g., Node.js ws library used by @slack/socket-mode) fail with getaddrinfo EAI_AGAIN because:

  1. The sandbox netns iptables rules unconditionally REJECT all UDP traffic, including DNS (port 53) to the cluster CoreDNS
  2. Even after DNS resolves, direct TCP 443 connections are also REJECT'd
  3. The sandbox netns routes all traffic via the veth host side (10.200.0.1), but no IP forwarding or NAT is configured for non-proxy traffic

This PR fixes all three layers.

Changes

1. UDP DNS ACCEPT rule (netns.rs)

Insert an iptables ACCEPT rule for UDP port 53 to the cluster nameserver (read from /etc/resolv.conf or OPENSHELL_DNS_SERVER env var) before the blanket UDP REJECT rule.
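
A minimal sketch of how the nameserver selection and the rule arguments could look, not the code from this PR. The function names are hypothetical, and the `OUTPUT` chain and insert position 1 are assumptions; the PR only states that the ACCEPT must precede the blanket UDP REJECT.

```rust
use std::net::Ipv4Addr;

/// Return the first IPv4 `nameserver` entry found in resolv.conf-style text.
fn first_nameserver(resolv_conf: &str) -> Option<Ipv4Addr> {
    resolv_conf.lines().find_map(|line| {
        let mut parts = line.split_whitespace();
        match (parts.next(), parts.next()) {
            (Some("nameserver"), Some(addr)) => addr.parse::<Ipv4Addr>().ok(),
            _ => None,
        }
    })
}

/// Pick the DNS server: a valid override (e.g. the value of
/// OPENSHELL_DNS_SERVER) wins, otherwise fall back to resolv.conf.
fn pick_dns_server(env_override: Option<&str>, resolv_conf: &str) -> Option<Ipv4Addr> {
    env_override
        .and_then(|v| v.trim().parse::<Ipv4Addr>().ok())
        .or_else(|| first_nameserver(resolv_conf))
}

/// iptables arguments that insert a /32-scoped UDP 53 ACCEPT at the top of
/// the chain, so it is evaluated before any later REJECT rule.
/// Assumption: the sandbox-side rules live in the OUTPUT chain.
fn dns_accept_args(dns: Ipv4Addr) -> Vec<String> {
    vec![
        "-I".into(), "OUTPUT".into(), "1".into(),
        "-p".into(), "udp".into(),
        "-d".into(), format!("{dns}/32"),
        "--dport".into(), "53".into(),
        "-j".into(), "ACCEPT".into(),
    ]
}
```

In practice the override would come from `std::env::var("OPENSHELL_DNS_SERVER").ok()` and the arguments would be passed to `iptables` inside the sandbox netns. Returning `None` when neither source yields an address lets the caller skip the rule rather than open a wider hole.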

2. IP forwarding + MASQUERADE (netns.rs)

Enable IP forwarding on the host-side veth interface and add MASQUERADE rules so that:

  • DNS packets from the sandbox (src 10.200.0.2) appear to come from the pod IP when reaching CoreDNS
  • Return packets are correctly routed back through the veth
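
The host-side setup described above might be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the function names, the interface name passed in, and the exact chains (`POSTROUTING`, `FORWARD`) and conntrack match are choices made for the example.

```rust
use std::net::Ipv4Addr;

/// procfs path that toggles IPv4 forwarding for a single interface.
fn forwarding_sysctl_path(iface: &str) -> String {
    format!("/proc/sys/net/ipv4/conf/{iface}/forwarding")
}

/// Host-side iptables invocations for DNS traffic leaving the sandbox.
/// Each inner Vec is one argument list for the `iptables` binary.
fn dns_nat_rules(sandbox_ip: Ipv4Addr, dns: Ipv4Addr, host_veth: &str) -> Vec<Vec<String>> {
    let src = format!("{sandbox_ip}/32");
    let dst = format!("{dns}/32");
    let to_args = |s: &str| s.split_whitespace().map(String::from).collect::<Vec<_>>();
    vec![
        // Rewrite the source address so CoreDNS sees the pod IP.
        to_args(&format!(
            "-t nat -A POSTROUTING -s {src} -d {dst} -p udp --dport 53 -j MASQUERADE"
        )),
        // Let the query cross from the veth into the pod network.
        to_args(&format!(
            "-A FORWARD -i {host_veth} -s {src} -d {dst} -p udp --dport 53 -j ACCEPT"
        )),
        // Let replies for tracked flows return through the veth.
        to_args(&format!(
            "-A FORWARD -o {host_veth} -d {src} -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT"
        )),
    ]
}
```

Writing `1` to the path from `forwarding_sysctl_path` enables forwarding for that interface only. Scoping every rule to the sandbox `/32` and the DNS `/32` keeps the MASQUERADE from applying to other traffic.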

3. Direct TCP 443 via OPENSHELL_DIRECT_TCP_HOSTS (netns.rs)

New environment variable OPENSHELL_DIRECT_TCP_HOSTS (comma-separated hostnames). At sandbox netns setup:

  • Resolve each hostname via the system DNS (runs in the pod netns, not sandbox)
  • Add iptables ACCEPT for TCP 443 to each resolved IP (sandbox side)
  • Add MASQUERADE + FORWARD rules (host side) for return routing

This provides direct connectivity for proxy-unaware libraries while keeping the default deny-all policy intact for unlisted destinations.
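
As a rough sketch of the parse, resolve, and allow steps, assuming hypothetical function names and an `OUTPUT`-chain rule on the sandbox side (neither is confirmed by the PR text):

```rust
use std::net::{IpAddr, ToSocketAddrs};

/// Split the comma-separated variable value, dropping blanks and whitespace.
fn parse_direct_hosts(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|h| !h.is_empty())
        .map(String::from)
        .collect()
}

/// Resolve a hostname to its IPv4 addresses with the system resolver.
/// Must run in the pod netns, where DNS already works.
fn resolve_ipv4(host: &str) -> std::io::Result<Vec<IpAddr>> {
    let mut ips: Vec<IpAddr> = (host, 443)
        .to_socket_addrs()?
        .map(|sa| sa.ip())
        .filter(IpAddr::is_ipv4)
        .collect();
    ips.sort();
    ips.dedup();
    Ok(ips)
}

/// Sandbox-side ACCEPT for TCP 443 to one resolved address.
/// Assumption: inserted at the top of OUTPUT, ahead of the REJECT rules.
fn tcp_accept_args(ip: IpAddr) -> Vec<String> {
    format!("-I OUTPUT 1 -p tcp -d {ip}/32 --dport 443 -j ACCEPT")
        .split_whitespace()
        .map(String::from)
        .collect()
}
```

Because the addresses are resolved once at netns setup, a host whose DNS records change later would need the sandbox to be set up again before the new addresses are allowed.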

4. Landlock warning log level (landlock.rs)

Downgrade the "Landlock filesystem sandbox is UNAVAILABLE" message from warn! to debug!. The ANSI-colored warning is emitted on every kubectl exec invocation. When SSH or kubectl exec pipes stdin to sandbox commands, the \x1b escape codes end up in the written file contents and corrupt them (e.g., JSON config files become unparseable). The message remains available with RUST_LOG=debug.

Security considerations

  • DNS ACCEPT is scoped to a single IP (/32) on UDP port 53 only
  • TCP 443 ACCEPT is scoped to resolved IPs of explicitly listed hostnames only
  • MASQUERADE is applied only to DNS and listed TCP endpoints, not broadly
  • All other UDP and TCP bypass traffic remains REJECT'd
  • An attacker inside the sandbox could theoretically perform DNS tunneling to the CoreDNS IP; this is an accepted trade-off vs broken WebSocket connectivity

Testing

Tested on NemoClaw (OpenClaw on AWS) with:

  • Instance: c7g.large (Graviton ARM64), Ubuntu 24.04
  • OpenShell: latest (k3s in Docker)
  • CoreDNS: 10.43.0.10 (default k3s)
  • Workload: Slack Socket Mode via @slack/socket-mode (Node.js ws library)

OPENSHELL_DIRECT_TCP_HOSTS=wss-primary.slack.com,wss-backup.slack.com,api.slack.com

# Before patch:
dns.resolve4("wss-primary.slack.com") → ECONNREFUSED
dns.lookup("wss-primary.slack.com")   → EAI_AGAIN

# After patch:
dns.resolve4("wss-primary.slack.com") → OK: ['35.74.10.178', ...]
dns.lookup("api.slack.com")           → OK: 52.192.46.121
net.connect(443, '35.74.10.178')      → TCP OK
[slack] socket mode connected         → ✅
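
The same resolve-then-connect check could be reproduced from Rust inside the sandbox. The helper below is a hypothetical probe written for illustration, not part of the PR or its test setup:

```rust
use std::io;
use std::net::{SocketAddr, TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Resolve `host`, then connect directly to each address in turn,
/// the way a proxy-unaware client would. Returns the peer that accepted.
fn probe(host: &str, port: u16, timeout: Duration) -> io::Result<SocketAddr> {
    let mut last_err = io::Error::new(io::ErrorKind::NotFound, "no addresses resolved");
    for addr in (host, port).to_socket_addrs()? {
        match TcpStream::connect_timeout(&addr, timeout) {
            Ok(stream) => return stream.peer_addr(),
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}
```

Calling `probe("wss-primary.slack.com", 443, Duration::from_secs(5))` exercises both layers: a resolver error points at the DNS rule, while a connection error after successful resolution points at the TCP 443 rule.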

Related issues

Signed-off-by: Kazuya Kosaku <kosaku@simount.com>

kosaku-sim and others added 4 commits March 30, 2026 19:08
The sandbox iptables rules unconditionally REJECT all UDP traffic,
which blocks DNS resolution for libraries that bypass HTTP_PROXY
(e.g. Node.js ws used by @slack/socket-mode).

Add an ACCEPT rule for UDP port 53 to the nameserver from
/etc/resolv.conf (or OPENSHELL_DNS_SERVER env override) before
the blanket UDP REJECT, so sandboxed processes can resolve
external hostnames without opening a broad UDP hole.

Fixes: NVIDIA/NemoClaw#409
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The DNS ACCEPT iptables rule alone is insufficient because the
sandbox netns routes everything via 10.200.0.1 (host veth).
DNS UDP packets reach the host side but the pod network cannot
route responses back to 10.200.0.2 (sandbox IP).

Enable IP forwarding on the host veth and add MASQUERADE so DNS
packets appear to come from the pod IP, allowing CoreDNS to
respond correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The Landlock WARN log is emitted to stderr with ANSI color codes on
every kubectl exec invocation. When SSH or kubectl exec pipes stdin
to sandbox commands, the ANSI output corrupts file contents (e.g.,
openclaw.json config). Downgrading to debug prevents this pollution
in default log levels while keeping the diagnostic available with
RUST_LOG=debug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Libraries like Node.js ws (used by @slack/socket-mode) resolve DNS
then connect directly to the resolved IP on TCP 443, ignoring
HTTP_PROXY. The sandbox iptables REJECT all bypass TCP, breaking
these connections even after DNS resolution succeeds.

Add OPENSHELL_DIRECT_TCP_HOSTS env var (comma-separated hostnames).
At sandbox netns setup, resolve these hosts and install:
- iptables ACCEPT for TCP 443 to resolved IPs (sandbox side)
- MASQUERADE + FORWARD rules (host side) for return routing

This pairs with the DNS ACCEPT rule from the previous commit to
provide full direct connectivity for proxy-unaware libraries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
kosaku-sim requested a review from a team as a code owner on March 30, 2026, 13:05
@github-actions

Thank you for your interest in contributing to OpenShell, @kosaku-sim.

This project uses a vouch system for first-time contributors. Before submitting a pull request, you need to be vouched by a maintainer.

To get vouched:

  1. Open a Vouch Request discussion.
  2. Describe what you want to change and why.
  3. Write in your own words — do not have an AI generate the request.
  4. A maintainer will comment /vouch if approved.
  5. Once vouched, open a new PR (preferred) or reopen this one after a few minutes.

See CONTRIBUTING.md for details.

github-actions bot closed this on Mar 30, 2026
@github-actions

Thank you for your submission! We ask that you sign our Developer Certificate of Origin before we can accept your contribution. You can sign the DCO by adding a comment below using this text:


I have read the DCO document and I hereby sign the DCO.


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the DCO Assistant Lite bot.

Development

Successfully merging this pull request may close these issues.

OpenShell Sandbox Networking Fails CoreDNS unreachable inside sandbox (10.43.0.10:53 connection refused) on both macOS and Linux
