Local AI agent sandbox. Run commands in isolated Linux VMs on your Mac in ~60ms. No cloud costs. VM-level isolation via Virtualization.framework. Works with MCP clients that support local stdio servers (Claude Code, Claude Desktop, Cursor).
An MCP server that gives AI agents a sandboxed Linux environment using Apple Containerization (Virtualization.framework). Each sandbox is a real VM — not a container sharing your kernel — that boots in ~700ms and executes commands in ~60ms via a persistent shell over vsock.
Compared to cloud sandboxes (as of early 2025):
| | Exec latency | Cost | Isolation |
|---|---|---|---|
| This (local) | ~60ms | Local hardware | VM (Virtualization.framework) |
| E2B | ~150ms + network | $0.18/hr | Firecracker microVM |
| Daytona | ~90ms + network | Usage-based | Docker container |
Once registered, your MCP client can use the sandbox tools directly:
Agent: exec(command="uname -a")
→ Linux mcp-sb-abc123 6.12.6 #1 SMP aarch64 Linux
Agent: install(packages="python3 py3-pip")
→ Installed python3 py3-pip (1230ms)
Agent: exec(command="python3 -c 'print(sum(range(1000)))'")
→ 499500
Agent: bg(command="python3 -m http.server 8000")
→ Started [bg-a1b2c3] PID 42
Agent: expose(port=8000)
→ Forwarding localhost:8000 → 'default':8000
Open http://localhost:8000
Cold boot is ~700ms, subsequent commands ~60ms each.
- Apple Silicon Mac (M1+)
- macOS 15 Sequoia+
- Python 3.11+
- uv (for packaging)
# Download and install the container CLI
curl -LO https://github.com/apple/containerization/releases/download/v0.9.0/container-v0.9.0.pkg
sudo installer -pkg container-v0.9.0.pkg -target /
# Start the container system (downloads kernel on first run)
container system start
# Verify it works
time container run --rm alpine echo "hello" # ~700ms cold boot
The included Containerfile.mcp-dev builds an Alpine image pre-loaded with Python, Node.js, Go, Rust, and standard build tools:
cd sandbox-mcp
container build -t mcp-dev -f Containerfile.mcp-dev .
uv sync
Claude Code:
claude mcp add sandbox -- uv --directory /path/to/sandbox-mcp run sandbox-mcp
Manual (~/.claude.json):
{
  "mcpServers": {
    "sandbox": {
      "type": "stdio",
      "command": "/path/to/uv",
      "args": ["--directory", "/path/to/sandbox-mcp", "run", "sandbox-mcp"]
    }
  }
}
Apple's containerization repo includes a stripped-down Linux kernel config. Compiling it yourself doesn't meaningfully improve exec latency: the ~700ms floor is VM lifecycle overhead (Virtualization.framework + EXT4 + network + vminitd), not kernel boot. The real win is keeping VMs warm and using persistent shell exec (~60ms).
That said, if you want a smaller kernel:
git clone https://github.com/apple/containerization.git
cd containerization/kernel
make # ~3 min on M-series
container system kernel set --binary ./vmlinux
container system stop && container system start
Agent ──MCP/stdio──▶ sandbox_mcp_server.py (FastMCP)
│
├── SandboxManager
│ ├── _sandboxes: dict[name, Sandbox]
│ ├── _port_forwards: dict[port, PortForward]
│ ├── _sync_jobs: dict[id, SyncJob]
│ └── _cleanup_loop (idle TTL + child TTL)
│
├── SandboxCtlServer (per-parent UDS listener)
│ └── NDJSON over /run/sandbox-ctl.sock
│ → spawn, list, exec, destroy, run
│
└── Sandbox (per-VM)
├── PersistentShell (container exec -i <name> sh)
├── _bg_processes: dict[id, Process]
└── _audit_log: deque
Latency breakdown:
- Cold boot: ~700ms (Virtualization.framework + EXT4 + network + vminitd)
- Warm exec: ~60ms (command piped to persistent shell via vsock)
- Why warm is fast: Each sandbox holds open a container exec -i <name> sh process. Commands are written to stdin with a unique end-marker, and output is read until the marker appears. No process spawn overhead per command.
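The end-marker technique can be sketched in a few lines of Python. This is a minimal illustration, not the server's actual PersistentShell: it drives a local sh rather than container exec -i <name> sh, and the class name MarkerShell, the marker format, and the run() signature are all assumptions made for the example.

```python
import subprocess
import uuid


class MarkerShell:
    """Hypothetical sketch: one long-lived shell, output delimited by a marker."""

    def __init__(self, argv=("sh",)):
        # Assumption: the real server would pass its container exec argv here.
        self.proc = subprocess.Popen(
            list(argv),
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # interleave stderr with stdout
            text=True,
            bufsize=1,
        )

    def run(self, command: str) -> tuple[str, int]:
        marker = f"__END_{uuid.uuid4().hex}__"
        # After the command, print a newline, the marker, and the exit status.
        self.proc.stdin.write(f"{command}\nprintf '\\n%s %s\\n' {marker} $?\n")
        self.proc.stdin.flush()
        lines = []
        for line in self.proc.stdout:
            if line.startswith(marker):
                status = int(line.split()[1])
                break
            lines.append(line)
        else:
            raise RuntimeError("shell exited before the marker appeared")
        # Drop the newline that was added in front of the marker.
        return "".join(lines)[:-1], status

    def close(self) -> None:
        self.proc.stdin.close()
        self.proc.stdout.close()
        self.proc.wait()
```

Because the shell process stays alive between calls, state such as variables and the working directory carries over from one run() to the next, and each call costs only a pipe write and read instead of a new process.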
In Apple Containers v0.9.0, -p port publishing is broken (TCP connects but data never flows), and VM IPs are not routable from the host. Port forwarding instead goes through an asyncio TCP proxy:
- expose starts a local TCP server on 127.0.0.1:<host_port>
- Each incoming connection spawns container exec -i <name> nc 127.0.0.1 <container_port>
- Data is piped bidirectionally between the client and the nc process via vsock
Sandboxes are named (default: "default"). Each gets isolated volumes for /workspace and package caches (apk, pip, npm). Caches persist across resets for fast reinstalls. Sandboxes can reach each other by name via /etc/hosts entries auto-injected when networking is available.
Configure per-sandbox-name resources in SANDBOX_PROFILES at the top of the server:
SANDBOX_PROFILES = {
    "ml": {"cpus": 4, "memory": "2G"},
    "build": {"cpus": 4, "memory": "1G"},
    "nested": {"cpus": 2, "memory": "1G", "virtualization": True},
}
The virtualization flag enables nested virtualization (--virtualization). GPU/Metal passthrough is not supported by Apple Containers: the kernel has CONFIG_DRM_VIRTIO_GPU disabled and the Swift framework doesn't use VZVirtioGraphicsDeviceConfiguration.
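As a rough sketch of how a profile might be resolved, the helper below falls back to the global defaults for names without a profile and turns the result into command-line flags. The helper name resource_flags is hypothetical, and passing --cpus and --memory straight through to the container CLI is an assumption; only --virtualization is named in this README.

```python
SANDBOX_CPUS = 2          # global defaults, as listed in the configuration table
SANDBOX_MEMORY = "512M"

SANDBOX_PROFILES = {
    "ml": {"cpus": 4, "memory": "2G"},
    "build": {"cpus": 4, "memory": "1G"},
    "nested": {"cpus": 2, "memory": "1G", "virtualization": True},
}


def resource_flags(name: str) -> list[str]:
    """Hypothetical helper: map a sandbox name to resource flags."""
    profile = SANDBOX_PROFILES.get(name, {})
    flags = [
        "--cpus", str(profile.get("cpus", SANDBOX_CPUS)),
        "--memory", profile.get("memory", SANDBOX_MEMORY),
    ]
    if profile.get("virtualization", False):
        flags.append("--virtualization")
    return flags
```

A sandbox named "default" has no profile entry, so it gets 2 CPUs and 512M, while "nested" additionally receives --virtualization.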
Sandboxes can spawn child sandboxes, controlled by SPAWN_POLICIES at the top of the server. Policies define per-parent limits: max concurrent children, lifetime spawn count, CPU/memory budgets, allowed images, and TTL. Unlisted sandbox names cannot spawn.
Children are lightweight — they skip cache volumes and get their own isolated workspace. They're auto-destroyed when their TTL expires or their parent is reset/destroyed.
Setting child_can_spawn: True in a policy allows children to spawn their own children (grandchildren), up to a depth of _MAX_SPAWN_GENERATION (default 2). Grandchild policies are derived automatically — halved concurrency/budget limits, no further sub-spawning. A tree-wide budget check ensures the root sandbox's CPU/memory envelope is never exceeded regardless of spawn depth. This is off by default and not recommended for most use cases.
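The derivation rule and the tree-wide check could look roughly like the sketch below. The field names (max_concurrent, cpu_budget, memory_budget_mb), the rounding, and the floor of 1 are assumptions made for illustration; the actual policy schema lives in SPAWN_POLICIES.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SpawnPolicy:
    # Field names are illustrative, not the server's actual schema.
    max_concurrent: int
    cpu_budget: int
    memory_budget_mb: int
    child_can_spawn: bool = False


def derive_child_policy(parent: SpawnPolicy) -> SpawnPolicy:
    """Halve the limits and forbid further sub-spawning."""
    return replace(
        parent,
        max_concurrent=max(1, parent.max_concurrent // 2),
        cpu_budget=max(1, parent.cpu_budget // 2),
        memory_budget_mb=max(1, parent.memory_budget_mb // 2),
        child_can_spawn=False,
    )


def fits_tree_budget(root: SpawnPolicy, used_cpus: int, used_mb: int,
                     want_cpus: int, want_mb: int) -> bool:
    """Usage summed over every descendant must stay inside the root envelope."""
    return (used_cpus + want_cpus <= root.cpu_budget
            and used_mb + want_mb <= root.memory_budget_mb)
```

Checking against the root's budget rather than the immediate parent's is what keeps a deep tree from exceeding the envelope even when each level individually stays within its own derived limits.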
When a sandbox has a spawn policy with inject_ctl: True (the default), the server mounts a Unix domain socket and the sandbox-ctl binary into the VM. Set inject_ctl: False to skip injection for sandboxes that don't need in-VM sub-launching. With injection enabled, code running inside the VM can spawn and manage sibling containers:
sandbox-ctl ping # verify connection
sandbox-ctl spawn --image mcp-dev --cpus 1 --memory 256M # create child
sandbox-ctl list # show children
sandbox-ctl exec <child> -- echo hello # run in child
sandbox-ctl destroy <child> # tear down
sandbox-ctl run -- echo test # ephemeral: spawn + exec + destroy
Communication uses NDJSON over the mounted socket (/run/sandbox-ctl.sock). The host-side SandboxCtlServer handles requests and delegates to SandboxManager.
Sandbox-to-container mappings are saved to ~/.local/state/sandbox-mcp/state.json (schema v2). On restart, the server reconnects to any still-running containers from the previous session. Expired children are cleaned up on reconnect.
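A versioned state file of this kind can be written atomically so that a crash mid-write never leaves a truncated file behind. The sketch below is an assumption about shape, not the server's actual format: the "version" and "sandboxes" keys and both function names are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path

STATE_VERSION = 2  # corresponds to "schema v2"; the key layout is assumed


def save_state(path: Path, sandboxes: dict[str, str]) -> None:
    """Write the sandbox name -> container name mapping atomically."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"version": STATE_VERSION, "sandboxes": sandboxes}, f)
    os.replace(tmp, path)  # atomic rename on POSIX


def load_state(path: Path) -> dict[str, str]:
    """Return the saved mapping, or {} if missing, corrupt, or another schema."""
    try:
        data = json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    if not isinstance(data, dict) or data.get("version") != STATE_VERSION:
        return {}
    return data.get("sandboxes", {})
```

On restart, each entry returned by load_state would still need to be checked against the containers that are actually running before reconnecting.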
| Tool | Description |
|---|---|
| exec | Run a shell command (~60ms) |
| python | Execute Python code |
| write_file | Write a file to the sandbox |
| read_file | Read a file from the sandbox |
| batch_write | Write multiple files in one transfer |
| install | Install packages via apk |
| env | Manage persistent environment variables |

| Tool | Description |
|---|---|
| bg | Run a command in the background |
| logs | Read output from a background process |
| kill | Kill a background process |

| Tool | Description |
|---|---|
| status | Show pool and sandbox info |
| health | Quick liveness/disk/memory check across all sandboxes |
| stats | Show CPU/memory/disk usage for one sandbox |
| reset | Destroy and recreate (clean state) |
| list_all | List all active sandboxes |
| destroy | Permanently kill a sandbox |
| clone | Clone a running sandbox to a new name |
| history | Show recent command audit log |

| Tool | Description |
|---|---|
| upload | Copy files from host into sandbox |
| download | Copy files from sandbox to host |
| git_clone | Clone a git repo (with optional auth token) |
| sync_start | Watch and live-sync a host directory |
| sync_stop | Stop a running sync job |

| Tool | Description |
|---|---|
| snapshot | Save sandbox state as a reusable image |
| restore | Boot from a saved snapshot |
| list_snapshots | List available snapshots |
| delete_snapshot | Delete a saved snapshot image |
| build_image | Build a container image from a Containerfile |
| images | List all available container images |

| Tool | Description |
|---|---|
| expose | Forward a sandbox port to localhost (TCP proxy) |
| unexpose | Stop a port forward |
| network_info | Show IPs and connectivity between sandboxes |

| Tool | Description |
|---|---|
| spawn | Spawn a child sandbox under a parent |
| children | List child sandboxes of a parent |
| destroy_child | Destroy a child sandbox |

| File | Description |
|---|---|
| sandbox_mcp_server.py | Sandbox class, SandboxManager, MCP tool definitions |
| cmd/sandbox-ctl/ | In-container CLI for spawning sibling sandboxes (Go) |
| pyproject.toml | uv/hatchling packaging, entry point sandbox-mcp |
| Containerfile.mcp-dev | Alpine 3.23 dev image with Python, Node, Go, Rust |
| tests/ | pytest test suite |
Edit constants at the top of sandbox_mcp_server.py:
| Constant | Default | Description |
|---|---|---|
| DEFAULT_IMAGE | "mcp-dev" | Container image for new sandboxes |
| SANDBOX_CPUS | 2 | Default CPU cores per sandbox |
| SANDBOX_MEMORY | "512M" | Default memory per sandbox |
| IDLE_TTL | 1800 | Seconds before auto-destroying idle sandboxes |
| DEFAULT_TIMEOUT | 30 | Default command timeout in seconds |
| MAX_OUTPUT | 50000 | Max output bytes per command |
| SPAWN_POLICIES | {"default": {...}} | Per-sandbox child spawn limits and permissions |
| _MAX_SPAWN_GENERATION | 2 | Maximum spawn depth (root → child → grandchild) |
uv run pytest tests/ -v
Tests run on Python 3.11, 3.12, and 3.13 via GitHub Actions. Pre-push:
uv run pytest tests/ -q && python3 -m compileall sandbox_mcp_server.py tests