perf: SSH sandbox sessions have high interactive latency due to missing TCP_NODELAY and other inefficiencies #9

@pimlock

Problem

Interactive SSH sandbox sessions exhibit noticeable keystroke latency (40-200ms per round-trip). This affects both input going to the sandbox and output coming back.

Root Causes

1. No TCP_NODELAY on any TCP stream (highest impact)

Neither the gateway-to-sandbox TCP connection (ssh_tunnel.rs) nor the proxy-to-gateway connection (navigator-cli/src/ssh.rs) nor the sandbox accepted socket (navigator-sandbox/src/ssh.rs) sets TCP_NODELAY.

With Nagle's algorithm enabled, the kernel holds a small write until all previously sent data has been ACKed, hoping to coalesce it into a larger packet. In an interactive SSH session every keystroke is a tiny packet, and TCP delayed ACKs on the return path can postpone that ACK by tens of milliseconds. This is the classic Nagle + delayed-ACK interaction, and it adds 40-200ms of latency per keystroke round-trip.

Fix: Add .set_nodelay(true) on all three TcpStream endpoints in the data path. This is the lowest-effort, highest-impact change.

2. Per-chunk async task spawn for PTY output

In navigator-sandbox/src/ssh.rs, every read() from the PTY master spawns a brand new tokio task to call handle.data(). For fast-scrolling output (e.g., cat of a large file, compiler output), this creates thousands of short-lived tasks with allocation, scheduling, and SSH packet framing overhead per task.

Fix: Replace the per-chunk spawn with a single long-lived task that receives chunks over a channel and forwards them to handle.data().

3. Byte-at-a-time preface reading during connection setup

The NSSH1 handshake preface (~130 bytes) is read one byte at a time in both ssh_tunnel.rs and sandbox/ssh.rs. Each byte is a separate async read syscall. This adds noticeable delay to the initial connection time before the shell prompt appears.

Fix: Use BufReader or read into a larger buffer and scan for the newline delimiter.

4. Double user-space copy hop

The data path during the session is:

ssh binary <-pipe-> ssh-proxy process <-TCP/TLS-> gateway <-TCP-> sandbox

The ssh-proxy process is a separate process doing tokio::io::copy between stdin/stdout pipes and the gateway socket. Every byte passes through two user-space processes and at least two extra read/write syscall pairs.

Fix: This is architectural and lower priority, but could be addressed by integrating the proxy into the SSH client directly.

Proposed Fix Order

  1. TCP_NODELAY — 3 one-line changes, biggest impact
  2. Single forwarding task for PTY output — moderate effort, improves throughput
  3. Buffered preface reading — small effort, improves connection setup time
  4. Proxy architecture — larger effort, marginal improvement for interactive sessions

Originally by @drew on 2026-02-16T10:00:54.780-08:00
