fix(sandbox): relay WebSocket frames after HTTP 101 Switching Protocols #683
davidpeden3 wants to merge 1 commit into NVIDIA:main
The L7 REST proxy treats 101 Switching Protocols as a generic 1xx informational response via is_bodiless_response(), forwarding the headers and returning to the HTTP parsing loop. After a 101, the connection has been upgraded (e.g. to WebSocket) and subsequent bytes are protocol frames, not HTTP requests. The relay loop either blocks or silently drops them.

This patch:
- Adds RelayOutcome::Upgraded variant to signal protocol upgrades
- Detects 101 responses before the generic 1xx handler in relay_response(), capturing any overflow bytes read past the headers
- Switches relay_rest() and relay_passthrough_with_credentials() to raw bidirectional TCP copy (tokio::io::copy_bidirectional) after receiving an Upgraded outcome
- Adds a test verifying 101 response handling and overflow capture

This enables WebSocket connections (OpenClaw node meshes, Discord/Slack bots) to work from inside fully sandboxed environments.

Fixes: NVIDIA#652
Related: NVIDIA/NemoClaw#409

Signed-off-by: David Peden <davidpeden3@gmail.com>
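For readers following along, the detection and overflow-capture step described in the commit message can be sketched with the standard library alone. This is an illustration, not the code in relay.rs: the `KeepAlive` variant, the `overflow` field, and the `parse_status` and `classify` helpers are hypothetical names chosen for the sketch, and only `RelayOutcome::Upgraded` comes from the patch description.

```rust
/// Outcome of relaying one upstream response. Only the `Upgraded` name is
/// taken from the PR text; the other variant and the field are assumptions.
#[derive(Debug, PartialEq)]
enum RelayOutcome {
    /// Ordinary response: keep parsing HTTP on this connection.
    KeepAlive,
    /// 101 Switching Protocols: `overflow` holds bytes read past the headers.
    Upgraded { overflow: Vec<u8> },
}

/// Extract the status code from a status line such as
/// "HTTP/1.1 101 Switching Protocols".
fn parse_status(head: &[u8]) -> Option<u16> {
    let line_end = head.windows(2).position(|w| w == b"\r\n")?;
    let line = std::str::from_utf8(&head[..line_end]).ok()?;
    line.split(' ').nth(1)?.parse().ok()
}

/// Classify a buffer that holds a complete response head plus whatever
/// extra bytes the same read happened to return. Returns `None` while the
/// head is still incomplete or the status line cannot be parsed.
fn classify(buf: &[u8]) -> Option<RelayOutcome> {
    // The head ends at the first blank line (CRLF CRLF).
    let head_end = buf.windows(4).position(|w| w == b"\r\n\r\n")? + 4;
    let status = parse_status(&buf[..head_end])?;
    if status == 101 {
        // Checked before any generic 1xx handling: everything after the
        // head already belongs to the upgraded protocol.
        return Some(RelayOutcome::Upgraded {
            overflow: buf[head_end..].to_vec(),
        });
    }
    Some(RelayOutcome::KeepAlive)
}
```

If the upstream sends its first WebSocket frame in the same TCP segment as the 101 headers, `classify` returns those frame bytes in `overflow`, so the caller can forward them instead of losing them.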
All contributors have signed the DCO ✍️ ✅
I have read the DCO document and I hereby sign the DCO.
recheck
Hi @davidpeden3. Thank you for this. Out of curiosity: we've seen a lot of issues and feedback saying that Slack, Discord, et al. do not work in OpenClaw in OpenShell because they won't go through the proxy. Did you have to do anything specific to get these providers to obey the proxy to begin with? (Obviously, post-proxy use is the issue you are addressing.)
hey @johntmyers, honestly i haven't even gotten that far in my setup. i've been working on creating a mesh network where my gateway can distribute work to my nodes (and itself, of course) for a karpathy-style autoresearch project i'm working on to train local models (currently trying out nemotron) to learn my coding style.
i've got three machines in the network. an m4 mac studio 128gb, an m5 mbp 128gb, and a win pc with a 5090 rtx. the nodes connect to the gateway via websockets. all nodes (including the gateway, of course) are running inside an openshell sandbox for security.
the issue surfaced when i tried to pair nodes to the gateway (starting w/ the m4 to m5 connection). once i got that working, i then ran into #681 when trying to pair the 5090 pc to the m4 gateway. both fixes solved all of my connectivity issues end to end.
so to be clear, i have not yet attempted to connect to a third party like slack. this was all internal communication between my nodes. pure openshell/openclaw. i will likely get to setting up slack later this week if you would like me to report back.
Summary
The L7 proxy's relay loop did not handle HTTP 101 Switching Protocols. After the 101 response, the connection has been upgraded to a different protocol (e.g. WebSocket) but the proxy continued trying to parse HTTP, silently dropping all frames. This patch detects the 101, captures any overflow bytes, and switches to raw bidirectional TCP relay.
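The last step, switching from HTTP parsing to a raw byte relay, might look roughly like the sketch below for a single direction. It assumes the captured overflow bytes are replayed to the client before any further copying, which is what one would expect but is not spelled out in this PR. `relay_after_upgrade` is a hypothetical helper, and it uses blocking `std::io` for one direction only, whereas the patch uses `tokio::io::copy_bidirectional` to cover both directions at once.

```rust
use std::io::{self, Read, Write};

/// Forward everything from `upstream` to `client` without parsing it.
/// `overflow` (bytes that were read together with the 101 headers) is
/// written first so no frame bytes are lost or reordered.
/// Returns the total number of bytes delivered to `client`.
fn relay_after_upgrade<R: Read, W: Write>(
    overflow: &[u8],
    upstream: &mut R,
    client: &mut W,
) -> io::Result<u64> {
    // Replay what was already consumed from the upstream socket.
    client.write_all(overflow)?;
    // From here on the stream is opaque: copy until upstream reaches EOF.
    let copied = io::copy(upstream, client)?;
    client.flush()?;
    Ok(overflow.len() as u64 + copied)
}
```

Keeping the helper generic over `Read` and `Write` means it can be exercised with in-memory buffers (for example a `Cursor` as the upstream and a `Vec<u8>` as the client) as well as with real sockets.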
Related Issue
Changes
- crates/openshell-sandbox/src/l7/relay.rs: Added RelayOutcome::Upgraded variant. Detect 101 before generic 1xx handling, capture overflow bytes read past the response headers.
- crates/openshell-sandbox/src/l7/rest.rs: After relay_response() returns Upgraded, switch to tokio::io::copy_bidirectional for raw TCP relay between client and upstream.
- crates/openshell-sandbox/src/l7/provider.rs: Same upgrade handling for relay_passthrough_with_credentials().
Testing
- mise run pre-commit passes
Checklist