[client, proxy] Harden uspfilter conntrack and share TCP relay#5936
[client, proxy] Harden uspfilter conntrack and share TCP relay#5936
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds a new util/netrelay package and migrates in-tree relay/stream-copy usages to it; introduces env-configurable per-protocol conntrack caps with bounded/tombstone-preferring eviction; refactors TCP state/timeouts and RST/flag handling; and gates many trace/error logs behind logger level checks. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Netrelay
participant EndpointA
participant EndpointB
participant Logger
Client->>Netrelay: Relay(ctx, A, B, Options{IdleTimeout, Logger})
Netrelay->>EndpointA: start copy loop (A→B)
Netrelay->>EndpointB: start copy loop (B→A)
EndpointA-->>Netrelay: data chunks
Netrelay-->>EndpointB: write chunks
EndpointB-->>Netrelay: data chunks
Netrelay-->>EndpointA: write chunks
alt idle timeout triggered
Netrelay->>Logger: Debugf("idle timeout")
Netrelay->>Client: cancel ctx, return (aToB,bToA)
else normal EOF or ctx canceled
Netrelay->>EndpointA: CloseWrite() if supported (rgba(0,128,0,0.5))
Netrelay->>EndpointB: CloseWrite() if supported (rgba(0,128,0,0.5))
Netrelay->>Client: return (aToB,bToA)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
client/firewall/uspfilter/log/log.go (1)
252-283:⚠️ Potential issue | 🟠 MajorBug: >8 args to
Debugfproduce empty or unformatted output.
argCount: uint8(len(args))is written without clamping to 8, butformatMessageonly hasswitchcases 0–8 (nodefault). Consequences:
len(args) ∈ [9, 255]→argCount> 8 → no case matches →formattedstays the zero-value empty string → the log line contains only the timestamp and level, the actual message/args are silently dropped.len(args) == 256(edge) →uint8wraps to 0 → the raw format string is emitted verbatim with unresolved verbs.Since
Debugfis variadic, any caller with more than 8 args hits this. Clamp to 8 (losing extras mirrors the existing// dropping args beyond index 7intent) or route the fallback through a synchronously-formatted message.🐛 Proposed fix — clamp argCount at enqueue time
default: - msg := logMessage{level: LevelDebug, argCount: uint8(len(args)), format: format} + n := len(args) + if n > 8 { + n = 8 + } + msg := logMessage{level: LevelDebug, argCount: uint8(n), format: format} if len(args) > 0 { msg.arg1 = args[0] }Alternative: pre-format with
fmt.Sprintf(format, args...)on the caller goroutine and enqueue withargCount: 0. Slightly higher hot-path cost but preserves all args.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/log/log.go` around lines 252 - 283, The log enqueuing builds a logMessage with argCount: uint8(len(args)) which can exceed the 0–8 cases handled by formatMessage (and even wrap at 256), causing dropped/empty output; in the Debugf path that constructs logMessage (the code creating logMessage and populating arg1..arg8), clamp the count to 8 before assigning (e.g. n := len(args); if n > 8 { n = 8 }; msg.argCount = uint8(n)) and only copy the first 8 args into arg1..arg8, or alternatively pre-format the message (fmt.Sprintf(format, args...)) and enqueue with argCount 0 so all args are preserved; ensure this change references the existing logMessage construction and the formatMessage switch (cases 0–8).client/ssh/client/client.go (2)
653-660:⚠️ Potential issue | 🟠 MajorReturn when the forwarded-channel stream closes.
If
channelRequestsis closed beforectxis canceled, this loop repeatedly receivesniland spins. Capture the receiveokvalue and return on close.Proposed fix
for { select { case <-ctx.Done(): return - case newChan := <-channelRequests: - if newChan != nil { - go c.handleRemoteForwardChannel(ctx, newChan, localAddr) - } + case newChan, ok := <-channelRequests: + if !ok { + return + } + if newChan != nil { + go c.handleRemoteForwardChannel(ctx, newChan, localAddr) + } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/ssh/client/client.go` around lines 653 - 660, The select loop reading from channelRequests can spin if the channel is closed because it receives nil repeatedly; change the receive to check the ok flag (e.g., v, ok := <-channelRequests) and if ok is false return so the goroutine exits; update the branch that currently tests newChan != nil to instead handle the received newChan when ok is true by calling go c.handleRemoteForwardChannel(ctx, newChan, localAddr).
679-689:⚠️ Potential issue | 🟡 MinorUse the propagated context for the local dial.
Line 679 can still block independently of the forwarded
ctx, preventing immediate cancellation when the parent context is cancelled. Adoptnet.Dialer.DialContextto respect context cancellation:Proposed fix
- localConn, err := net.Dial("tcp", localAddr) + localConn, err := (&net.Dialer{}).DialContext(ctx, "tcp", localAddr) if err != nil { return }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/ssh/client/client.go` around lines 679 - 689, The local TCP dial currently uses net.Dial which ignores the forwarded ctx and can block; change the call to use a net.Dialer and its DialContext method so the dial honors cancellation: replace net.Dial("tcp", localAddr) with (&net.Dialer{}).DialContext(ctx, "tcp", localAddr) (keeping the localConn variable, existing error return, and the defer-close behavior intact) so that the connection attempt is canceled when ctx is canceled.client/firewall/uspfilter/conntrack/tcp.go (1)
236-239:⚠️ Potential issue | 🟠 MajorReject illegal flags before creating new conntrack entries.
IsValidInboundrejects illegal combinations, but new flows throughtrackstill accept any SYN-bearing packet, includingSYN|FINandSYN|RST, and create conntrack state for it.Proposed fix
func (t *TCPTracker) track(srcIP, dstIP netip.Addr, srcPort, dstPort uint16, flags uint8, direction nftypes.Direction, ruleID []byte, size int, origPort uint16) { + if !isValidFlagCombination(flags) { + if t.logger.Enabled(nblog.LevelWarn) { + t.logger.Warn2("TCP illegal flag combination %x for new connection %s", flags, ConnKey{ + SrcIP: srcIP, DstIP: dstIP, SrcPort: srcPort, DstPort: dstPort, + }) + } + return + } + key, _, exists := t.updateIfExists(srcIP, dstIP, srcPort, dstPort, flags, direction, size) if exists || flags&TCPSyn == 0 { return }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/conntrack/tcp.go` around lines 236 - 239, The track method currently creates new conntrack entries for any SYN-bearing packet because validation happens after updateIfExists; move or add a validation step at the top of TCPTracker.track so illegal SYN combinations are rejected before creating state. Specifically, in TCPTracker.track (and before calling updateIfExists), check SYN-bearing packets (flags&TCPSyn != 0) and return early if they also contain TCPFin or TCPRst (flags&(TCPFin|TCPRst) != 0) or otherwise reuse the existing IsValidInbound/IsValidOutbound logic to validate flags; this prevents creating conntrack entries for SYN|FIN or SYN|RST. Ensure you reference the TCPSyn, TCPFin, TCPRst constants and the TCPTracker.track function when making the change.
🧹 Nitpick comments (1)
client/firewall/uspfilter/conntrack/common.go (1)
37-55: Preferstrconv.Atoifor strict integer parsing.
fmt.Sscanf(v, "%d", &n)accepts trailing garbage silently — e.g.,NB_CONNTRACK_FOO=123abcparses as123witherr == nil, so a misconfigured value silently resolves to a plausible-but-unintended limit instead of falling back to the default with a warning.strconv.Atoirejects the whole input on any non-digit suffix, which matches the intent described in the comment and is the idiomatic choice for env-var ints.♻️ Proposed fix
func envInt(logger *nblog.Logger, name string, def int) int { v := os.Getenv(name) if v == "" { return def } - var n int - _, err := fmt.Sscanf(v, "%d", &n) + n, err := strconv.Atoi(v) switch { case err != nil: logger.Warn3("invalid %s=%q: %v, using default", name, v, err) return def case n <= 0: logger.Warn2("invalid %s=%q: must be positive, using default", name, v) return def } return n }And update imports to add
"strconv"(and drop"fmt"if it becomes unused).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/conntrack/common.go` around lines 37 - 55, The envInt function should use strconv.Atoi instead of fmt.Sscanf so inputs with trailing garbage (e.g. "123abc") are rejected; update imports to add "strconv" and remove "fmt" if unused, call strconv.Atoi(v) in envInt, log the error via logger.Warn3 when Atoi returns an error and return def, and keep the existing n <= 0 check and logger.Warn2 behavior when the parsed value is non-positive.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/firewall/uspfilter/conntrack/log_overhead_bench_test.go`:
- Around line 34-42: The benchmark reuses source ports after 50k iterations;
change the port selection so each iteration produces a unique 5-tuple instead of
repeating p := uint16(10000 + (i % 50000)). Replace that expression with a
monotonic counter-based calculation (e.g. use a uint32 counter and compute p :=
uint16(10000 + counter%(65535-10000))) and, when counter exceeds the port range,
vary the source IP low byte by counter/(65535-10000) so the combination passed
to tracker.TrackOutbound and tracker.IsValidInbound remains unique across all
b.N iterations.
In `@client/firewall/uspfilter/conntrack/tcp_state_bugs_test.go`:
- Around line 131-162: TestTCPPerStateTeardownTimeouts currently only asserts
cleanup of CloseWait; update the test to exercise and assert teardown for
FinWait1/FinWait2 (FIN-WAIT states) and LastAck as well: after setting timeouts
and creating tracker (TestTCPPerStateTeardownTimeouts), drive connections into
each teardown state (use establishConnection and the same IsValidInbound/flags
flow to transition to TCPStateFinWait1/FinWait2 and TCPStateLastAck or perform
active/passive close sequences as needed), set conn.lastSeen to older than the
corresponding per-state timeout for each connection instance, call
tracker.cleanup(), and add require.False checks that each connection key no
longer exists (references: TestTCPPerStateTeardownTimeouts, establishConnection,
tracker.cleanup, tracker.connections, conn.lastSeen,
TCPStateCloseWait/TCPStateFinWait*/TCPStateLastAck,
finWaitTimeout/closeWaitTimeout/lastAckTimeout).
In `@client/firewall/uspfilter/conntrack/tcp.go`:
- Around line 342-350: The code currently bypasses flag validation for
connections in TCPStateEstablished by returning true when isValidStateForFlags
fails; remove this special-case so Established no longer automatically passes
invalid flag combinations. In function(s) using conn.GetState(),
t.isValidStateForFlags and the TCPStateEstablished check (see the block that
logs via t.logger.Warn3 and references key), stop returning true for Established
and instead let the function return false (or fall through to the existing
rejection path) so SYN-only or no-ACK variants on established tuples are
rejected by the same validation logic.
In `@util/netrelay/relay_test.go`:
- Around line 116-136: The test spawns two goroutines that call require.* (which
calls FailNow) from worker goroutines; change them to report results via a
channel instead: create a buffered channel (e.g., errCh or resultCh) before
launching the goroutines, replace require.NoError/require.Equal calls inside the
goroutines with sends of an error (or composite result containing any read/write
error and the received buffer) to the channel, close or signal completion after
wg.Done(), then after wg.Wait() drain the channel in the main test goroutine and
assert using require.NoError/require.Equal on those results; reference the
existing symbols peerA, peerB, msgAB, msgBA, wg, and io.ReadFull when locating
code to change.
---
Outside diff comments:
In `@client/firewall/uspfilter/conntrack/tcp.go`:
- Around line 236-239: The track method currently creates new conntrack entries
for any SYN-bearing packet because validation happens after updateIfExists; move
or add a validation step at the top of TCPTracker.track so illegal SYN
combinations are rejected before creating state. Specifically, in
TCPTracker.track (and before calling updateIfExists), check SYN-bearing packets
(flags&TCPSyn != 0) and return early if they also contain TCPFin or TCPRst
(flags&(TCPFin|TCPRst) != 0) or otherwise reuse the existing
IsValidInbound/IsValidOutbound logic to validate flags; this prevents creating
conntrack entries for SYN|FIN or SYN|RST. Ensure you reference the TCPSyn,
TCPFin, TCPRst constants and the TCPTracker.track function when making the
change.
In `@client/firewall/uspfilter/log/log.go`:
- Around line 252-283: The log enqueuing builds a logMessage with argCount:
uint8(len(args)) which can exceed the 0–8 cases handled by formatMessage (and
even wrap at 256), causing dropped/empty output; in the Debugf path that
constructs logMessage (the code creating logMessage and populating arg1..arg8),
clamp the count to 8 before assigning (e.g. n := len(args); if n > 8 { n = 8 };
msg.argCount = uint8(n)) and only copy the first 8 args into arg1..arg8, or
alternatively pre-format the message (fmt.Sprintf(format, args...)) and enqueue
with argCount 0 so all args are preserved; ensure this change references the
existing logMessage construction and the formatMessage switch (cases 0–8).
In `@client/ssh/client/client.go`:
- Around line 653-660: The select loop reading from channelRequests can spin if
the channel is closed because it receives nil repeatedly; change the receive to
check the ok flag (e.g., v, ok := <-channelRequests) and if ok is false return
so the goroutine exits; update the branch that currently tests newChan != nil to
instead handle the received newChan when ok is true by calling go
c.handleRemoteForwardChannel(ctx, newChan, localAddr).
- Around line 679-689: The local TCP dial currently uses net.Dial which ignores
the forwarded ctx and can block; change the call to use a net.Dialer and its
DialContext method so the dial honors cancellation: replace net.Dial("tcp",
localAddr) with (&net.Dialer{}).DialContext(ctx, "tcp", localAddr) (keeping the
localConn variable, existing error return, and the defer-close behavior intact)
so that the connection attempt is canceled when ctx is canceled.
---
Nitpick comments:
In `@client/firewall/uspfilter/conntrack/common.go`:
- Around line 37-55: The envInt function should use strconv.Atoi instead of
fmt.Sscanf so inputs with trailing garbage (e.g. "123abc") are rejected; update
imports to add "strconv" and remove "fmt" if unused, call strconv.Atoi(v) in
envInt, log the error via logger.Warn3 when Atoi returns an error and return
def, and keep the existing n <= 0 check and logger.Warn2 behavior when the
parsed value is non-positive.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a04f94ab-a5dd-4b76-a047-d8619eec1518
📒 Files selected for processing (26)
client/firewall/uspfilter/conntrack/cap_test.goclient/firewall/uspfilter/conntrack/common.goclient/firewall/uspfilter/conntrack/defaults_desktop.goclient/firewall/uspfilter/conntrack/defaults_mobile.goclient/firewall/uspfilter/conntrack/icmp.goclient/firewall/uspfilter/conntrack/log_overhead_bench_test.goclient/firewall/uspfilter/conntrack/tcp.goclient/firewall/uspfilter/conntrack/tcp_rst_bugs_test.goclient/firewall/uspfilter/conntrack/tcp_state_bugs_test.goclient/firewall/uspfilter/conntrack/udp.goclient/firewall/uspfilter/filter.goclient/firewall/uspfilter/forwarder/icmp.goclient/firewall/uspfilter/forwarder/tcp.goclient/firewall/uspfilter/forwarder/udp.goclient/firewall/uspfilter/log/log.goclient/firewall/uspfilter/nat.goclient/ssh/client/client.goclient/ssh/common.goclient/ssh/proxy/proxy.goclient/ssh/server/port_forwarding.goproxy/internal/tcp/peekedconn.goproxy/internal/tcp/relay.goproxy/internal/tcp/relay_test.goproxy/internal/tcp/router.goutil/netrelay/relay.goutil/netrelay/relay_test.go
💤 Files with no reviewable changes (2)
- client/ssh/common.go
- proxy/internal/tcp/relay.go
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/firewall/uspfilter/conntrack/tcp.go`:
- Around line 469-478: In transFinWait1, handle the peer FIN|ACK combo by
returning TCPStateTimeWait instead of matching FIN first; update transFinWait1
to check for (flags& (TCPFin|TCPAck)) == (TCPFin|TCPAck) and return
TCPStateTimeWait, then fall through to the existing cases (flags&TCPFin ->
TCPStateClosing, flags&TCPAck -> TCPStateFinWait2) so a FIN that also ACKs our
FIN goes straight to TIME_WAIT (avoids waiting for a never-arriving ACK and
delays to TypeEnd).
- Around line 359-376: Reject illegal TCP flag combinations early in
updateState: inside TCPTracker.updateState, before applying transitions (after
reading flags), detect and ignore malformed combos such as TCPRst|TCPFin,
TCPSyn|TCPFin, and other mutually-exclusive sets (e.g., TCPSyn|TCPRst if
applicable) so they do not tombstone or drive state transitions; simply return
without calling handleRst, nextState, CompareAndSwapState, or onTransition. This
keeps TrackOutbound/TrackInbound paths (and cases that bypass IsValidInbound)
from accepting illegal segments—add the check in updateState (referencing
updateState, handleRst, nextState, CompareAndSwapState, TrackOutbound,
TrackInbound, IsValidInbound).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 64cb5137-8bc4-4d0c-bf27-b3a8428aa34d
📒 Files selected for processing (6)
client/firewall/uspfilter/conntrack/common.goclient/firewall/uspfilter/conntrack/tcp.goclient/firewall/uspfilter/conntrack/tcp_state_bugs_test.goclient/firewall/uspfilter/log/log.goclient/ssh/client/client.goutil/netrelay/relay_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- client/firewall/uspfilter/conntrack/common.go
- client/firewall/uspfilter/conntrack/tcp_state_bugs_test.go
- client/ssh/client/client.go
- util/netrelay/relay_test.go
There was a problem hiding this comment.
♻️ Duplicate comments (2)
client/firewall/uspfilter/conntrack/tcp.go (2)
360-381:⚠️ Potential issue | 🟠 MajorIllegal flag combinations still bypass
updateStateon existing flows.
isValidFlagCombinationis now enforced on new-connection inserts (Line 244) and inIsValidInbound(Line 341), butupdateStateis also reached byupdateIfExistsfromTrackOutbound/TrackInboundfor already-tracked flows — those paths never run the check. A crafted existing segment withRST|FINcan therefore still hithandleRstand tombstone the flow, andSYN|FINon an existing flow will be interpreted as a FIN transition. Move (or duplicate) the guard intoupdateStateso every caller is covered.Proposed fix
func (t *TCPTracker) updateState(key ConnKey, conn *TCPConnTrack, flags uint8, packetDir nftypes.Direction, size int) { + if !isValidFlagCombination(flags) { + if t.logger.Enabled(nblog.LevelWarn) { + t.logger.Warn3("TCP illegal flag combination %x for connection %s (state %s)", flags, key, conn.GetState()) + } + return + } + conn.UpdateLastSeen() conn.UpdateCounters(packetDir, size)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/conntrack/tcp.go` around lines 360 - 381, The updateState path currently lacks validation for illegal TCP flag combinations, allowing crafted segments (e.g., RST|FIN, SYN|FIN) to be processed on existing flows; add the same guard used for new inserts by calling isValidFlagCombination(flags) at the start of TCPTracker.updateState (before handleRst/nextState), and return early if the flags are invalid so that all callers (including updateIfExists invoked from TrackOutbound/TrackInbound and IsValidInbound checks) are covered; ensure you reference/update the existing updateState, isValidFlagCombination, handleRst, updateIfExists, TrackOutbound, TrackInbound, and IsValidInbound usages when adding this guard.
474-485:⚠️ Potential issue | 🟠 Major
transFinWait1still routes peerFIN|ACKthroughCLOSINGinstead ofTIME_WAIT.When the peer's FIN also ACKs our FIN, the
switchmatches theTCPFincase first and returnsTCPStateClosing, so the flow then waits for another ACK that may never arrive and only ends viafinWaitTimeout. Handle the combinedFIN|ACKexplicitly before the FIN-only case.Proposed fix
func transFinWait1(flags uint8, _ nftypes.Direction, same bool) TCPState { if same { return 0 } switch { + case flags&TCPFin != 0 && flags&TCPAck != 0: + return TCPStateTimeWait case flags&TCPFin != 0: return TCPStateClosing case flags&TCPAck != 0: return TCPStateFinWait2 } return 0 }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/conntrack/tcp.go` around lines 474 - 485, transFinWait1 currently matches TCPFin before TCPAck so a peer FIN|ACK is routed to TCPStateClosing instead of TIME_WAIT; modify transFinWait1 to check for the combined flags (flags&TCPFin != 0 && flags&TCPAck != 0) first and return TCPStateTimeWait (instead of TCPStateClosing), then handle FIN-only (return TCPStateClosing) and ACK-only (return TCPStateFinWait2) cases so FIN+ACK transitions correctly to TIME_WAIT.
🧹 Nitpick comments (1)
client/firewall/uspfilter/conntrack/tcp.go (1)
427-435:nextStatereturning0collides with theTCPStateNewconstant.
TCPStateNewisiota = 0, so the sentinel used for "no transition" is also a valid state value. Nothing currently transitions intoTCPStateNew, so this is latent rather than active, but any future addition that legitimately targetsNewwould be silently swallowed by thenewState == 0check inupdateState(Line 377). Consider returning an explicit(TCPState, bool)or introducing a dedicatedTCPStateInvalidsentinel to make the contract explicit.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/conntrack/tcp.go` around lines 427 - 435, nextState currently returns TCPState zero to signal "no transition", but TCPStateNew == 0 so that sentinel collides; change nextState to return (TCPState, bool) where the bool indicates whether a transition was produced. Update the nextState signature and its use of stateTable (change the state transition function type in stateTable from func(...) TCPState to func(...) (TCPState,bool) and update all transition functions to return (state, true) or (_, false) as appropriate), then update callers such as updateState to check the bool instead of comparing newState == 0. This makes the contract explicit and avoids colliding with TCPStateNew.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@client/firewall/uspfilter/conntrack/tcp.go`:
- Around line 360-381: The updateState path currently lacks validation for
illegal TCP flag combinations, allowing crafted segments (e.g., RST|FIN,
SYN|FIN) to be processed on existing flows; add the same guard used for new
inserts by calling isValidFlagCombination(flags) at the start of
TCPTracker.updateState (before handleRst/nextState), and return early if the
flags are invalid so that all callers (including updateIfExists invoked from
TrackOutbound/TrackInbound and IsValidInbound checks) are covered; ensure you
reference/update the existing updateState, isValidFlagCombination, handleRst,
updateIfExists, TrackOutbound, TrackInbound, and IsValidInbound usages when
adding this guard.
- Around line 474-485: transFinWait1 currently matches TCPFin before TCPAck so a
peer FIN|ACK is routed to TCPStateClosing instead of TIME_WAIT; modify
transFinWait1 to check for the combined flags (flags&TCPFin != 0 && flags&TCPAck
!= 0) first and return TCPStateTimeWait (instead of TCPStateClosing), then
handle FIN-only (return TCPStateClosing) and ACK-only (return TCPStateFinWait2)
cases so FIN+ACK transitions correctly to TIME_WAIT.
---
Nitpick comments:
In `@client/firewall/uspfilter/conntrack/tcp.go`:
- Around line 427-435: nextState currently returns TCPState zero to signal "no
transition", but TCPStateNew == 0 so that sentinel collides; change nextState to
return (TCPState, bool) where the bool indicates whether a transition was
produced. Update the nextState signature and its use of stateTable (change the
state transition function type in stateTable from func(...) TCPState to
func(...) (TCPState,bool) and update all transition functions to return (state,
true) or (_, false) as appropriate), then update callers such as updateState to
check the bool instead of comparing newState == 0. This makes the contract
explicit and avoids colliding with TCPStateNew.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 12ea588a-de24-4709-ba1c-452217719991
📒 Files selected for processing (3)
client/firewall/uspfilter/conntrack/tcp.goclient/ssh/client/client.goclient/ssh/server/server.go
🚧 Files skipped from review as they are similar to previous changes (1)
- client/ssh/client/client.go
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
client/firewall/uspfilter/conntrack/tcp.go (1)
360-369:⚠️ Potential issue | 🟠 MajorValidate malformed flags before refreshing state/accounting.
TrackOutbound/TrackInboundexisting-flow paths still callUpdateLastSeenandUpdateCountersbefore rejecting illegal combinations, so spoofedSYN|RST/SYN|FINtraffic can keep a flow alive and pollute byte/packet counters even though it cannot transition state. Move the validation ahead of accounting.🛠️ Proposed fix
func (t *TCPTracker) updateState(key ConnKey, conn *TCPConnTrack, flags uint8, packetDir nftypes.Direction, size int) { - conn.UpdateLastSeen() - conn.UpdateCounters(packetDir, size) - // Reject malformed flag combinations so TrackOutbound/TrackInbound paths // (which bypass IsValidInbound) can't drive state on SYN+FIN/SYN+RST/etc. if !isValidFlagCombination(flags) { return } + conn.UpdateLastSeen() + conn.UpdateCounters(packetDir, size) + currentState := conn.GetState()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/conntrack/tcp.go` around lines 360 - 369, The updateState function currently calls conn.UpdateLastSeen() and conn.UpdateCounters(...) before checking isValidFlagCombination(flags), allowing malformed SYN|RST/SYN|FIN packets to refresh state and counters; move the isValidFlagCombination(flags) check to the very top of updateState (before any calls to conn.UpdateLastSeen or conn.UpdateCounters) and return immediately if invalid so TrackOutbound/TrackInbound paths cannot update timestamps or counters for illegal flag combinations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/firewall/uspfilter/conntrack/tcp.go`:
- Around line 189-191: The timeout fields finWaitTimeout, closeWaitTimeout and
lastAckTimeout are set directly from
envDuration(EnvTCPFinWaitTimeout/EnvTCPCloseWaitTimeout/EnvTCPLastAckTimeout)
which can return non-positive durations (e.g., 0s or -1s); change initialization
to validate the returned value and reject <= 0 by logging a warning and
retaining the default (FinWaitTimeout, CloseWaitTimeout, LastAckTimeout).
Specifically, for each symbol call envDuration(logger, EnvTCP... , Default),
check if the result <= 0, emit a warning via the same logger mentioning the env
var name, and use the corresponding default constant instead of storing the
non-positive value.
---
Duplicate comments:
In `@client/firewall/uspfilter/conntrack/tcp.go`:
- Around line 360-369: The updateState function currently calls
conn.UpdateLastSeen() and conn.UpdateCounters(...) before checking
isValidFlagCombination(flags), allowing malformed SYN|RST/SYN|FIN packets to
refresh state and counters; move the isValidFlagCombination(flags) check to the
very top of updateState (before any calls to conn.UpdateLastSeen or
conn.UpdateCounters) and return immediately if invalid so
TrackOutbound/TrackInbound paths cannot update timestamps or counters for
illegal flag combinations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ee50546c-6568-4cda-87f0-c07363c6ffc4
📒 Files selected for processing (26)
client/firewall/uspfilter/conntrack/cap_test.goclient/firewall/uspfilter/conntrack/common.goclient/firewall/uspfilter/conntrack/defaults_desktop.goclient/firewall/uspfilter/conntrack/defaults_mobile.goclient/firewall/uspfilter/conntrack/icmp.goclient/firewall/uspfilter/conntrack/tcp.goclient/firewall/uspfilter/conntrack/tcp_rst_bugs_test.goclient/firewall/uspfilter/conntrack/tcp_state_bugs_test.goclient/firewall/uspfilter/conntrack/udp.goclient/firewall/uspfilter/filter.goclient/firewall/uspfilter/forwarder/icmp.goclient/firewall/uspfilter/forwarder/tcp.goclient/firewall/uspfilter/forwarder/udp.goclient/firewall/uspfilter/log/log.goclient/firewall/uspfilter/nat.goclient/ssh/client/client.goclient/ssh/common.goclient/ssh/proxy/proxy.goclient/ssh/server/port_forwarding.goclient/ssh/server/server.goproxy/internal/tcp/peekedconn.goproxy/internal/tcp/relay.goproxy/internal/tcp/relay_test.goproxy/internal/tcp/router.goutil/netrelay/relay.goutil/netrelay/relay_test.go
💤 Files with no reviewable changes (2)
- client/ssh/common.go
- proxy/internal/tcp/relay.go
✅ Files skipped from review due to trivial changes (7)
- client/firewall/uspfilter/conntrack/defaults_mobile.go
- client/firewall/uspfilter/forwarder/icmp.go
- client/firewall/uspfilter/nat.go
- client/firewall/uspfilter/conntrack/defaults_desktop.go
- client/firewall/uspfilter/conntrack/common.go
- client/firewall/uspfilter/conntrack/cap_test.go
- client/firewall/uspfilter/conntrack/tcp_state_bugs_test.go
🚧 Files skipped from review as they are similar to previous changes (9)
- client/firewall/uspfilter/forwarder/udp.go
- client/ssh/server/port_forwarding.go
- proxy/internal/tcp/router.go
- proxy/internal/tcp/peekedconn.go
- client/firewall/uspfilter/conntrack/udp.go
- client/firewall/uspfilter/conntrack/icmp.go
- client/firewall/uspfilter/filter.go
- util/netrelay/relay_test.go
- util/netrelay/relay.go
0531e37 to
198d600
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/ssh/server/server.go`:
- Around line 906-908: The direct-tcpip handler currently calls
dialer.DialContext(ctx, "tcp", dest) which can block until the OS TCP timeout;
modify the dialing to use a short, explicit timeout so the channel handler fails
fast: create a net.Dialer with a Timeout (or wrap ctx with context.WithTimeout)
before calling DialContext for the destination (referencing net.Dialer,
dialer.DialContext, dconn, ctx, dest) and ensure the error path rejects the SSH
channel promptly when the dial times out.
In `@proxy/internal/tcp/router.go`:
- Around line 530-539: The code currently passes Route.SessionIdleTimeout into
netrelay.Options.IdleTimeout (see Route.SessionIdleTimeout,
netrelay.Options.IdleTimeout, netrelay.Relay), but netrelay enforces that
timeout per read direction which can prematurely tear down an otherwise-active
session; instead, remove wiring SessionIdleTimeout into the per-direction
IdleTimeout and implement a connection-wide watchdog: wrap the netrelay.Relay
call with a separate goroutine or context that tracks last activity across both
s2d and d2s and closes the connection when route.SessionIdleTimeout elapses with
no activity (fall back to netrelay.DefaultIdleTimeout for per-direction if
needed), or if you cannot add the watchdog now, revert to using
netrelay.DefaultIdleTimeout (not Route.SessionIdleTimeout) when constructing
netrelay.Options to avoid incorrect per-direction enforcement.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: dfdd1aee-6412-4a95-adbe-21a16b334814
📒 Files selected for processing (26)
client/firewall/uspfilter/conntrack/cap_test.goclient/firewall/uspfilter/conntrack/common.goclient/firewall/uspfilter/conntrack/defaults_desktop.goclient/firewall/uspfilter/conntrack/defaults_mobile.goclient/firewall/uspfilter/conntrack/icmp.goclient/firewall/uspfilter/conntrack/tcp.goclient/firewall/uspfilter/conntrack/tcp_rst_bugs_test.goclient/firewall/uspfilter/conntrack/tcp_state_bugs_test.goclient/firewall/uspfilter/conntrack/udp.goclient/firewall/uspfilter/filter.goclient/firewall/uspfilter/forwarder/icmp.goclient/firewall/uspfilter/forwarder/tcp.goclient/firewall/uspfilter/forwarder/udp.goclient/firewall/uspfilter/log/log.goclient/firewall/uspfilter/nat.goclient/ssh/client/client.goclient/ssh/common.goclient/ssh/proxy/proxy.goclient/ssh/server/port_forwarding.goclient/ssh/server/server.goproxy/internal/tcp/peekedconn.goproxy/internal/tcp/relay.goproxy/internal/tcp/relay_test.goproxy/internal/tcp/router.goutil/netrelay/relay.goutil/netrelay/relay_test.go
💤 Files with no reviewable changes (2)
- client/ssh/common.go
- proxy/internal/tcp/relay.go
✅ Files skipped from review due to trivial changes (10)
- client/firewall/uspfilter/conntrack/defaults_desktop.go
- client/firewall/uspfilter/conntrack/tcp_rst_bugs_test.go
- client/firewall/uspfilter/forwarder/tcp.go
- client/firewall/uspfilter/conntrack/defaults_mobile.go
- client/firewall/uspfilter/conntrack/cap_test.go
- client/firewall/uspfilter/filter.go
- client/firewall/uspfilter/conntrack/icmp.go
- util/netrelay/relay.go
- client/firewall/uspfilter/conntrack/tcp.go
- client/firewall/uspfilter/log/log.go
🚧 Files skipped from review as they are similar to previous changes (8)
- client/firewall/uspfilter/forwarder/udp.go
- client/ssh/server/port_forwarding.go
- client/firewall/uspfilter/conntrack/common.go
- client/firewall/uspfilter/conntrack/udp.go
- client/ssh/client/client.go
- proxy/internal/tcp/peekedconn.go
- client/firewall/uspfilter/nat.go
- client/firewall/uspfilter/conntrack/tcp_state_bugs_test.go
8f86a86 to
07456f5
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
client/ssh/client/client.go (1)
682-693:⚠️ Potential issue | 🟡 MinorConsider bounding the local dial per forwarded channel.
dialer.DialContext(ctx, "tcp", localAddr)only cancels when the outerRemotePortForwardcontext ends. A blackholedlocalAddrwill keep each accepted channel (plus its goroutine and SSH channel) alive until the OS TCP connect timeout (~2 min on Linux). Same concern the reviewer already raised onrelayDirectTCPIPinclient/ssh/server/server.go; adding a short dial timeout would fail fast and let the SSH peer see the error promptly.🛠️ Proposed fix
- var dialer net.Dialer - localConn, err := dialer.DialContext(ctx, "tcp", localAddr) + dialCtx, cancel := context.WithTimeout(ctx, 30*time.Second) + defer cancel() + + var dialer net.Dialer + localConn, err := dialer.DialContext(dialCtx, "tcp", localAddr) if err != nil { return }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/ssh/client/client.go` around lines 682 - 693, The DialContext call in RemotePortForward currently uses the outer ctx so a blackholed localAddr can hang the accepted channel; wrap that dial in a short per-dial timeout by creating a child context via context.WithTimeout (e.g., 5–15s) and use that for dialer.DialContext instead of ctx, ensure you call the cancel function in a defer, handle and return the dial error promptly (so the SSH channel fails fast) and only call netrelay.Relay when localConn is successfully obtained; reference dialer.DialContext, localConn, and netrelay.Relay to locate the changes.
🧹 Nitpick comments (1)
client/firewall/uspfilter/conntrack/tcp.go (1)
383-388: Optional: the "no transition" sentinel collides withTCPStateNew.
TCPStateNewequals0viaiota, which is also the value everytrans*function uses to signal "no transition" (checked at line 384 asnewState == 0). No current transition function returnsTCPStateNew, so there is no active bug, but the invariant is implicit. If someone later adds atrans*that legitimately returnsTCPStateNew, it will be silently swallowed as a no-op.Consider an explicit sentinel (e.g.
const tcpStateNoTransition TCPState = -1) or shifting the state enum soiotastarts at1, to make the "no transition" contract self-documenting and future-proof.Also applies to: 418-442
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/conntrack/tcp.go` around lines 383 - 388, The code currently treats newState==0 as "no transition", which collides with TCPStateNew (value 0); update the transition sentinel to be explicit: introduce a distinct sentinel like tcpStateNoTransition (type TCPState, e.g. -1) or make the TCPState iota start at 1, then change nextState and all trans* functions to return tcpStateNoTransition when there is no transition and update the checks in the caller (the block around nextState(currentState, ...), CompareAndSwapState, and subsequent onTransition call) to test for tcpStateNoTransition instead of 0 so legitimate transitions to TCPStateNew are not swallowed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@client/ssh/client/client.go`:
- Around line 682-693: The DialContext call in RemotePortForward currently uses
the outer ctx so a blackholed localAddr can hang the accepted channel; wrap that
dial in a short per-dial timeout by creating a child context via
context.WithTimeout (e.g., 5–15s) and use that for dialer.DialContext instead of
ctx, ensure you call the cancel function in a defer, handle and return the dial
error promptly (so the SSH channel fails fast) and only call netrelay.Relay when
localConn is successfully obtained; reference dialer.DialContext, localConn, and
netrelay.Relay to locate the changes.
---
Nitpick comments:
In `@client/firewall/uspfilter/conntrack/tcp.go`:
- Around line 383-388: The code currently treats newState==0 as "no transition",
which collides with TCPStateNew (value 0); update the transition sentinel to be
explicit: introduce a distinct sentinel like tcpStateNoTransition (type
TCPState, e.g. -1) or make the TCPState iota start at 1, then change nextState
and all trans* functions to return tcpStateNoTransition when there is no
transition and update the checks in the caller (the block around
nextState(currentState, ...), CompareAndSwapState, and subsequent onTransition
call) to test for tcpStateNoTransition instead of 0 so legitimate transitions to
TCPStateNew are not swallowed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3020ecca-53f9-4f22-a79e-fabddd016b1e
📒 Files selected for processing (26)
client/firewall/uspfilter/conntrack/cap_test.goclient/firewall/uspfilter/conntrack/common.goclient/firewall/uspfilter/conntrack/defaults_desktop.goclient/firewall/uspfilter/conntrack/defaults_mobile.goclient/firewall/uspfilter/conntrack/icmp.goclient/firewall/uspfilter/conntrack/tcp.goclient/firewall/uspfilter/conntrack/tcp_rst_bugs_test.goclient/firewall/uspfilter/conntrack/tcp_state_bugs_test.goclient/firewall/uspfilter/conntrack/udp.goclient/firewall/uspfilter/filter.goclient/firewall/uspfilter/forwarder/icmp.goclient/firewall/uspfilter/forwarder/tcp.goclient/firewall/uspfilter/forwarder/udp.goclient/firewall/uspfilter/log/log.goclient/firewall/uspfilter/nat.goclient/ssh/client/client.goclient/ssh/common.goclient/ssh/proxy/proxy.goclient/ssh/server/port_forwarding.goclient/ssh/server/server.goproxy/internal/tcp/peekedconn.goproxy/internal/tcp/relay.goproxy/internal/tcp/relay_test.goproxy/internal/tcp/router.goutil/netrelay/relay.goutil/netrelay/relay_test.go
💤 Files with no reviewable changes (2)
- client/ssh/common.go
- proxy/internal/tcp/relay.go
✅ Files skipped from review due to trivial changes (7)
- client/firewall/uspfilter/conntrack/defaults_desktop.go
- client/firewall/uspfilter/conntrack/defaults_mobile.go
- proxy/internal/tcp/relay_test.go
- client/firewall/uspfilter/nat.go
- client/firewall/uspfilter/conntrack/tcp_rst_bugs_test.go
- client/firewall/uspfilter/filter.go
- client/firewall/uspfilter/conntrack/tcp_state_bugs_test.go
🚧 Files skipped from review as they are similar to previous changes (9)
- client/ssh/server/port_forwarding.go
- client/firewall/uspfilter/forwarder/udp.go
- client/ssh/proxy/proxy.go
- client/firewall/uspfilter/conntrack/udp.go
- proxy/internal/tcp/router.go
- proxy/internal/tcp/peekedconn.go
- client/firewall/uspfilter/forwarder/tcp.go
- client/firewall/uspfilter/conntrack/icmp.go
- util/netrelay/relay.go
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
proxy/internal/tcp/router.go (1)
530-539:⚠️ Potential issue | 🟠 MajorKeep session idle timeout semantics connection-wide.
Line 537 still passes
Route.SessionIdleTimeoutintonetrelay.Options.IdleTimeout. Ifnetrelayenforces that timeout per read direction, long one-way transfers can be closed even though the TCP session is active. Use the netrelay default here until a connection-wide activity watchdog is available, or implement a watchdog that tracks activity across both directions.Verify the helper’s current idle-timeout semantics:
#!/bin/bash # Description: Inspect whether netrelay applies IdleTimeout independently per copy direction. # Expected: If deadlines/timeouts are set inside each directional copy loop, Route.SessionIdleTimeout # should not be wired directly as a session-wide timeout. sed -n '1,260p' util/netrelay/relay.go🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proxy/internal/tcp/router.go` around lines 530 - 539, The code passes Route.SessionIdleTimeout into netrelay.Options.IdleTimeout (idleTimeout variable) when calling netrelay.Relay, but netrelay enforces timeouts per copy direction so long one-way transfers may be killed; change the call site in the router (where idleTimeout is computed and passed to netrelay.Relay with svcCtx, conn, backend, netrelay.Options{}) to use netrelay.DefaultIdleTimeout instead of route.SessionIdleTimeout until a connection-wide watchdog is implemented, or alternatively implement a session activity watchdog that observes both s2d and d2s activity and resets a single session deadline before wiring a custom session-wide timeout into netrelay.Options.IdleTimeout.
🧹 Nitpick comments (1)
client/firewall/uspfilter/conntrack/cap_test.go (1)
21-27: Assert that overflow traffic is admitted after eviction.These tests prove the table stays under the cap, but they would also pass if
TrackOutboundsimply dropped every new flow once the table is full. Since the behavior under test is eviction, please also assert that an over-cap flow is present after the loop, or that a known pre-cap flow was evicted.Example strengthening for TCP/UDP
for i := 0; i < 10; i++ { tracker.TrackOutbound(src, dst, uint16(10000+i), 80, TCPSyn, 0) } + require.Contains(t, tracker.connections, ConnKey{ + SrcIP: src, DstIP: dst, SrcPort: 10009, DstPort: 80, + }, "new TCP flow should be admitted after eviction") require.LessOrEqual(t, len(tracker.connections), 4, "TCP table must not exceed the configured cap")for i := 0; i < 12; i++ { tracker.TrackOutbound(src, dst, uint16(30000+i), 53, 0) } + require.Contains(t, tracker.connections, ConnKey{ + SrcIP: src, DstIP: dst, SrcPort: 30011, DstPort: 53, + }, "new UDP flow should be admitted after eviction") require.LessOrEqual(t, len(tracker.connections), 5)Also applies to: 69-73, 87-91
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/conntrack/cap_test.go` around lines 21 - 27, The test currently only checks size cap but not that eviction actually admits new flows; update the TCP (and analogous UDP) assertions after the loop: verify that an over-cap flow inserted by TrackOutbound is present (e.g., the flow for source port uint16(10000+9) or whichever last port was used) or alternatively assert that an early pre-cap flow (e.g., port uint16(10000+0)) was evicted from tracker.connections; use the tracker.connections map and the same src/dst/port keys used by TrackOutbound to locate entries and add an assertion that confirms either the newest flow exists or the oldest was removed to prove eviction occurred.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/firewall/uspfilter/log/log.go`:
- Around line 257-276: sendVariadic truncates args to 8 which makes
formatMessage call fmt.Sprintf(format, arg1..arg8) produce %!v(MISSING) when
callers pass 9+ placeholders; fix sendVariadic so when len(args) > maxArgs it
pre-formats the entire message (using fmt.Sprintf(format, args...)) and enqueue
a single-argument logMessage (e.g., set format to "%s" and arg1 to the
preformatted string, argCount=1) so formatting is preserved, and add a
regression test for Logger.Debugf with 9+ placeholders/args to assert the output
contains the fully formatted values.
In `@util/netrelay/relay_test.go`:
- Around line 68-85: In TestRelayFullDuplex, the blocking io.ReadAll/io.ReadFull
and channel waits can hang if Relay stops propagating EOF/data; before launching
the peer goroutines and any blocking reads/writes, set finite deadlines on the
test peers (call SetDeadline/SetReadDeadline/SetWriteDeadline on peerA and peerB
sockets) so the blocking ReadAll/Write/ReadFull calls will time out instead of
hanging, and apply the same deadline setup to the other blocking section later
in the file (the block around lines 119-145) that uses peerA/peerB and
result-channel waits.
- Around line 189-202: The test currently only checks an upper-bound via
require.WithinDuration, allowing an immediate return to pass; compute elapsed :=
time.Since(start) after the select and add a lower-bound assertion that elapsed
is at least the idle timeout (150*time.Millisecond) — e.g. use
require.GreaterOrEqual(t, int64(elapsed), int64(150*time.Millisecond)) or
require.Truef(t, elapsed >= 150*time.Millisecond, ...) so the Relay(ctx, relayA,
relayB, Options{IdleTimeout: 150 * time.Millisecond}) actually waited before
closing; keep the existing upper-bound check if desired.
---
Duplicate comments:
In `@proxy/internal/tcp/router.go`:
- Around line 530-539: The code passes Route.SessionIdleTimeout into
netrelay.Options.IdleTimeout (idleTimeout variable) when calling netrelay.Relay,
but netrelay enforces timeouts per copy direction so long one-way transfers may
be killed; change the call site in the router (where idleTimeout is computed and
passed to netrelay.Relay with svcCtx, conn, backend, netrelay.Options{}) to use
netrelay.DefaultIdleTimeout instead of route.SessionIdleTimeout until a
connection-wide watchdog is implemented, or alternatively implement a session
activity watchdog that observes both s2d and d2s activity and resets a single
session deadline before wiring a custom session-wide timeout into
netrelay.Options.IdleTimeout.
---
Nitpick comments:
In `@client/firewall/uspfilter/conntrack/cap_test.go`:
- Around line 21-27: The test currently only checks size cap but not that
eviction actually admits new flows; update the TCP (and analogous UDP)
assertions after the loop: verify that an over-cap flow inserted by
TrackOutbound is present (e.g., the flow for source port uint16(10000+9) or
whichever last port was used) or alternatively assert that an early pre-cap flow
(e.g., port uint16(10000+0)) was evicted from tracker.connections; use the
tracker.connections map and the same src/dst/port keys used by TrackOutbound to
locate entries and add an assertion that confirms either the newest flow exists
or the oldest was removed to prove eviction occurred.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5c3f247f-f019-45ad-b9eb-7b4107198b01
📒 Files selected for processing (26)
client/firewall/uspfilter/conntrack/cap_test.goclient/firewall/uspfilter/conntrack/common.goclient/firewall/uspfilter/conntrack/defaults_desktop.goclient/firewall/uspfilter/conntrack/defaults_mobile.goclient/firewall/uspfilter/conntrack/icmp.goclient/firewall/uspfilter/conntrack/tcp.goclient/firewall/uspfilter/conntrack/tcp_rst_bugs_test.goclient/firewall/uspfilter/conntrack/tcp_state_bugs_test.goclient/firewall/uspfilter/conntrack/udp.goclient/firewall/uspfilter/filter.goclient/firewall/uspfilter/forwarder/icmp.goclient/firewall/uspfilter/forwarder/tcp.goclient/firewall/uspfilter/forwarder/udp.goclient/firewall/uspfilter/log/log.goclient/firewall/uspfilter/nat.goclient/ssh/client/client.goclient/ssh/common.goclient/ssh/proxy/proxy.goclient/ssh/server/port_forwarding.goclient/ssh/server/server.goproxy/internal/tcp/peekedconn.goproxy/internal/tcp/relay.goproxy/internal/tcp/relay_test.goproxy/internal/tcp/router.goutil/netrelay/relay.goutil/netrelay/relay_test.go
💤 Files with no reviewable changes (2)
- proxy/internal/tcp/relay.go
- client/ssh/common.go
✅ Files skipped from review due to trivial changes (6)
- client/firewall/uspfilter/conntrack/defaults_mobile.go
- client/firewall/uspfilter/conntrack/defaults_desktop.go
- client/firewall/uspfilter/conntrack/common.go
- client/firewall/uspfilter/conntrack/tcp_rst_bugs_test.go
- client/firewall/uspfilter/conntrack/tcp_state_bugs_test.go
- util/netrelay/relay.go
🚧 Files skipped from review as they are similar to previous changes (8)
- proxy/internal/tcp/relay_test.go
- client/ssh/proxy/proxy.go
- proxy/internal/tcp/peekedconn.go
- client/ssh/client/client.go
- client/firewall/uspfilter/conntrack/udp.go
- client/firewall/uspfilter/conntrack/icmp.go
- client/firewall/uspfilter/filter.go
- client/firewall/uspfilter/conntrack/tcp.go
- netrelay: replace per-direction read-deadline idle tracking with a single connection-wide watchdog that observes activity on both sides, so a long one-way transfer no longer trips the timeout on the quiet direction. IdleTimeout==0 remains a no-op (SSH and uspfilter forwarder call sites pass zero); only the reverse-proxy router sets one. - netrelay tests: bound blocking peer reads/writes with deadlines so a broken relay fails fast; add a lower-bound assertion on the idle-timeout test. - conntrack cap tests: assert that the newest flow is admitted and an early flow was evicted, not just that the table stayed under the cap. - ssh client RemotePortForward: bound the localAddr dial with a 10s timeout so a black-holed address can't pin the accepted channel open.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
client/firewall/uspfilter/conntrack/cap_test.go (1)
40-67: Nice targeted test for tombstone-preference eviction.Asserting the specific tombstoned key is removed (rather than just checking the size) is the right shape for this invariant, and keeping the cap at 3 with exactly one tombstone makes the outcome unambiguous regardless of sample size.
One small suggestion: also assert one of the two live pre-cap entries (e.g. port 20000 or 20002) is still present, so a regression that evicts a live entry in addition to the tombstone (or evicts the wrong one) would be caught rather than masked by the
<= 3bound.🧪 Optional strengthening
_, tombstonedStillPresent := tracker.connections[tombstonedKey] require.False(t, tombstonedStillPresent, "tombstoned entry should be evicted before live entries") require.LessOrEqual(t, len(tracker.connections), 3) + require.Contains(t, tracker.connections, + ConnKey{SrcIP: src, DstIP: dst, SrcPort: 20000, DstPort: 80}, + "live entries must not be evicted while a tombstone exists") + require.Contains(t, tracker.connections, + ConnKey{SrcIP: src, DstIP: dst, SrcPort: 20002, DstPort: 80}, + "live entries must not be evicted while a tombstone exists") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/uspfilter/conntrack/cap_test.go` around lines 40 - 67, Add an assertion in TestTCPCapPrefersTombstonedForEviction to ensure at least one of the original live entries remains after eviction: after verifying tombstonedKey was removed from tracker.connections, assert that one of the live ConnKey entries (e.g. SrcPort 20000 or 20002 with the same SrcIP/DstIP/DstPort) still exists in tracker.connections so the test fails if a live connection was evicted instead of (or in addition to) the tombstone; reference the test name TestTCPCapPrefersTombstonedForEviction, the tombstonedKey variable, and tracker.connections to locate where to add this check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@util/netrelay/relay.go`:
- Around line 108-125: The current relay goroutines call halfClose(...) on the
peer even when copyTracked returned a non-EOF error, which can leave the
opposite goroutine blocked; update both anonymous goroutines (the ones invoking
copyTracked(b, a, &lastActivity) and copyTracked(a, b, &lastActivity)) to check
the returned error: if errAToB/errBToA is non-nil and not io.EOF then call
cancel() (to terminate both sides) instead of halfClose, otherwise proceed with
halfClose when halfCloseSupported or cancel when not; use the existing symbols
copyTracked, halfClose, cancel, halfCloseSupported, a, b and lastActivity to
implement this conditional shutdown logic so non-EOF copy failures abort both
directions.
---
Nitpick comments:
In `@client/firewall/uspfilter/conntrack/cap_test.go`:
- Around line 40-67: Add an assertion in TestTCPCapPrefersTombstonedForEviction
to ensure at least one of the original live entries remains after eviction:
after verifying tombstonedKey was removed from tracker.connections, assert that
one of the live ConnKey entries (e.g. SrcPort 20000 or 20002 with the same
SrcIP/DstIP/DstPort) still exists in tracker.connections so the test fails if a
live connection was evicted instead of (or in addition to) the tombstone;
reference the test name TestTCPCapPrefersTombstonedForEviction, the
tombstonedKey variable, and tracker.connections to locate where to add this
check.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 848f1e4b-96b7-44b6-bf3f-02f49b0c02e7
📒 Files selected for processing (4)
client/firewall/uspfilter/conntrack/cap_test.goclient/ssh/client/client.goutil/netrelay/relay.goutil/netrelay/relay_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- util/netrelay/relay_test.go
- netrelay: only propagate CloseWrite on clean io.EOF; cancel both sides on any other copy error so a short write, reset, or broken pipe can't leave the opposite direction blocked. - TestTCPCapPrefersTombstonedForEviction: assert both live pre-cap entries survive, not just that the tombstone is gone, so a regression that evicts a live entry instead of the tombstone is caught.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
util/netrelay/relay.go (1)
108-126: Nit: optional DRY for the two copy goroutines.The teardown logic is duplicated across both goroutines; extracting a small closure makes the intent explicit and keeps the two branches in lockstep if the policy ever evolves (e.g., distinguishing more error classes).
♻️ Optional refactor
+ finish := func(err error, peer io.ReadWriteCloser) { + if halfCloseSupported && isCleanEOF(err) { + halfClose(peer) + return + } + cancel() + } + go func() { defer wg.Done() aToB, errAToB = copyTracked(b, a, &lastActivity) - if halfCloseSupported && isCleanEOF(errAToB) { - halfClose(b) - } else { - cancel() - } + finish(errAToB, b) }() go func() { defer wg.Done() bToA, errBToA = copyTracked(a, b, &lastActivity) - if halfCloseSupported && isCleanEOF(errBToA) { - halfClose(a) - } else { - cancel() - } + finish(errBToA, a) }()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@util/netrelay/relay.go` around lines 108 - 126, Extract the duplicated goroutine teardown into a small closure to eliminate the repeated logic: create a helper like runCopy := func(dst, src io.ReadWriter, bytesPtr *int64, errPtr *error) that calls copyTracked(dst, src, &lastActivity) (assigning to the provided bytesPtr and errPtr), then performs the existing halfCloseSupported && isCleanEOF(...) check to call halfClose(dst) or cancel(); then launch two goroutines that each defer wg.Done() and call runCopy(b, a, &aToB, &errAToB) and runCopy(a, b, &bToA, &errBToA) respectively. Ensure you keep references to wg, lastActivity, halfCloseSupported, isCleanEOF, halfClose, cancel, copyTracked, and the a/b variables so behavior is unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@util/netrelay/relay.go`:
- Around line 148-165: The watchdog loop uses tick := max(idle/2,
50*time.Millisecond) which forces a 50ms granularity and can delay detecting
very short idle timeouts; change this to use the smaller of idle/2 and 50ms
(tick := min(idle/2, 50*time.Millisecond)) so short IdleTimeouts are checked
more frequently, and add or reuse a min helper if needed; update the watchdog
function and any helper funcs referenced (watchdog, tick variable) accordingly.
---
Nitpick comments:
In `@util/netrelay/relay.go`:
- Around line 108-126: Extract the duplicated goroutine teardown into a small
closure to eliminate the repeated logic: create a helper like runCopy :=
func(dst, src io.ReadWriter, bytesPtr *int64, errPtr *error) that calls
copyTracked(dst, src, &lastActivity) (assigning to the provided bytesPtr and
errPtr), then performs the existing halfCloseSupported && isCleanEOF(...) check
to call halfClose(dst) or cancel(); then launch two goroutines that each defer
wg.Done() and call runCopy(b, a, &aToB, &errAToB) and runCopy(a, b, &bToA,
&errBToA) respectively. Ensure you keep references to wg, lastActivity,
halfCloseSupported, isCleanEOF, halfClose, cancel, copyTracked, and the a/b
variables so behavior is unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3871c822-200f-42fd-a7fb-486ee670d074
📒 Files selected for processing (2)
client/firewall/uspfilter/conntrack/cap_test.goutil/netrelay/relay.go
🚧 Files skipped from review as they are similar to previous changes (1)
- client/firewall/uspfilter/conntrack/cap_test.go
Use min(idle/2, 50ms) so very short idle timeouts (mainly in tests) are caught within one tick; the 50ms cap still keeps detection latency bounded for long idle values without needlessly frequent wakeups.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@util/netrelay/relay.go`:
- Around line 75-83: The Relay function starts a goroutine that closes a and b
on ctx cancellation but returns without waiting for that goroutine to finish,
making the “fully closed when Relay returns” guarantee racy; modify Relay to
wait for the closer goroutine to complete before returning: create a done signal
(e.g., a done channel or a sync.WaitGroup), have the closer goroutine signal
done after closing a and b, keep calling cancel() as before, and block on the
done signal just before Relay returns so the function only returns once the
closer goroutine has finished closing the endpoints (referencing Relay, ctx,
cancel, a, b and the closer goroutine).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 464882bf-95eb-41b0-8305-77986b556bbf
📒 Files selected for processing (1)
util/netrelay/relay.go
The closer goroutine ran asynchronously on ctx cancellation, so the "fully closed when Relay returns" guarantee was racy: callers could see the function return before a and b were actually Close()d. Wait on a done channel in the defer so the guarantee holds.
|



Describe your changes
Harden the userspace firewall and consolidate TCP byte-relaying.
Issue ticket number and link
Stack
Checklist
Documentation
Select exactly one:
Internal firewall and relay plumbing; no user-facing API changes. New NB_CONNTRACK_* tuning env vars are opt-in with safe defaults.
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
https://github.com/netbirdio/docs/pull/__
Summary by CodeRabbit