[client] Fall back to individual IP rules when ipset is unavailable#5401
[client] Fall back to individual IP rules when ipset is unavailable#5401zerotohero wants to merge 2 commits intonetbirdio:mainfrom
Conversation
|
Tyler seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThe Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
a5d72a7 to
e4f5766
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
client/firewall/iptables/acl_linux.go (1)
87-148: Misleadinglog.Errorffires before the fallback warning on a system without ip_setOn the very first
AddPeerFilteringcall when the ip_set module is absent:
flushIPSet(line 119) issues a netlink call → kernel returns an error that is notipset.ErrSetNotExist→log.Errorf("flush ipset %s before use: %v", ...)fires (line 123).createIPSetthen fails →log.Warnf("ipset not supported, falling back to individual IP rules: …")fires (line 127).Operators see an ERROR before seeing that the system is handling the situation gracefully, which is confusing. A lightweight fix is to suppress the flush-error log to Debug level when ipset turns out to be unsupported, or to skip the flush entirely when
!m.ipsetSupportedhas just been detected.💡 Suggested approach — demote flush error to Debug on ipset failure
if err := m.flushIPSet(ipsetName); err != nil { if errors.Is(err, ipset.ErrSetNotExist) { log.Debugf("flush ipset %s before use: %v", ipsetName, err) } else { - log.Errorf("flush ipset %s before use: %v", ipsetName, err) + log.Debugf("flush ipset %s before use: %v", ipsetName, err) } } if err := m.createIPSet(ipsetName); err != nil { log.Warnf("ipset not supported, falling back to individual IP rules: %v", err)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/iptables/acl_linux.go` around lines 87 - 148, The flush error log is firing as ERROR before we detect ipset is unsupported; change the noisy log.Errorf in the flushIPSet error path to a debug-level log (use log.Debugf) so the subsequent ipset-not-supported fallback (createIPSet failing and log.Warnf) isn't preceded by a misleading error; update the branch that currently does log.Errorf("flush ipset %s before use: %v", ipsetName, err) to log.Debugf while leaving the createIPSet failure handling and m.ipsetSupported=false fallback unchanged (symbols: flushIPSet, createIPSet, m.ipsetSupported, log.Errorf -> log.Debugf).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/firewall/nftables/manager_linux.go`:
- Around line 97-99: When m.initNoTrackChains fails, it currently leaves
m.notrackOutputChain and m.notrackPreroutingChain non-nil even though Flush()
may have failed; update the code so stale chain pointers are cleared on error
(either by setting m.notrackOutputChain = nil and m.notrackPreroutingChain = nil
when initNoTrackChains returns an error in Init, or by deferring a rollback
inside initNoTrackChains to nil-out those fields if Flush fails) so that
SetupEBPFProxyNoTrack's nil guard
(m.notrackOutputChain/m.notrackPreroutingChain) behaves correctly and prevents
later opaque flush errors.
---
Nitpick comments:
In `@client/firewall/iptables/acl_linux.go`:
- Around line 87-148: The flush error log is firing as ERROR before we detect
ipset is unsupported; change the noisy log.Errorf in the flushIPSet error path
to a debug-level log (use log.Debugf) so the subsequent ipset-not-supported
fallback (createIPSet failing and log.Warnf) isn't preceded by a misleading
error; update the branch that currently does log.Errorf("flush ipset %s before
use: %v", ipsetName, err) to log.Debugf while leaving the createIPSet failure
handling and m.ipsetSupported=false fallback unchanged (symbols: flushIPSet,
createIPSet, m.ipsetSupported, log.Errorf -> log.Debugf).
There was a problem hiding this comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@client/firewall/nftables/manager_linux.go`:
- Around line 97-99: When initNoTrackChains fails during Init,
m.notrackOutputChain and m.notrackPreroutingChain can remain non-nil even though
their chains were never flushed to the kernel; update Init (or the
initNoTrackChains error handling path) to explicitly set m.notrackOutputChain =
nil and m.notrackPreroutingChain = nil when initNoTrackChains returns an error
so the nil guard in SetupEBPFProxyNoTrack still works; ensure this cleanup
happens immediately after the failing Flush() or returned error from
initNoTrackChains (references: initNoTrackChains, Init, SetupEBPFProxyNoTrack,
m.notrackOutputChain, m.notrackPreroutingChain, AddChain, Flush).
e4f5766 to
255cf43
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/firewall/iptables/manager_linux.go`:
- Around line 86-88: The Close() method can return a spurious error when the raw
table is absent because initNoTrackChain() failures are swallowed; add a boolean
field (e.g., notrackInitialized) to Manager, set notrackInitialized = true only
after initNoTrackChain() succeeds in Init, and then guard the call to
cleanupNoTrackChain() in Close() so it only runs when notrackInitialized is true
(ensure you reference Manager.notrackInitialized, Init, initNoTrackChain, Close,
and cleanupNoTrackChain when making the change).
---
Duplicate comments:
In `@client/firewall/nftables/manager_linux.go`:
- Around line 97-101: Init may leave stale chain pointers set if
m.initNoTrackChains populates m.notrackOutputChain and m.notrackPreroutingChain
before a subsequent Flush()/error; ensure Init nils both m.notrackOutputChain
and m.notrackPreroutingChain on initNoTrackChains failure (as shown in the diff)
so SetupEBPFProxyNoTrack's nil-guard will block use of uncommitted chains;
confirm refreshNoTrackChains continues to populate chains only from committed
kernel state and that AddChain/Flush error paths do not leave live pointers.
Additional context: Nvidia Jetson JetPack kernel module availabilityWe discovered this issue on Nvidia Jetson Orin AGX devices running the Tegra kernel. The kernel module availability varies across JetPack releases:
On R36.4.3, both modules are absent, which means:
On R36.4.4, both modules are present, so the existing code works fine. This reinforces that the graceful fallback approach is the right fix, since kernel module availability can vary across otherwise similar systems. |
255cf43 to
d48af6e
Compare
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
client/firewall/iptables/acl_linux.go (2)
119-121:flushIPSeterrors silently downgraded to debug regardless of cause.Previously non-
ErrSetNotExistflush errors (e.g. permission denied, kernel error) were logged atErrorlevel. Now all failures areDebug. SincecreateIPSetfollows immediately and would also fail for the same underlying non-module-missing reasons, the Warning there covers the overall failure. The practical impact is one fewerError-level log entry for non-transient flush problems, which is a minor observability trade-off but functionally acceptable.♻️ Restore differentiated log level if desired
- if err := m.flushIPSet(ipsetName); err != nil { - log.Debugf("flush ipset %s before use: %v", ipsetName, err) - } + if err := m.flushIPSet(ipsetName); err != nil && !errors.Is(err, ipset.ErrSetNotExist) { + log.Warnf("flush ipset %s before use: %v", ipsetName, err) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/iptables/acl_linux.go` around lines 119 - 121, The flushIPSet call currently logs all errors at Debug level which hides real failures; update the error handling around m.flushIPSet(ipsetName) to inspect the returned error and only log as Debug when errors.Is(err, ipset.ErrSetNotExist) (or the equivalent sentinel for "set not exist"), otherwise log as Errorf including ipsetName and err; keep the existing flow into createIPSet unchanged so that non-transient flush failures are surfaced as Error while the expected "not exist" case remains Debug.
122-143: Duplicated spec-building block in fallback path — optional refactor.Lines 128–135 duplicate the spec construction from lines 92–101 verbatim. Consider extracting a small helper:
♻️ Suggested refactor
+func buildRuleSpecs(ip net.IP, protocol string, sPort, dPort *firewall.Port, action firewall.Action, ipsetName string, wgIfaceName string) (specs, mangleSpecs []string) { + specs = filterRuleSpecs(ip, protocol, sPort, dPort, action, ipsetName) + mangleSpecs = slices.Clone(specs) + mangleSpecs = append(mangleSpecs, + "-i", wgIfaceName, + "-m", "addrtype", "--dst-type", "LOCAL", + "-j", "MARK", "--set-xmark", fmt.Sprintf("%#x", nbnet.PreroutingFwmarkRedirected), + ) + specs = append(specs, "-j", actionToStr(action)) + return +}Then in
AddPeerFiltering:- specs := filterRuleSpecs(ip, string(protocol), sPort, dPort, action, ipsetName) - - mangleSpecs := slices.Clone(specs) - mangleSpecs = append(mangleSpecs, - "-i", m.wgIface.Name(), - "-m", "addrtype", "--dst-type", "LOCAL", - "-j", "MARK", "--set-xmark", fmt.Sprintf("%#x", nbnet.PreroutingFwmarkRedirected), - ) - - specs = append(specs, "-j", actionToStr(action)) + specs, mangleSpecs := buildRuleSpecs(ip, string(protocol), sPort, dPort, action, ipsetName, m.wgIface.Name()) if ipsetName != "" { ... if err := m.createIPSet(ipsetName); err != nil { log.Warnf(...) m.ipsetSupported = false ipsetName = "" - // Regenerate specs without ipset - specs = filterRuleSpecs(ip, string(protocol), sPort, dPort, action, "") - mangleSpecs = slices.Clone(specs) - mangleSpecs = append(mangleSpecs, ...) - specs = append(specs, "-j", actionToStr(action)) + specs, mangleSpecs = buildRuleSpecs(ip, string(protocol), sPort, dPort, action, "", m.wgIface.Name()) } else {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/iptables/acl_linux.go` around lines 122 - 143, The code duplicates spec construction when ipset creation fails; extract the duplicated logic into a helper (e.g., buildPeerSpecs or assembleSpecs) and replace the repeated blocks in the AddPeerFiltering flow: move the calls that compute specs = filterRuleSpecs(...), mangleSpecs = slices.Clone(specs), append the mangle mark lines, and append "-j", actionToStr(action) into that helper which is invoked both in the ipset-fallback branch (after setting m.ipsetSupported=false and ipsetName="") and in the normal path before adding to ipset; update references to mangleSpecs and specs accordingly and keep createIPSet, addToIPSet, ipsetSupported, filterRuleSpecs, and actionToStr names to locate the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/firewall/iptables/acl_linux.go`:
- Line 39: The field ipsetSupported on aclManager is set to false on first
failure and never re-probed, which can permanently force per-peer -s rules after
a transient error; update the code to re-check or refresh ipset support instead
of permanently disabling it — e.g. add a reprobe path or a method
(RefreshIPSetSupport / probeIPSet or call new check in ensureIPSetSupported /
wherever m.ipsetSupported is set) that retries detecting ip_set on subsequent
operations or clears the flag on recoverable errors so systems that briefly
failed can revert to using ipset.
---
Nitpick comments:
In `@client/firewall/iptables/acl_linux.go`:
- Around line 119-121: The flushIPSet call currently logs all errors at Debug
level which hides real failures; update the error handling around
m.flushIPSet(ipsetName) to inspect the returned error and only log as Debug when
errors.Is(err, ipset.ErrSetNotExist) (or the equivalent sentinel for "set not
exist"), otherwise log as Errorf including ipsetName and err; keep the existing
flow into createIPSet unchanged so that non-transient flush failures are
surfaced as Error while the expected "not exist" case remains Debug.
- Around line 122-143: The code duplicates spec construction when ipset creation
fails; extract the duplicated logic into a helper (e.g., buildPeerSpecs or
assembleSpecs) and replace the repeated blocks in the AddPeerFiltering flow:
move the calls that compute specs = filterRuleSpecs(...), mangleSpecs =
slices.Clone(specs), append the mangle mark lines, and append "-j",
actionToStr(action) into that helper which is invoked both in the ipset-fallback
branch (after setting m.ipsetSupported=false and ipsetName="") and in the normal
path before adding to ipset; update references to mangleSpecs and specs
accordingly and keep createIPSet, addToIPSet, ipsetSupported, filterRuleSpecs,
and actionToStr names to locate the code.
Test resultsTested on two Nvidia Jetson Orin AGX systems with different JetPack releases: jetson1 — R36.4.3 (missing
|
| Check | jetson-dev-7 (no modules) | xap025 (has modules) |
|---|---|---|
| Traffic flows | Yes | Yes |
| notrack warning logged | Yes (expected) | No (expected) |
| notrack chain created | No (expected) | Yes (expected) |
| Clean shutdown (no spurious errors) | Yes | Yes |
Note: Both Jetsons use userspace WireGuard bind, so the ipset fallback path is not exercised (the AllowNetbird() accept-all rule handles traffic). The ipset fallback remains relevant for non-userspace-bind systems that lack ip_set.
On systems where the ip_set kernel module is not available (e.g. Nvidia Jetson with Tegra kernels), ipset creation fails which causes AddPeerFiltering to return an error. The ACL manager's applyPeerACLs then calls rollBack(), deleting all rules from NETBIRD-ACL-INPUT and leaving it empty. Since the INPUT chain has a DROP catch-all for the WireGuard interface, all tunnel traffic is blocked. This change: - Adds ipset fallback: on first ipset creation failure, sets ipsetSupported=false and regenerates rule specs using individual IP matching (-s ip) instead of ipset matching (-m set --set). Subsequent calls skip ipset entirely. - Makes initNoTrackChain (iptables) and initNoTrackChains (nftables) non-fatal. Missing iptable_raw/nft raw table only affects eBPF proxy NOTRACK rules, not core firewall functionality. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
d48af6e to
3b8cb7e
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
client/firewall/iptables/acl_linux.go (2)
126-148: Consider adding a test for the ipset-fallback path.The existing
TestIptablesManagerIPSetcovers only the happy path. A unit test that forcescreateIPSetto fail (e.g. by injecting a stub or running the existing harness with an artificially invalid ipset name) would lock in the two key invariants of this change: (a)AddPeerFilteringreturns a rule withipsetName == ""and-s <ip>specs, and (b) subsequent peers on the sameaclManagercontinue to use individual-IP rules (i.e.m.ipsetSupportedstays false). This would catch regressions if the fallback logic is refactored later.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/iptables/acl_linux.go` around lines 126 - 148, Add a unit test that exercises the ipset-fallback path by forcing createIPSet to fail (e.g. inject a stub/fake or run the test harness with an invalid ipset name) and then call AddPeerFiltering; assert the returned rule uses ipsetName == "" and contains a "-s <ip>" source-spec rather than an ipset reference, and then add a second peer on the same acl manager and assert m.ipsetSupported is false (i.e. subsequent peers continue to use individual-IP rules). Target the test near TestIptablesManagerIPSet and reference createIPSet, AddPeerFiltering, and the m.ipsetSupported field when locating code to modify.
126-147: Reduce duplication in the fallback branch.Lines 131-139 recompute the exact spec sequence already built at lines 96-105. Extracting a small helper (or restructuring so the specs are built once after
ipsetNameis finalized) keeps the two paths in lockstep — otherwise a future change to the mangle/filter spec shape has to be mirrored in two places, which is an easy spot to introduce drift (e.g. missing an-iflag or a mark value in only one branch).♻️ Example restructure
- ipsetName = transformIPsetName(ipsetName, sPort, dPort, action) - specs := filterRuleSpecs(ip, string(protocol), sPort, dPort, action, ipsetName) - - mangleSpecs := slices.Clone(specs) - mangleSpecs = append(mangleSpecs, - "-i", m.wgIface.Name(), - "-m", "addrtype", "--dst-type", "LOCAL", - "-j", "MARK", "--set-xmark", fmt.Sprintf("%#x", nbnet.PreroutingFwmarkRedirected), - ) - - specs = append(specs, "-j", actionToStr(action)) + ipsetName = transformIPsetName(ipsetName, sPort, dPort, action) + specs, mangleSpecs := m.buildRuleSpecs(ip, protocol, sPort, dPort, action, ipsetName) if ipsetName != "" { if ipList, ipsetExists := m.ipsetStore.ipset(ipsetName); ipsetExists { ... } if err := m.flushIPSet(ipsetName); err != nil { log.Debugf("flush ipset %s before use: %v", ipsetName, err) } if err := m.createIPSet(ipsetName); err != nil { log.Warnf("ipset not supported, falling back to individual IP rules: %v", err) m.ipsetSupported = false ipsetName = "" - - // Regenerate specs without ipset - specs = filterRuleSpecs(ip, string(protocol), sPort, dPort, action, "") - mangleSpecs = slices.Clone(specs) - mangleSpecs = append(mangleSpecs, - "-i", m.wgIface.Name(), - "-m", "addrtype", "--dst-type", "LOCAL", - "-j", "MARK", "--set-xmark", fmt.Sprintf("%#x", nbnet.PreroutingFwmarkRedirected), - ) - specs = append(specs, "-j", actionToStr(action)) + specs, mangleSpecs = m.buildRuleSpecs(ip, protocol, sPort, dPort, action, "") } else { ... } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/firewall/iptables/acl_linux.go` around lines 126 - 147, The code duplicates building of firewall rule specs in the ipset fallback branch; instead, determine ipsetName (attempt m.createIPSet and set m.ipsetSupported and ipsetName accordingly), then build the filter/mangle specs once using filterRuleSpecs and the common append operations (including "-i", m.wgIface.Name(), "-m", "addrtype", "--dst-type", "LOCAL", "-j", "MARK", "--set-xmark", fmt.Sprintf("%#x", nbnet.PreroutingFwmarkRedirected)) and the final "-j", actionToStr(action); if ipset creation succeeded keep the m.addToIPSet/ipsetStore.addIpList logic, otherwise skip adding to ipset but reuse the same single spec construction path. Optionally extract a helper like buildSpecs(ip, protocol, sPort, dPort, action, ipsetName) to centralize spec assembly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@client/firewall/iptables/acl_linux.go`:
- Around line 126-148: Add a unit test that exercises the ipset-fallback path by
forcing createIPSet to fail (e.g. inject a stub/fake or run the test harness
with an invalid ipset name) and then call AddPeerFiltering; assert the returned
rule uses ipsetName == "" and contains a "-s <ip>" source-spec rather than an
ipset reference, and then add a second peer on the same acl manager and assert
m.ipsetSupported is false (i.e. subsequent peers continue to use individual-IP
rules). Target the test near TestIptablesManagerIPSet and reference createIPSet,
AddPeerFiltering, and the m.ipsetSupported field when locating code to modify.
- Around line 126-147: The code duplicates building of firewall rule specs in
the ipset fallback branch; instead, determine ipsetName (attempt m.createIPSet
and set m.ipsetSupported and ipsetName accordingly), then build the
filter/mangle specs once using filterRuleSpecs and the common append operations
(including "-i", m.wgIface.Name(), "-m", "addrtype", "--dst-type", "LOCAL",
"-j", "MARK", "--set-xmark", fmt.Sprintf("%#x",
nbnet.PreroutingFwmarkRedirected)) and the final "-j", actionToStr(action); if
ipset creation succeeded keep the m.addToIPSet/ipsetStore.addIpList logic,
otherwise skip adding to ipset but reuse the same single spec construction path.
Optionally extract a helper like buildSpecs(ip, protocol, sPort, dPort, action,
ipsetName) to centralize spec assembly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cb30c2ca-5d79-45be-92a8-472781dcca79
📒 Files selected for processing (1)
client/firewall/iptables/acl_linux.go
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|



Describe your changes
On systems where the
ip_setkernel module is not available (e.g. Nvidia Jetson with Tegra kernels), ipset creation fails which causesAddPeerFilteringto return an error. The ACL manager'sapplyPeerACLsthen callsrollBack(), deleting all rules fromNETBIRD-ACL-INPUTand leaving it empty. Since the INPUT chain has a DROP catch-all for the WireGuard interface, all tunnel traffic is blocked.Root cause
ip_setkernel module missing →createIPSet()fails via netlinkAddPeerFilteringreturns error →applyPeerACLscallsrollBack()rollBack()deletes all newly added rules →NETBIRD-ACL-INPUTleft emptyESTABLISHED/RELATED → ACCEPT,jump NETBIRD-ACL-INPUT(empty),DROP -i wt0Changes
client/firewall/iptables/acl_linux.goipsetSupportedfield; on firstcreateIPSetfailure, fall back to individual-s iprulesIssue ticket number and link
No existing issue — discovered on Nvidia Jetson Orin AGX devices (Tegra kernel 5.15.148) where
ip_setandiptable_rawkernel modules are absent.Stack
Checklist
Documentation
Select exactly one:
This is an internal resilience fix — graceful fallback when the
ip_setkernel module is unavailable. No user-facing configuration or behavior changes that require documentation.Test plan
ip_setmodule: ipset behavior unchanged, rules use-m set --setas beforeip_setmodule (e.g. Nvidia Jetson Tegra kernel):NETBIRD-ACL-INPUTpopulated with individual-s <ip>ACCEPT rules instead of empty chain, and inbound tunnel traffic is no longer droppedSummary by CodeRabbit
Bug Fixes