feat(net_report)!: extend probes for NAT detection #3448
Conversation
Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3448/docs/iroh/ Last updated: 2025-09-05T14:03:11Z
What's the motivation for this? Are there any relevant iroh bugs or discussions I've missed?
It seems related to n0-computer/iroh-n0des#37, which in turn is related to hole-punching debugging tooling/features mentioned here: https://www.notion.so/number-zero/n0des-rolling-sync-up-2545df1306fb807eb36bda59dd971e92#2545df1306fb8064a19bc8870ec9b1fb
What I find odd (admittedly without looking into this) is that mapping-varies-by-dest should already exist in iroh? And I have some recollection of hairpinning being recently removed (/cc @dignifiedquire). So I'm confused why these things are/need to be added here. And hence I'm asking for some more context.
68e5686 to be38c51
You're right. Mapping-varies-by-dest exists, but we also need mapping-varies-by-port to get the full spectrum. The reasoning behind this is to extend the doctor report and, above all, to give us more details about our NAT. This feeds into the final result, where we can use this info to generate a network fingerprint for users and also present their NAT state nicely.
I guess it's still not clear to me why we want to name a NAT type that we're not actually bothering to identify in iroh. And if this code lives in iroh it's doubly weird, because then it only exists for the public API, but we've been treating this API as internal. Now in general it's probably not bad to have a tool that can detect more NAT types. E.g. once we start trying to support some Destination Endpoint Dependent NAT types it will be useful to collect information about how the mapping varies. But that will have to be even more custom, because we'd want to find out if there are patterns in the ports and IPs chosen by the mapping. But that's quite experimental, and I'd prefer if that code could live just inside of the doctor.
The current design of the endpoint providing a report is with the explicit intention that you get exactly what iroh uses internally, so we can debug why iroh fails for some users. Adding extra information into the report coming from iroh seems the wrong approach to providing something extra, because it breaks that 1:1 mapping. So if we do want to provide extra information on top of what iroh itself uses, perhaps it should live outside of iroh?
This only extends the existing net report with 2 new fields. Nothing else really changes API wise. None of the NAT classification code is actually in here.
It is also public API, at least the report: you want users to be able to introspect their network params. Our most prominent use case for it currently is
How the mapping varies is exactly what we're capturing here. If we later need to extend with even more granular details we can follow up from here. Agree we can move some of the fancier exploration logic for port preferences etc up into doctor.
IMHO this is not such an extreme addition that it falls outside the scope of iroh. It's also very specific to the given endpoint and definitely part of the same stack.
@@ -2766,11 +2766,11 @@ dependencies = [

[[package]]
name = "matchers"
are these updates needed?
iroh/src/net_report.rs
Outdated
if let Ok(relay_addr) =
    reportgen::get_relay_addr_ipv4(&dns_resolver, &relay_node).await
{
    let port_variation = relay_addr.port() + 1;
I think assuming that this is always port + 1 is potentially problematic. We should make this somewhat configurable if we really need it.
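To make the concern concrete: `relay_addr.port()` is a `u16`, so a bare `port + 1` overflows when the relay happens to listen on port 65535 (a panic in debug builds, a wrap to 0 in release builds). A minimal sketch of a configurable variant follows. The function name `port_variation` and the `offset` parameter are hypothetical and not part of iroh; this only illustrates one way the offset could be made explicit and checked.

```rust
/// Hypothetical helper (not an iroh API): derive the secondary probe port
/// from the relay's main port and a configurable offset.
///
/// Returns `None` when `base_port + offset` does not fit in a `u16`,
/// so the caller can skip the port-variation probe instead of overflowing.
fn port_variation(base_port: u16, offset: u16) -> Option<u16> {
    base_port.checked_add(offset)
}
```

With an offset of 1, an arbitrarily chosen main port of 7000 yields `Some(7001)`, while `u16::MAX` yields `None` and the probe would simply be skipped.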
I feel pretty strongly that these probes should not be run unless explicitly triggered for the doctor. They increase the amount of traffic without any benefit to regular iroh operations.
So you are making relays listen on two ports? Why is that needed? To determine if you're behind a Destination Endpoint Dependent NAT you contact two different QAD servers and compare the results. But each QAD server only needs to listen on one socket address itself.
Not really, this is to determine if you're behind a Destination Port Dependent NAT. That means your mapped port changes depending on the destination's IP and port combination, not just its IP. This makes holepunching much harder and is used as part of the classification logic for NATs downstream.
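The distinction being discussed here can be sketched as a comparison of the public addresses reported back for the same local socket. The sketch below uses the mapping terminology from RFC 4787 (endpoint-independent, address-dependent, address and port-dependent). The `Observations` struct, its field names, and `classify` are hypothetical and are not taken from iroh or iroh-doctor; how the observations are gathered (for example via QAD) is out of scope.

```rust
use std::net::SocketAddr;

/// NAT mapping behaviour, using RFC 4787 terminology.
#[derive(Debug, PartialEq, Eq)]
enum MappingBehaviour {
    /// The same public address is used for every destination.
    EndpointIndependent,
    /// The public address changes with the destination IP only.
    AddressDependent,
    /// The public address changes with the destination IP and port.
    AddressAndPortDependent,
}

/// Hypothetical input: the public address of one local socket as observed
/// by three different destinations.
struct Observations {
    /// Observed by server A on its main port.
    server_a_main: SocketAddr,
    /// Observed by server A on a second port (same IP, different port).
    server_a_alt_port: SocketAddr,
    /// Observed by server B (different IP).
    server_b: SocketAddr,
}

/// Classify the mapping by comparing the observed public addresses.
fn classify(obs: &Observations) -> MappingBehaviour {
    if obs.server_a_main != obs.server_a_alt_port {
        // Changing only the destination port already changed the mapping.
        MappingBehaviour::AddressAndPortDependent
    } else if obs.server_a_main != obs.server_b {
        // Stable across ports on one IP, but different for another IP.
        MappingBehaviour::AddressDependent
    } else {
        MappingBehaviour::EndpointIndependent
    }
}
```

Under these assumptions, two servers on different IPs are enough to tell the first variant from the other two, while the second port on the same IP is what separates address-dependent from address and port-dependent mapping.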
I've just pushed an update that cleans it up a bit and fixes tests. Still need to feature gate the port variation probes.
I'm still struggling to understand how this makes holepunching harder in iroh's case. If peer A is behind a destination dependent NAT mapping, it is already the case that the port returned from Stun/QAD cannot be used by other peers to connect. So if both peers are behind such NATs, holepunching fails (unless we would do birthday guessing stuff, which we aren't). Or am I missing a case where this would succeed, but would fail if one of the peers were behind a port dependent NAT?
All of this functionality is being moved into n0-computer/iroh-doctor#48
Description
Added 4 new fields to net report:
- mapping_varies_by_dest_port_ipv4/ipv6: Detects if NAT assigns different public ports for different destination ports
- hairpinning_ipv4/ipv6: Detects if devices can connect to their own external address
- iroh-relay: QUIC server now automatically listens on both main port and port + 1
Breaking Changes
Everything should be backwards compatible. We do extend the API surface a little bit on the net_report.
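For readers who want a picture of the four fields named in the description, here is a minimal sketch of how they might be modelled and consumed. The field names come from the description; the struct name `NatProbeFields`, the `Option<bool>` types, and the helper method are assumptions for illustration and are not the actual iroh report type.

```rust
/// Hypothetical container for the four fields named in the description.
/// `None` is assumed to mean the probe did not run or did not complete.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
struct NatProbeFields {
    mapping_varies_by_dest_port_ipv4: Option<bool>,
    mapping_varies_by_dest_port_ipv6: Option<bool>,
    hairpinning_ipv4: Option<bool>,
    hairpinning_ipv6: Option<bool>,
}

impl NatProbeFields {
    /// Combine both address families into one answer:
    /// `None` if neither family was probed, otherwise `Some(true)` when
    /// any probed family saw the mapping vary by destination port.
    fn mapping_varies_by_dest_port(&self) -> Option<bool> {
        match (
            self.mapping_varies_by_dest_port_ipv4,
            self.mapping_varies_by_dest_port_ipv6,
        ) {
            (None, None) => None,
            (v4, v6) => Some(v4.unwrap_or(false) || v6.unwrap_or(false)),
        }
    }
}
```

Keeping each result as an `Option<bool>` lets a consumer distinguish "probe not run" from "probe ran and found nothing", which matters if the probes end up being opt-in.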
Notes & open questions
I'm unsure whether the hairpinning test should run in a loop so that we stay covered on network changes, or does a network change re-trigger the probe set completely?
Change checklist
quic-rpc
iroh-gossip
iroh-blobs
dumbpipe
sendme