Skip to content

Conversation

Arqu
Copy link
Collaborator

@Arqu Arqu commented Aug 31, 2025

Description

Added 4 new fields to net report:

  • mapping_varies_by_dest_port_ipv4/ipv6 - Detects if NAT assigns different public ports for different destination ports
  • hairpinning_ipv4/ipv6 - Detects if devices can connect to their own external address

iroh-relay QUIC server now automatically listens on both main port and port + 1

Breaking Changes

Everything should be backwards compatible. We do extend the API surface a little bit on the net_report.

Notes & open questions

Im unsure if the hairpinning test should run in a loop in case we need to keep ourselves covered on network change or does that re-trigger the probe set completely?

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.
    • List all breaking changes in the above "Breaking Changes" section.
    • Open an issue or PR on any number0 repos that are affected by this breaking change. Give guidance on how the updates should be handled or do the actual updates themselves. The major ones are:

@Arqu Arqu requested review from ramfox and flub August 31, 2025 22:53
@Arqu Arqu self-assigned this Aug 31, 2025
@Arqu Arqu added this to iroh Aug 31, 2025
@github-project-automation github-project-automation bot moved this to 🏗 In progress in iroh Aug 31, 2025
@Arqu Arqu moved this from 🏗 In progress to 👀 In review in iroh Aug 31, 2025
Copy link

github-actions bot commented Aug 31, 2025

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3448/docs/iroh/

Last updated: 2025-09-05T14:03:11Z

@Arqu Arqu changed the title feat(net_report): extend probes for nat detection feat(net_report): extend probes for NAT detection Aug 31, 2025
@Arqu Arqu changed the title feat(net_report): extend probes for NAT detection feat(net_report)!: extend probes for NAT detection Aug 31, 2025
@flub
Copy link
Contributor

flub commented Sep 1, 2025

What's the motivation for this? Are their any relevant iroh bugs or discussions I've missed?

@matheus23
Copy link
Member

It seems related to n0-computer/iroh-n0des#37, which in turn is related to hole-punching debugging tooling/features mentioned here: https://www.notion.so/number-zero/n0des-rolling-sync-up-2545df1306fb807eb36bda59dd971e92#2545df1306fb8064a19bc8870ec9b1fb

@flub
Copy link
Contributor

flub commented Sep 1, 2025

What I find odd (admittedly without looking into this) is that mapping-varies-by-dest should already exist in iroh? And I have some recollection of hairpinning being recently removed (/cc @dignifiedquire ). So I'm confused why these things are/need to be added here. And hence I'm asking for some more context.

Copy link

github-actions bot commented Sep 2, 2025

Netsim report & logs for this PR have been generated and is available at: LOGS
This report will remain available for 3 days.

Last updated for commit: f02e3a3

@Arqu
Copy link
Collaborator Author

Arqu commented Sep 2, 2025

What I find odd (admittedly without looking into this) is that mapping-varies-by-dest should already exist in iroh? And I have some recollection of hairpinning being recently removed (/cc @dignifiedquire ). So I'm confused why these things are/need to be added here. And hence I'm asking for some more context.

You're right. Mapping-varies-by-dest exists, but we also need mapping-varies-by-port to get the full spectrum.
You're also right about us removing hairpinning which sent me into a nice rabbit hole of what is required or not for different use cases. Pretty sure we don't need hairpinning detection. It either works or you should be using local discovery anyways so that is no longer really important.

The reasoning behind this is to extend the doctor report with some more details and above all to provide us with some more details about our NAT. This feeds into the final result where we can use this info to generate a network fingerprint for users and also just nicely present their NAT state.

@Arqu Arqu requested a review from matheus23 September 2, 2025 12:41
@flub
Copy link
Contributor

flub commented Sep 2, 2025

What I find odd (admittedly without looking into this) is that mapping-varies-by-dest should already exist in iroh? And I have some recollection of hairpinning being recently removed (/cc @dignifiedquire ). So I'm confused why these things are/need to be added here. And hence I'm asking for some more context.

You're right. Mapping-varies-by-dest exists, but we also need mapping-varies-by-port to get the full spectrum.

I guess it's still not clear to me why we want to name a NAT type that we're not actually bothering identifying in iroh. And if this code lives in iroh it's doubly weird because then it only exists for the public API, but we've been treating this API as internal.

Now in general it's probably not bad to have a tool that can detect more NAT types. E.g. once we start trying to support some Destination Endpoint Dependent NAT types it will be useful to collect information about how the mapping varies. But that will have to be even more custom because we'd want to find out if there are patterns in ports and IPs chosen by the mapping. But that's like quite experimental, and I'd prefer if that code could live just inside of the doctor.

The reasoning behind this is to extend the doctor report with some more details and above all to provide us with some more details about our NAT. This feeds into the final result where we can use this info to generate a network fingerprint for users and also just nicely present their NAT state.

The current design of the endpoint providing a report is with the explicit intention that you get exactly what iroh uses internally, so we can debug why iroh fails for some users. Adding extra information into the report coming from iroh seems the wrong approach to providing something extra, because it breaks that 1:1 mapping. So if we do want to provide extra information on top of what iroh itself uses, perhaps it should live outside of iroh?

@Arqu
Copy link
Collaborator Author

Arqu commented Sep 3, 2025

What I find odd (admittedly without looking into this) is that mapping-varies-by-dest should already exist in iroh? And I have some recollection of hairpinning being recently removed (/cc @dignifiedquire ). So I'm confused why these things are/need to be added here. And hence I'm asking for some more context.

You're right. Mapping-varies-by-dest exists, but we also need mapping-varies-by-port to get the full spectrum.

I guess it's still not clear to me why we want to name a NAT type that we're not actually bothering identifying in iroh. And if this code lives in iroh it's doubly weird because then it only exists for the public API, but we've been treating this API as internal.

This only extends the existing net report with 2 new fields. Nothing else really changes API wise. None of the NAT classification code is actually in here.
The changes in the PR are:

  • added mapping_varies_by_port_[ipv4/ipv6] to the report
  • extended probing so that we can actually capture that info
  • extended relay to accept probes on another port for it.

It also is public API, at least the report, you want users to be able to introspect their network params, our most prominant use case for it currently is iroh-doctor report

Now in general it's probably not bad to have a tool that can detect more NAT types. E.g. once we start trying to support some Destination Endpoint Dependent NAT types it will be useful to collect information about how the mapping varies. But that will have to be even more custom because we'd want to find out if there are patterns in ports and IPs chosen by the mapping. But that's like quite experimental, and I'd prefer if that code could live just inside of the doctor.

How the mapping varies is exactly what we're capturing here. If we later need to extend with even more granular details we can follow up from here. Agree we can move some of the fancier exploration logic for port preferences etc up into doctor.

The reasoning behind this is to extend the doctor report with some more details and above all to provide us with some more details about our NAT. This feeds into the final result where we can use this info to generate a network fingerprint for users and also just nicely present their NAT state.

The current design of the endpoint providing a report is with the explicit intention that you get exactly what iroh uses internally, so we can debug why iroh fails for some users. Adding extra information into the report coming from iroh seems the wrong approach to providing something extra, because it breaks that 1:1 mapping. So if we do want to provide extra information on top of what iroh itself uses, perhaps it should live outside of iroh?

IMHO this is not that of an extreme addition that's outside of the scope of iroh. It's also very specific to the given endpoint and definitely part of the same stack.

@@ -2766,11 +2766,11 @@ dependencies = [

[[package]]
name = "matchers"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these updates needed?

if let Ok(relay_addr) =
reportgen::get_relay_addr_ipv4(&dns_resolver, &relay_node).await
{
let port_variation = relay_addr.port() + 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think assuming that this is always port + 1 is potentially problematic, I think we should make this somewhat configurable if we really need this

@dignifiedquire
Copy link
Contributor

I feel pretty strongly that these probes should not be run, unless explicitly triggered for the doctor, this increases the amount of traffic without any benefit on regular iroh operations

@flub
Copy link
Contributor

flub commented Sep 5, 2025

* extended relay to accept probes on another port for it.

So you are making relays listen on two ports? Why is that needed? To determine if you're behind a Destination Endpoint Dependent NAT you contact two different QAD servers and compare the results. But each QAD server only needs to listen on one socket address itself.

@Arqu
Copy link
Collaborator Author

Arqu commented Sep 5, 2025

* extended relay to accept probes on another port for it.

So you are making relays listen on two ports? Why is that needed? To determine if you're behind a Destination Endpoint Dependent NAT you contact two different QAD servers and compare the results. But each QAD server only needs to listen on one socket address itself.

Not really, this is to determine if you're behind a Destination Port Dependant NAT. Which is about the fact that your port changes depending on their ip and port combination not just ip. Which makes holepunching much harder and is used as part of the classification logic for NATs downstream.

@Arqu
Copy link
Collaborator Author

Arqu commented Sep 5, 2025

I've just pushed an update that cleans it up a bit and fixes tests. Still need to feature gate the port variation probes.

@Frando
Copy link
Member

Frando commented Sep 5, 2025

Which is about the fact that your port changes depending on their ip and port combination not just ip. Which makes holepunching much harder and is used as part of the classification logic for NATs downstream.

I'm still struggling to understand how this makes holepunching harder in iroh's case. If peer A is behind a destination dependent NAT mapping, it is already the case that the port returned from Stun/QAD cannot be used by other peers to connect. So if both peers are behind such NATs, holepunching fails (unless we would do birthday guessing stuff, which we aren't). Or am I missing a case where this would succeed, but would fail if one of the peers were behind a port dependent NAT?

@Arqu
Copy link
Collaborator Author

Arqu commented Sep 11, 2025

All of this functionality is being moved into n0-computer/iroh-doctor#48

@Arqu Arqu closed this Sep 11, 2025
@github-project-automation github-project-automation bot moved this from 👀 In review to ✅ Done in iroh Sep 11, 2025
@Arqu Arqu deleted the arqu/nat_probes branch September 11, 2025 12:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

5 participants