Skip to content

Conversation

@jparise
Copy link
Contributor

@jparise jparise commented Oct 27, 2025

The implementation of HostName.validate was too generous. It considered
strings like ".example.com", "exa..mple.com", and "-example.com" to be
valid hostnames, which is incorrect according to RFC 1123 (the currently
accepted standard).

@alexrp alexrp requested a review from andrewrk October 27, 2025 01:56
@andrewrk andrewrk self-assigned this Oct 27, 2025
@andrewrk
Copy link
Member

Thanks, all these new unit tests are going to be really handy when porting this logic over to #25592 🙂

@nissarin
Copy link

There are domains with underscores in the wild.

; <<>> DiG 9.18.38 <<>> nomini02_but_w.lpmediastorage.com @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20951
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;nomini02_but_w.lpmediastorage.com. IN  A

;; ANSWER SECTION:
nomini02_but_w.lpmediastorage.com. 300 IN A     185.207.196.137
nomini02_but_w.lpmediastorage.com. 300 IN A     185.207.196.148

;; Query time: 12 msec
;; SERVER: 8.8.8.8#53(8.8.8.8) (UDP)
;; WHEN: Mon Oct 27 18:04:28 CET 2025
;; MSG SIZE  rcvd: 94

@jparise
Copy link
Contributor Author

jparise commented Oct 27, 2025

There are domains with underscores in the wild.

Based on my understanding of the relevant RFCs, that isn't standards-compliant and would challenge our definition of "valid" without loosening our interpretation.

Some DNS record types also use underscores by convention, like _dmarc.example.com, but we might also consider those outside the scope of a "valid" hostname for the purposes of this function.

@nissarin
Copy link

In theory only the "hostname" can't contain underscores, everything else can and this includes not only SRV or TXT but also CNAMEs, which can point to A/AAAA records later on.

In practice authoritative DNS servers (knot, nsd, and afaik windows one, whatever it's called) supports underscores, BIND hides it behind an option (at least older versions), resolvers/cache servers don't care at all. Clients support it out of the box without fuss, it's a case of "de facto standard" and going against it will break some user code.

@jparise
Copy link
Contributor Author

jparise commented Oct 28, 2025

I think if we can define and document the rules for allowing _ characters, that seems reasonable to me. For a function like this that returns a "valid" boolean response (versus a parser function that can be more nuanced), I think we just need to be clear about our definition of "valid".

The existing isValidHostName implementation also doesn't allow _ characters.

@jparise jparise changed the title std.net: make isValidHostName RFC 1123-compliant std.Io.net: make HostName.validate RFC 1123-compliant Nov 5, 2025
The implementation of HostName.validate was too generous. It considered
strings like ".example.com", "exa..mple.com", and "-example.com" to be
valid hostnames, which is incorrect according to RFC 1123 (the currently
accepted standard).
@jparise
Copy link
Contributor Author

jparise commented Nov 5, 2025

I re-implemented this as HostName.validate now that std.Io has landed. The code and tests are the same as the original version. A few notes:

  1. I'm now using HostName.max_len, which is 255. My understanding is that 255 is the maximum length of a DNS record, and 253 is the maximum hostname length after you account for the record's length byte and terminator byte. I think 253 is therefore more correct, but we could also discuss that separately because it's used in multiple places.
  2. Now that we have ValidateError, we could add some additional validation errors like LabelTooLong or InvalidCharacter, but I'm not sure if callers would benefit from that level of detail.
  3. Per the above discussion, we could also support _ characters, which are valid DNS values but not entirely valid public hostnames. To implement that support, we would replace all of the calls to std.ascii.isAlphanumeric with a local function that also includes _ in the character set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants