The Problem
When most (if not all) agents fetch a website, they prefer text/markdown content over text/html: parsing markdown uses far fewer tokens than parsing HTML, and LLMs don't actually execute any of the HTML/JS/CSS anyway (unlike search engines); they just look at the raw source being returned.
The Proposal
I think it would be great if npmx.dev supported the Accept: text/markdown header (and variants) that LLMs send when fetching content, as it would reduce bandwidth usage for everyone and improve token usage (i.e. costs) when agents fetch from npmx.dev. On the server, npmx would detect the header and serve markdown instead of HTML. It's debatable whether text/plain should act the same way; I'm not sure, and I'm not sure it matters in practice.
It's important to note that a markdown version of HTML will in almost every case be lossy compared to the original, but I think that's actually the intent here anyway.
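For the detection itself, here's a minimal sketch (my own, untested) of what the negotiation check could look like; it naively splits the Accept header into media ranges and ignores q-values for brevity:

```ts
// Minimal sketch: does the request prefer markdown?
// Naive parse: split the Accept header into media ranges,
// ignoring q-values for brevity.
export function prefersMarkdown(accept: string | undefined): boolean {
  if (!accept) return false
  return accept
    .split(',')
    .map((range) => range.split(';')[0].trim().toLowerCase())
    .includes('text/markdown')
}
```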
Implementations
I see two ways to achieve this:
- A generic solution that takes whatever HTML the server was going to return and converts it to markdown before sending the response. We could use a package like `turndown` for this. I don't have much familiarity with Nuxt.js, but in my mind this would be achievable with a middleware that modifies the response after it has gone through the whole server-side-rendering chain.
- A site-specific solution where we add markdown handlers for specific pages only, such as `/` (search) and `/package/[package]`.
The upside of option 1 is that it's "future proof", in the sense that it would "just work" as npmx adds new pages with new features.
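To make option 1 concrete, here's a rough, untested sketch assuming Nuxt 3's Nitro `render:response` hook and the `turndown` package; treat the file location and details as assumptions, not a finished implementation:

```ts
// server/plugins/markdown-negotiation.ts
// Sketch of option 1: convert fully rendered HTML to markdown when
// the client asks for text/markdown. `defineNitroPlugin` and
// `getRequestHeader` are auto-imported by Nitro.
import TurndownService from 'turndown'

export default defineNitroPlugin((nitroApp) => {
  const turndown = new TurndownService()

  // 'render:response' fires after SSR, before the response is sent.
  nitroApp.hooks.hook('render:response', (response, { event }) => {
    const accept = getRequestHeader(event, 'accept') ?? ''
    // (Or the prefersMarkdown helper sketched above.)
    if (!accept.includes('text/markdown')) return
    if (typeof response.body !== 'string') return

    // Lossy by design: turn the rendered HTML into markdown.
    response.body = turndown.turndown(response.body)
    response.headers = {
      ...response.headers,
      'content-type': 'text/markdown; charset=utf-8',
    }
  })
})
```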
The upside of option 2 is that you get to be very specific about exactly what content you put into the markdown. If we built this feature specifically for LLM usage, that would make sense, because you could tailor it to be as token-efficient as possible, e.g. drop features that aren't relevant to LLMs like headers/footers/images, and even compress it by stripping whitespace, as Vercel experimented with recently (see the sketch below). The downside of this approach is that as we add more features to npmx, we have to remember to include them in the separate markdown-rendering pipeline as well, which I can imagine would often be forgotten unless we came up with a clean architecture for this.
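As an illustration of the site-specific route, a hand-rolled, token-lean markdown view for the package page might look something like this; the `PackageInfo` shape and its field names are hypothetical:

```ts
// Sketch of option 2: a hand-written, token-lean markdown view for
// /package/[package]. Only the fields an agent is likely to need.
interface PackageInfo {
  name: string
  version: string
  description: string
  license?: string
  weeklyDownloads?: number
}

export function renderPackageMarkdown(pkg: PackageInfo): string {
  const lines = [
    `# ${pkg.name}@${pkg.version}`,
    pkg.description,
    pkg.license ? `License: ${pkg.license}` : null,
    pkg.weeklyDownloads != null
      ? `Weekly downloads: ${pkg.weeklyDownloads}`
      : null,
  ]
  // Drop empty entries and collapse runs of spaces/tabs to keep
  // token usage low (the Vercel-style compression idea).
  return lines.filter(Boolean).join('\n').replace(/[ \t]{2,}/g, ' ')
}
```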
So the main question here is: would we build this primarily for LLM usage, or would we want to support any markdown-reading usage of npmx? I don't know of any use cases besides LLMs, but I could be missing something.
Another option would be to take the generic approach and then build the LLM-specific parts into an MCP server (separate issue incoming 🔜), which would take care of the context compression, since that part is only relevant to agents anyway.
Additional Context
This is a concept the Next.js docs team discovered last year (I can't find the post on X anymore) that later spread; see e.g.:
https://xcancel.com/thdxr/status/1972421466953273392
I think this is a natural evolution of the llms.txt concept, where you have markdown content for LLMs on a separate path instead of behind a separate header. The downside of llms.txt is that, in my experience, LLMs will rarely if ever automatically use those specific paths when fetching content; they just use the regular website instead. Users essentially have to explicitly instruct their agents to use /llms.txt via system prompts, which makes it all a bit useless in my eyes. Keying off the headers is fully automatic and seems way more impactful to me.