The Problem
When most (if not all) agents fetch a website, they prefer text/markdown content over text/html: parsing markdown uses far fewer tokens than parsing HTML, and LLMs don't actually execute any of the HTML/JS/CSS anyway (unlike search engines); they just look at the raw source being returned.
The Proposal
I think it would be great if npmx.dev supported the Accept: text/markdown header (and variants) that LLMs send when fetching content, as it would reduce bandwidth usage for everyone and improve token usage (i.e. costs) when agents fetch from npmx.dev. On the server, npmx would detect the header and serve markdown instead of HTML. It's debatable whether text/plain should act the same way; I'm not sure, and I'm not sure it matters in practice.
It's important to note that a markdown version of HTML will in almost every case be lossy compared to the original, but I think that's actually the intent here anyway.
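For the detection itself, here's a minimal sketch (my own, untested) of what the negotiation check could look like; it naively splits the Accept header into media ranges and ignores q-values for brevity:

```ts
// Minimal sketch: does the request prefer markdown?
// Naive parse: split the Accept header into media ranges,
// ignoring q-values for brevity.
export function prefersMarkdown(accept: string | undefined): boolean {
  if (!accept) return false
  return accept
    .split(',')
    .map((range) => range.split(';')[0].trim().toLowerCase())
    .includes('text/markdown')
}
```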
Implementations
I see two ways to achieve this:
- A generic solution that takes whatever HTML the server was going to return and converts it to markdown before sending the response. We could use a package like `turndown` for this. I don't have much familiarity with Nuxt.js, but in my mind this would be achievable with a middleware that modifies the response after it has gone through the whole server-side-rendering chain.
- A site-specific solution where we add markdown handlers for specific pages only, such as `/` (search) and `/package/[package]`.
The upside of option 1 is that it's "future proof", in the sense that it would "just work" as npmx adds new pages with new features.
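To make option 1 concrete, here's a rough, untested sketch assuming Nuxt 3's Nitro `render:response` hook and the `turndown` package; treat the file location and details as assumptions, not a finished implementation:

```ts
// server/plugins/markdown-negotiation.ts
// Sketch of option 1: convert fully rendered HTML to markdown when
// the client asks for text/markdown. `defineNitroPlugin` and
// `getRequestHeader` are auto-imported by Nitro.
import TurndownService from 'turndown'

export default defineNitroPlugin((nitroApp) => {
  const turndown = new TurndownService()

  // 'render:response' fires after SSR, before the response is sent.
  nitroApp.hooks.hook('render:response', (response, { event }) => {
    const accept = getRequestHeader(event, 'accept') ?? ''
    // (Or the prefersMarkdown helper sketched above.)
    if (!accept.includes('text/markdown')) return
    if (typeof response.body !== 'string') return

    // Lossy by design: turn the rendered HTML into markdown.
    response.body = turndown.turndown(response.body)
    response.headers = {
      ...response.headers,
      'content-type': 'text/markdown; charset=utf-8',
    }
  })
})
```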
The upside of option 2 is that you get to be very specific about exactly what content you put into the markdown. If we built this feature specifically for LLM usage, that would make sense, because you could tailor it to be as token-efficient as possible, e.g. drop features that aren't relevant to LLMs like headers/footers/images, and even compress it by stripping whitespace, as Vercel experimented with recently (see the sketch below). The downside of this approach is that as we add more features to npmx, we have to remember to include them in the separate markdown-rendering pipeline as well, which I can imagine would often be forgotten unless we came up with a clean architecture for this.
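As an illustration of the site-specific route, a hand-rolled, token-lean markdown view for the package page might look something like this; the `PackageInfo` shape and its field names are hypothetical:

```ts
// Sketch of option 2: a hand-written, token-lean markdown view for
// /package/[package]. Only the fields an agent is likely to need.
interface PackageInfo {
  name: string
  version: string
  description: string
  license?: string
  weeklyDownloads?: number
}

export function renderPackageMarkdown(pkg: PackageInfo): string {
  const lines = [
    `# ${pkg.name}@${pkg.version}`,
    pkg.description,
    pkg.license ? `License: ${pkg.license}` : null,
    pkg.weeklyDownloads != null
      ? `Weekly downloads: ${pkg.weeklyDownloads}`
      : null,
  ]
  // Drop empty entries and collapse runs of spaces/tabs to keep
  // token usage low (the Vercel-style compression idea).
  return lines.filter(Boolean).join('\n').replace(/[ \t]{2,}/g, ' ')
}
```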
So the main question here is: would we build this primarily for LLM usage, or would we want to support any markdown-reading usage of npmx? I don't know of any use cases besides LLMs, but I could be missing something.
Another option would be to take the generic approach and then build the LLM-specific parts into an MCP server (separate issue incoming 🔜), which would take care of the context compression, since that part is only relevant to agents anyway.
Additional Context
This is a concept the Next.js docs team discovered last year (I can't find the post on X anymore) that later spread; see e.g.:
https://xcancel.com/thdxr/status/1972421466953273392
I think this is a natural evolution of the llms.txt concept, where you have markdown content for LLMs on a separate path instead of behind a separate header. The downside of llms.txt is that, in my experience, LLMs will rarely if ever automatically use those specific paths when fetching content; they just use the regular website instead. Users essentially have to explicitly instruct their agents to use /llms.txt via system prompts, which makes it all a bit useless in my eyes. Keying off the headers is fully automatic and seems way more impactful to me.