Scoring Version: 0.1.0
Agent-Friendly Docs Spec Version: v0.5.0
Spec URL: https://agentdocsspec.com
Date: 04/25/2026
- Assign a 0-100 numerical score to a documentation site's agent-friendliness.
- Surface interaction diagnostics: system-level failures that emerge from combinations of check results and aren't visible from individual checks alone.
- The score should reflect how well agents can actually use the documentation, not just how many boxes are ticked. A site with perfect markdown support that no agent can discover should score lower than a site with imperfect markdown support that agents are directed to.
- Proportional scoring for multi-page checks: 3/50 pages failing is different from 48/50 pages failing. The existing `details` fields already contain per-page breakdowns; the scoring layer reads those rather than relying on the aggregated top-level status.
- Skipped checks (due to failed dependencies) score 0 and are excluded from the denominator. The dependency failure already carries the penalty.
- The scoring module does not change check results. It is a read-only consumer of `ReportResult`.
Each check is assigned to a tier based on its observed impact on agent workflows. Tier assignments are derived from the spec's "Start Here" ordering and the empirical evidence sections in each check definition.
| Tier | Base weight | Criteria |
|---|---|---|
| Critical | 10 | Agents cannot function without this. Failure means zero content, zero navigation, or zero access. |
| High | 7 | Directly limits agent effectiveness. Failure means truncation, dead ends, or agents stuck on a worse path. |
| Medium | 4 | Significant but not blocking. Failure degrades quality or misses an opportunity. |
| Low | 2 | Refinements. Failure is a missed best practice, not an agent-blocking issue. |
| Check ID | Tier | Weight | Rationale |
|---|---|---|---|
| llms-txt-exists | Critical | 10 | Single highest-impact action per spec. Primary navigation mechanism for agents. |
| rendering-strategy | Critical | 10 | SPA shell = zero content. Not a truncation problem; a no-content problem. |
| auth-gate-detection | Critical | 10 | Auth-gated docs are completely opaque to agents. |
| llms-txt-size | High | 7 | Truncated index defeats the purpose of llms.txt. |
| llms-txt-links-resolve | High | 7 | Broken links in llms.txt send agents down dead ends with high confidence. |
| llms-txt-links-markdown | High | 7 | Agents work significantly less effectively with HTML content. Pointing to markdown directly is the difference between a good and a degraded experience. |
| markdown-url-support | High | 7 | Core format capability. Markdown is what agents actually want. |
| page-size-markdown | High | 7 | Direct truncation risk on the best-case content path. |
| page-size-html | High | 7 | Affects the majority of agents, which receive HTML. |
| http-status-codes | High | 7 | Soft 404s actively mislead agents into extracting info from error pages. |
| llms-txt-directive-html | High | 7 | Discoverability multiplier for the HTML path. Tells agents about llms.txt. |
| llms-txt-directive-md | Medium | 4 | Discoverability multiplier for the markdown path. Tells agents about llms.txt. |
| llms-txt-valid | Medium | 4 | Structure helps parsing, but even a non-standard llms.txt with links is useful. |
| content-negotiation | Medium | 4 | Only some agents send Accept: text/markdown. Valuable but not universal. |
| content-start-position | Medium | 4 | Boilerplate preamble on the HTML path wastes truncation budget. |
| tabbed-content-serialization | Medium | 4 | Tabbed content can be catastrophic but only affects pages that use it. |
| markdown-code-fence-validity | Medium | 4 | Unclosed fences corrupt all content after the break point. |
| llms-txt-coverage | Medium | 4 | A stale index is a slow failure mode; broken links catch the acute version. |
| markdown-content-parity | Medium | 4 | Content drift between markdown and HTML leaves agents with outdated info. |
| auth-alternative-access | Medium | 4 | Partial mitigation for auth-gated sites. |
| redirect-behavior | Medium | 4 | Cross-host redirects are a known friction point for some agents. |
| section-header-quality | Low | 2 | Refinement for tabbed content; only matters when tabs exist. |
| cache-header-hygiene | Low | 2 | Aggressive caching rarely causes acute agent failures. |
Maximum raw score: 3×10 + 8×7 + 10×4 + 2×2 = 30 + 56 + 40 + 4 = 130 points
For single-resource checks (e.g., llms-txt-exists, llms-txt-valid):
check_score = status_coefficient * weight
status_coefficient:
pass = 1.0
warn = warn_coefficient (see table below)
fail = 0.0
skip = excluded from both numerator and denominator
error = 0.0
For multi-page checks with per-page breakdowns in details:
proportion = (pass_count * 1.0 + warn_count * warn_coefficient) / tested_count
check_score = proportion * weight
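The two formulas above can be sketched as follows. This is a minimal TypeScript sketch; the type and function names are illustrative, not the tool's actual API.

```typescript
type Status = "pass" | "warn" | "fail" | "error" | "skip";

interface PageCounts {
  pass: number;
  warn: number;
  tested: number; // pass + warn + fail page count
}

// Single-resource checks: the status maps directly to a coefficient.
function singleResourceScore(
  status: Status,
  weight: number,
  warnCoefficient: number,
): number | null {
  if (status === "skip") return null; // excluded from numerator and denominator
  const coeff =
    status === "pass" ? 1.0 : status === "warn" ? warnCoefficient : 0.0;
  return coeff * weight;
}

// Multi-page checks: the pass/warn proportion scales the weight.
function multiPageScore(
  counts: PageCounts,
  weight: number,
  warnCoefficient: number,
): number {
  const proportion =
    (counts.pass * 1.0 + counts.warn * warnCoefficient) / counts.tested;
  return proportion * weight;
}
```

For example, a High-tier (weight 7) multi-page check with 47/50 pages passing scores 0.94 × 7 = 6.58 points.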
Not all warn states represent the same degree of degradation. Some mean "works with a cosmetic issue" while others mean "genuinely degraded with real risk." Each check has a specific warn coefficient rather than a uniform default.
| Check ID | Warn coeff | Rationale |
|---|---|---|
| 0.75: Content substantively intact | | |
| llms-txt-valid | 0.75 | Non-standard structure, but links are parseable. A missing blockquote doesn't prevent navigation. |
| content-negotiation | 0.75 | Agent gets the markdown content; a wrong Content-Type may prevent optimizations but the content itself is correct. |
| llms-txt-links-resolve | 0.75 | >90% of links work. A few broken links is a maintenance issue, not a structural one. |
| llms-txt-coverage | 0.75 | 80-95% of pages covered. Most of the site is represented in the index. |
| markdown-content-parity | 0.75 | Minor formatting differences, not substantive content drift. |
| 0.60: Partial coverage or platform-dependent | | |
| llms-txt-directive-html | 0.60 | Present in the HTML of some pages but not others. Agents that land on covered pages benefit; others get no guidance. |
| llms-txt-directive-md | 0.60 | Present in the markdown of some pages but not others. |
| redirect-behavior | 0.60 | Cross-host HTTP redirects: some agents follow them, some don't. Platform-dependent outcome. |
| 0.50: Genuine functional degradation | | |
| llms-txt-exists | 0.50 | Only reachable via cross-host redirect. Binary: the agent's HTTP client either follows it or doesn't. |
| llms-txt-size | 0.50 | 50K-100K characters. Real truncation risk on some agent platforms. |
| rendering-strategy | 0.50 | Sparse content that may need JS. Genuinely uncertain whether agents get usable content. |
| markdown-url-support | 0.50 | Inconsistent .md support across pages. Unpredictable is arguably worse than consistently absent. |
| page-size-markdown | 0.50 | 50K-100K characters. Real truncation risk. |
| page-size-html | 0.50 | 50K-100K characters post-conversion. Same truncation risk pattern. |
| content-start-position | 0.50 | 10-50% boilerplate before content. Real waste of truncation budget. |
| tabbed-content-serialization | 0.50 | 50K-100K serialized. Truncation risk for tab-heavy pages. |
| section-header-quality | 0.50 | 25-50% generic headers. Agents lose variant context on a significant fraction. |
| cache-header-hygiene | 0.50 | 1-24 hour caching. Updates may take hours to propagate. |
| auth-gate-detection | 0.50 | Partial gating. Some docs accessible, some invisible to agents. |
| auth-alternative-access | 0.50 | Partial alternative access. Covers some gated content but not all. |
| 0.25: Actively steering agents away from the better path | | |
| llms-txt-links-markdown | 0.25 | Markdown variants exist but llms.txt links to HTML. The one place you control agent navigation actively directs agents away from markdown. Agents don't independently discover .md variants. |
Checks without a warn state (`http-status-codes`, `markdown-code-fence-validity`) don't appear in this table. Their spec definitions only have pass and fail levels.
This replaces the worst-case aggregation for scoring purposes. A site where 3/50 pages exceed the size limit scores ~94% of the check's weight, not 0%.
The top-level status field is unchanged and still uses worst-case for the
existing text/json formatters; proportional scoring is only applied in the
scoring layer.
Every multi-page check already stores the data needed. The scoring module reads
these fields from details:
| Check ID | Proportion source |
|---|---|
| rendering-strategy | serverRendered, sparseContent, spaShells (as pass/warn/fail counts) |
| page-size-markdown | passBucket, warnBucket, failBucket |
| page-size-html | passBucket, warnBucket, failBucket |
| content-start-position | passBucket, warnBucket, failBucket |
| markdown-url-support | pageResults array, count per status |
| content-negotiation | pageResults array, count per status |
| markdown-code-fence-validity | pageResults array, count per status |
| tabbed-content-serialization | pageResults array, count per status |
| section-header-quality | pageResults array, count per status |
| http-status-codes | pageResults array, count per status |
| redirect-behavior | pageResults array, count per status |
| llms-txt-directive-html | pageResults array, count per status |
| llms-txt-directive-md | pageResults array, count per status |
| cache-header-hygiene | passBucket, warnBucket, failBucket |
| markdown-content-parity | passBucket, warnBucket, failBucket |
| auth-gate-detection | pageResults array, count per status |
Single-resource checks (no proportional scoring needed):
| Check ID | Notes |
|---|---|
| llms-txt-exists | Binary: exists or doesn't |
| llms-txt-valid | Per-file average (see note below) |
| llms-txt-size | Per-file average (see note below) |
| llms-txt-links-resolve | Uses resolve rate directly from details (resolveRate field) |
| llms-txt-links-markdown | Percentage-based status |
| llms-txt-coverage | Coverage percentage |
| auth-alternative-access | Binary: alternative path exists or doesn't |
For llms-txt-links-resolve, the resolveRate field in details (a 0-1 float)
can be used directly as the proportion rather than mapping from status.
Multiple llms.txt files: When a site serves llms.txt at multiple locations
(e.g., /llms.txt and /docs/llms.txt), per-file checks (llms-txt-valid,
llms-txt-size, llms-txt-links-resolve, llms-txt-links-markdown) score
each file individually and average the results. This reflects that we can't
predict which file an agent will encounter, so the score should represent
the expected experience across discovery paths. Files that redirect to the
same destination are deduplicated before scoring (e.g., if /docs/llms.txt
308-redirects to /llms.txt, that's one file, not two).
score = (sum of check_scores for non-skipped, non-N/A checks)
/ (sum of weights for non-skipped, non-N/A checks)
* 100
Rounded to the nearest integer.
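A minimal sketch of this aggregation (names are illustrative; skipped and notApplicable checks are modeled as `null` scores so they drop out of both sides of the ratio):

```typescript
interface ScoredCheck {
  score: number | null; // null = skipped or notApplicable
  weight: number;
}

function overallScore(checks: ScoredCheck[]): number {
  // Only non-skipped, non-N/A checks participate.
  const scored = checks.filter((c) => c.score !== null);
  const numerator = scored.reduce((sum, c) => sum + (c.score as number), 0);
  const denominator = scored.reduce((sum, c) => sum + c.weight, 0);
  return Math.round((numerator / denominator) * 100);
}
```

For example, a perfect Critical check (10/10), a half-credit High check (3.5/7), and a skipped Medium check yield 13.5 / 17 ≈ 79.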
Each CheckScore has a `scoreDisplayMode` field:

- `"numeric"` (default): normal scored result.
- `"notApplicable"`: insufficient data to score meaningfully. The check ran, but its score is excluded from the overall and category calculations.

The notApplicable mode triggers when all of the following hold:

- `samplingStrategy` is `random` or `deterministic` (discovery-based).
- `testedPages` is less than `MIN_PAGES_FOR_SCORING` (default 5).
- The check is page-level (tests sampled pages, not site-level resources).
Page-level checks: llms-txt-directive-html, llms-txt-directive-md,
markdown-url-support, content-negotiation, markdown-code-fence-validity,
page-size-markdown, page-size-html, markdown-content-parity,
content-start-position, tabbed-content-serialization,
section-header-quality, http-status-codes, redirect-behavior,
rendering-strategy, auth-gate-detection, cache-header-hygiene.
Site-level checks (always numeric): llms-txt-exists, llms-txt-valid,
llms-txt-size, llms-txt-links-resolve, llms-txt-links-markdown,
llms-txt-coverage, auth-alternative-access.
Category scores: When all scored checks in a category are notApplicable,
the category score is null (rendered as a dash in the scorecard). Mixed
categories (some N/A, some numeric) score based on numeric checks only.
ReportResult fields: testedPages (number of pages tested by page-level
checks) and samplingStrategy (the strategy used for this run) are added to
ReportResult so the scoring layer can detect the insufficient-data condition.
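The insufficient-data condition can be sketched as a predicate over those fields. This is a sketch under assumptions: `"explicit"` is an assumed name for a non-discovery sampling strategy (e.g. a --urls run), not a value confirmed by the spec.

```typescript
const MIN_PAGES_FOR_SCORING = 5;

// "explicit" is an assumed strategy name for user-supplied page lists.
type SamplingStrategy = "random" | "deterministic" | "explicit";

function isNotApplicable(
  samplingStrategy: SamplingStrategy,
  testedPages: number,
  isPageLevelCheck: boolean,
): boolean {
  const discoveryBased =
    samplingStrategy === "random" || samplingStrategy === "deterministic";
  // All three conditions must hold for the check to be marked N/A.
  return (
    discoveryBased && testedPages < MIN_PAGES_FOR_SCORING && isPageLevelCheck
  );
}
```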
Critical checks (weight 10) can cap the overall score when they fail broadly. This prevents a site with a fundamental agent-blocking problem from scoring well on the strength of lower-priority checks alone.
The cap is based on the check's proportion score (for multi-page checks) or its status (for single-resource checks):
For each critical check:

    if single-resource AND status == fail:
        apply the check's cap (total failure)
    else if multi-page AND scoreDisplayMode == 'notApplicable':
        skip (insufficient data to justify a cap)
    else if multi-page AND proportion <= 0.25 (75%+ of pages fail):
        cap overall score at 39 (F)
    else if multi-page AND proportion <= 0.50 (50%+ of pages fail):
        cap overall score at 59 (D)
For single-resource critical checks:
| Check | Cap |
|---|---|
| llms-txt-exists fail | Cap at 59 (D). Agents lose primary navigation but may still use HTML/markdown paths directly. |
llms-txt-exists caps at D rather than F because a site without llms.txt can
still be usable if it has good rendering, reasonable page sizes, and
discoverable markdown. It's a significant gap, not a total blocker.
For multi-page critical checks (rendering-strategy, auth-gate-detection):
| Proportion | Meaning | Cap |
|---|---|---|
| <= 0.25 | 75%+ of pages affected | Cap at 39 (F) |
| <= 0.50 | 50%+ of pages affected | Cap at 59 (D) |
| > 0.50 | Minority of pages affected | No cap; proportional scoring handles it |
When the no-viable-path interaction diagnostic fires (no llms.txt, no
discoverable markdown, HTML path broken or untested), the overall score is
capped at 39 (F). A site where agents have no effective way to access content
should not score above F regardless of how well the infrastructure checks
perform.
When multiple caps apply, the lowest cap wins.
The cap is applied after the weighted score calculation but diagnostics are evaluated first so that diagnostic-driven caps can participate. If the calculated score is already below the cap, the cap has no effect. The scorecard output should note when a cap is active, e.g.:
Overall Score: 39 / 100 (F)
(Capped: auth-gate-detection — all documentation requires authentication)
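Cap resolution can be sketched as follows: collect every active cap, take the lowest, and apply it only if it is below the computed score.

```typescript
function applyCaps(computedScore: number, activeCaps: number[]): number {
  if (activeCaps.length === 0) return computedScore;
  const lowest = Math.min(...activeCaps); // lowest cap wins
  // A cap never raises a score that is already below it.
  return Math.min(computedScore, lowest);
}
```

For example, a site that computes 72 with both a D cap (59) and an F cap (39) active reports 39, while a site that computes 30 under a D cap reports 30 unchanged.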
Some checks have conditional value: their contribution to the score should be scaled by whether the conditions needed to realize that value are actually met. This is modeled as a coefficient (0.0 to 1.0) that multiplies a check's score after the base calculation.
Applies to: page-size-markdown, markdown-code-fence-validity,
markdown-content-parity
These checks measure the quality of the markdown path. That path's value to agents depends on whether agents can discover it. In observed agent behavior, agents do not independently discover .md URL variants through training data or exploratory probing; they need to be told. This means undiscoverable markdown delivers zero value to agents today.
markdown-url-support is excluded from this coefficient because it measures
whether the capability exists, not the quality of an established path. A site
should get credit for serving markdown (and the markdown-undiscoverable or
markdown-partially-discoverable diagnostic tells them to make it
discoverable), but the downstream quality
checks only matter if agents actually reach the markdown.
discovery_coefficient:

    content-negotiation pass          -> 1.0 (mechanical; no agent decision involved)
    llms-txt-directive-html pass
      OR llms-txt-directive-md pass   -> 0.8 (effective, but agents sometimes ignore the directive even when present)
    llms-txt-links-markdown pass      -> 0.5 (requires finding llms.txt first, then following .md links from it)
    none of the above                 -> 0.0 (agents won't find the markdown path)
If multiple conditions are met, use the highest coefficient. These values are based on observed agent behavior as of early 2026 and may need recalibration as agent tooling evolves.
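The highest-coefficient-wins rule can be sketched as a lookup over the relevant check results. Names are illustrative; the booleans stand in for pass statuses of the checks named above.

```typescript
interface DiscoverySignals {
  contentNegotiationPass: boolean;
  directiveHtmlPass: boolean;
  directiveMdPass: boolean;
  llmsTxtLinksMarkdownPass: boolean;
}

function discoveryCoefficient(s: DiscoverySignals): number {
  const candidates = [
    s.contentNegotiationPass ? 1.0 : 0,
    s.directiveHtmlPass || s.directiveMdPass ? 0.8 : 0,
    s.llmsTxtLinksMarkdownPass ? 0.5 : 0,
  ];
  // Highest satisfied condition wins; 0.0 when no discovery mechanism exists.
  return Math.max(...candidates);
}
```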
Applies to: page-size-html, content-start-position,
tabbed-content-serialization, section-header-quality
These checks measure the quality of the HTML path. If pages are SPA shells, HTML path measurements for those pages are meaningless.
html_path_coefficient = rendering-strategy proportion
The coefficient uses the same proportion as the rendering-strategy check's
own score: (pass_count * 1.0 + warn_count * 0.5) / tested_count. If 45/50
pages render correctly, the coefficient is 0.9. If 5/50 render correctly,
it's 0.1. This is consistent with proportional scoring throughout the system
and avoids double-discounting (the HTML path checks' own proportional scoring
handles per-page variation; the coefficient reflects the overall reliability
of the HTML path as a whole).
Applies to: llms-txt-links-resolve, llms-txt-valid,
llms-txt-coverage, llms-txt-links-markdown
If llms-txt-size fails, agents only see a fraction of the index. The quality
of the invisible portion doesn't affect agent experience.
index_truncation_coefficient:
llms-txt-size pass -> 1.0 (full index visible to agents)
llms-txt-size warn -> 0.8 (most visible, some platforms may truncate)
llms-txt-size fail -> visible_fraction (estimated from file size vs.
truncation limit, e.g., 100K / file_size)
For the fail case, the coefficient is derived from the actual file size in
details. A 200K file has coefficient ~0.5; a 4MB file has coefficient ~0.025.
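A sketch of the coefficient, assuming the 100K-character truncation limit used in the examples above (`TRUNCATION_LIMIT` is an assumed constant name):

```typescript
const TRUNCATION_LIMIT = 100_000; // assumed from the 100K figure above

function indexTruncationCoefficient(
  sizeStatus: "pass" | "warn" | "fail",
  fileSizeChars: number,
): number {
  if (sizeStatus === "pass") return 1.0; // full index visible
  if (sizeStatus === "warn") return 0.8; // most visible, some platforms truncate
  // fail: estimate the fraction of the index agents actually see
  return Math.min(1, TRUNCATION_LIMIT / fileSizeChars);
}
```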
Applies to: all non-auth checks collectively
If auth-gate-detection fails (all/most pages require auth), the other checks
are measuring resources that agents can't reach anyway. However, this is
complex to apply uniformly because some resources (llms.txt, markdown endpoints)
might be public even when the HTML docs are gated.
v1 approach: Don't apply a blanket auth coefficient. Instead, surface this
as an interaction diagnostic (see below). The auth-gate-detection check's
own weight (Critical, 10 points) already provides significant score impact.
Revisit in v2 if the diagnostic alone isn't sufficient.
A check may be subject to multiple coefficients. When this happens, multiply them together:
effective_score = base_check_score * coefficient_1 * coefficient_2 * ...
In the current design, the three coefficient groups (discovery, HTML path, index truncation) apply to disjoint sets of checks, so no check currently has more than one coefficient. The multiplication rule is documented for future extensibility if new coefficients are added that overlap with existing ones.
The effective weight denominator for the overall score should use the same coefficients:
effective_max = weight * coefficient_1 * coefficient_2 * ...
This ensures that a check whose coefficient is reduced contributes proportionally less to both the numerator and denominator, rather than deflating the score by having full weight in the denominator but reduced score in the numerator. When a coefficient is 0.0 (e.g., discovery coefficient with no discovery mechanism), the check is effectively excluded from the score entirely.
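A sketch of applying coefficients to both sides of the ratio (illustrative names; `baseScore` is the pre-coefficient check score from the earlier formulas):

```typescript
interface EffectiveCheck {
  baseScore: number; // proportion * weight, before coefficients
  weight: number;
  coefficients: number[]; // empty = no coefficients apply
}

function effectiveOverall(checks: EffectiveCheck[]): number {
  let numerator = 0;
  let denominator = 0;
  for (const c of checks) {
    const product = c.coefficients.reduce((p, x) => p * x, 1);
    numerator += c.baseScore * product;
    denominator += c.weight * product; // a 0.0 coefficient removes the check entirely
  }
  return Math.round((numerator / denominator) * 100);
}
```

For example, a perfect markdown-quality check with a 0.0 discovery coefficient contributes nothing to either side, so the score is determined entirely by the remaining checks.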
Interaction diagnostics are system-level findings that emerge from patterns across multiple check results. They are displayed in the scorecard output as a separate section, distinct from individual check results.
Each diagnostic has:

- ID: for programmatic use
- Trigger condition: boolean expression over check statuses
- Severity: critical | warning | info
- Message: 2-4 sentences explaining the system-level impact
- Resolution: what to do about it
Some diagnostics reference the trigger state of other diagnostics (e.g.,
page-size-no-markdown-escape references whether markdown is undiscoverable
or only partially discoverable). The implementation must evaluate diagnostics
in dependency order: markdown-undiscoverable and
markdown-partially-discoverable first, then diagnostics that reference them.
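Dependency-ordered evaluation can be sketched as a single pass over a pre-ordered list, where each trigger may read the set of diagnostics that already fired. Names are illustrative.

```typescript
interface Diagnostic {
  id: string;
  // Trigger receives the IDs of diagnostics that fired earlier in the order.
  trigger: (alreadyFired: Set<string>) => boolean;
}

function evaluateDiagnostics(ordered: Diagnostic[]): Set<string> {
  const fired = new Set<string>();
  for (const d of ordered) {
    if (d.trigger(fired)) fired.add(d.id);
  }
  return fired;
}
```

With markdown-undiscoverable listed before page-size-no-markdown-escape, the latter's trigger can safely test `alreadyFired.has("markdown-undiscoverable")`.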
- Severity: warning
- Triggers when: `markdown-url-support` passes, AND `content-negotiation` does not pass, AND `llms-txt-directive-html` does not pass.
- Message: Your site serves markdown at .md URLs, but agents have no way to discover this. No agent-facing directive points to your llms.txt, and the server does not support content negotiation. Most agents will default to the HTML path and never benefit from your markdown support.
- Resolution: Add a directive near the top of each docs page pointing to
your llms.txt, and implement content negotiation for
Accept: text/markdown. The directive is the primary discovery mechanism (it reaches all agents); content negotiation provides a fast path for agents that request markdown by default.
- Severity: warning
- Triggers when: `markdown-url-support` passes, AND `content-negotiation` passes, AND `llms-txt-directive-html` does not pass.
- Message: Your site serves markdown and supports content negotiation, but has no agent-facing directive on HTML pages pointing to llms.txt. Agents that send Accept: text/markdown (Claude Code, Cursor, OpenCode) get markdown automatically, but the majority of agents fetch HTML by default and have no signal to try the markdown path.
- Resolution: Add a directive near the top of each docs page pointing to your llms.txt. If your site serves markdown, mention that in the directive too. The directive reaches all agents, not just the ones that request markdown by default.
- Severity: warning
- Triggers when: `llms-txt-exists` passes AND `llms-txt-size` fails.
- Message: Your llms.txt is {size} characters. Agents see roughly the first {truncation_limit} characters ({visible_pct}% of the file). Links, structure, and freshness beyond that point don't affect agent experience. Quality checks on the invisible portion are discounted in the score.
- Resolution: Split into a root index linking to section-level llms.txt files, each under 50,000 characters. See the spec's progressive disclosure recommendation.
- Severity: info
- Triggers when: `rendering-strategy` fails or warns (proportionally: when >25% of sampled pages are SPA shells).
- Message: {n} of {total} sampled pages use client-side rendering. Agents receive an empty shell for these pages instead of documentation content. Page size and content structure scores for the HTML path are discounted because they are partially measuring shells rather than content. {If markdown-url-support passes: "Your markdown path still works for agents that can discover it."} {If not: "Agents currently have no alternative path to content on affected pages."}
- Resolution: Enable server-side rendering or static generation for affected page types. If only specific page templates use client-side content loading, target those templates rather than rebuilding the entire site.
- Severity: critical
- Triggers when: (`llms-txt-exists` fails OR (`llms-txt-exists` passes AND `llms-txt-links-resolve` resolveRate < 10%)) AND (`rendering-strategy` fails OR `rendering-strategy` not run) AND (`markdown-url-support` fails OR markdown-undiscoverable triggered OR markdown-partially-discoverable triggered).

  The expanded llms.txt condition recognizes that an llms.txt where <10% of links resolve is functionally equivalent to having no llms.txt: agents follow the links with high confidence and hit dead ends.
- Message: Agents have no effective way to access your documentation. {If llms-txt-exists fails: "There is no llms.txt for navigation."} {If llms-txt-exists passes but links broken: "The llms.txt exists but only {resolveRate}% of links resolve, making it effectively unusable."} There is no discoverable markdown path, and the HTML responses either don't contain content or weren't tested. This is the lowest-possible agent accessibility state.
- Resolution: The single highest-impact action is creating an llms.txt at your site root with working links. If your site uses client-side rendering, enabling server-side rendering is the second priority.
- Score cap: When this diagnostic fires, the overall score is capped at 39 (F). See "Diagnostic-Driven Cap" in the Score Caps section.
- Severity: critical
- Triggers when: `auth-gate-detection` is fail AND `auth-alternative-access` is fail.
- Message: Your documentation requires authentication, and no alternative access paths were detected. Agents that encounter your docs will fall back on training data or seek secondary sources that may be inaccurate.
- Resolution: Consider providing a public llms.txt as a navigational index, ungating API references and integration guides, or shipping docs with your SDK/package. See the spec's "Making Private Docs Agent-Accessible" section for options ordered by implementation effort.
- Severity: warning
- Triggers when: `page-size-html` fails AND (`markdown-url-support` fails OR markdown-undiscoverable triggered OR markdown-partially-discoverable triggered).
- Message: {n} pages exceed agent truncation limits on the HTML path, and there is no discoverable markdown path for agents to get smaller representations. Agents will silently receive truncated content on these pages.
- Resolution: Either reduce HTML page sizes (break large pages, reduce inline CSS/JS), or provide markdown versions and ensure agents can discover them via content negotiation or an llms.txt directive.
- Severity: warning
- Triggers when: `samplingStrategy` is `random` or `deterministic` AND `testedPages` is less than `MIN_PAGES_FOR_SCORING` (default 5).
- Message: Only {n} page(s) discovered and tested (minimum 5 needed for reliable scoring). Page-level category scores may not represent the site. These categories are marked as N/A in the score.
- Resolution: If your site has an llms.txt, ensure it contains working links so the tool can discover more pages. If testing a preview deployment, use --canonical-origin to rewrite cross-origin llms.txt links. You can also provide specific pages with --urls.
- Severity: warning
- Triggers when: `llms-txt-links-resolve` ran AND its details show `sameOrigin.total === 0` AND `crossOrigin.total > 0`.
- Message: All {n} links in your llms.txt point to {dominant_origin}, not the origin being tested. This typically happens when testing a preview or staging deployment whose llms.txt still references the production domain. Page discovery falls back to a single page.
- Resolution: Use --canonical-origin to rewrite cross-origin links during testing.
- Severity: info
- Triggers when: any check's `details.discoveryWarnings` array contains a string matching "gzipped sitemap".
- Message: A gzipped sitemap was skipped during URL discovery. If this is the only sitemap source, it may have reduced the number of pages discovered for testing.
- Resolution: Provide an uncompressed sitemap.xml alongside the gzipped version, or supply specific pages via --urls for targeted testing.
- Severity: warning
- Triggers when: across all checks that report `details.rateLimited`, the total rate-limited count exceeds 20% of the total tested count (derived from `details.testedLinks` or `details.pageResults.length`).
- Message: {pct}% of tested URLs returned HTTP 429 (rate limited). Check results may be unreliable because rate-limited requests are not retried indefinitely.
- Resolution: Increase --request-delay to slow down requests, or contact the site operator to allowlist your IP or user-agent for testing.
| Grade | Score | Description |
|---|---|---|
| A+ | 100 | Perfect. Every check passes. |
| A | 90-99 | Excellent. Agents can effectively navigate and consume this documentation. |
| B | 80-89 | Good. Minor improvements possible; agents can use most content. |
| C | 70-79 | Functional but with notable gaps. Some content is inaccessible or degraded. |
| D | 60-69 | Significant barriers. Agents struggle to use this documentation effectively. |
| F | 0-59 | Poor. Agents likely cannot use this documentation in a meaningful way. |
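The grade boundaries from the table map to a simple lookup (a sketch; the function name is illustrative):

```typescript
function grade(score: number): string {
  if (score === 100) return "A+"; // perfect only
  if (score >= 90) return "A";
  if (score >= 80) return "B";
  if (score >= 70) return "C";
  if (score >= 60) return "D";
  return "F";
}
```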
Agent-Friendly Docs Scorecard
==============================
Overall Score: 72 / 100 (C)
Category Scores:
Content Discoverability 72 / 100 (C)
Markdown Availability 60 / 100 (C)
Page Size and Truncation Risk 45 / 100 (D)
Content Structure 90 / 100 (A)
URL Stability and Redirects 100 / 100 (A)
Observability and Content Health 75 / 100 (B)
Authentication and Access 100 / 100 (A)
Interaction Diagnostics:
[!] Markdown support is undiscoverable
Your site serves markdown at .md URLs, but agents have no way to
discover this. No agent-facing directive points to your llms.txt,
and the server does not support content negotiation. Most agents
will default to the HTML path.
Fix: Add a blockquote directive near the top of each docs page
pointing to your llms.txt, or implement content negotiation for
Accept: text/markdown.
[!] Pages exceed size limits with no markdown escape
12 pages exceed agent truncation limits on the HTML path, and
there is no discoverable markdown path for smaller representations.
Fix: Reduce HTML page sizes or provide discoverable markdown versions.
Check Results:
Content Discoverability
PASS llms-txt-exists llms.txt found at /llms.txt
PASS llms-txt-valid Follows standard structure with H1, ...
WARN llms-txt-size llms.txt is 65,000 characters
Fix: If it grows further, split into nested llms.txt files ...
PASS llms-txt-links-resolve All links resolve
FAIL llms-txt-links-markdown Links point to HTML, not markdown
Fix: Update links to use .md URL variants ...
FAIL llms-txt-directive-html No directive detected in HTML of any tested page
Fix: Add a visually-hidden element near the top of each page ...
...
Each category's score is calculated the same way as the overall score, but only including checks in that category:
category_score = (sum of effective check_scores in category)
/ (sum of effective max weights in category)
* 100
Where "effective" means after applying the same cross-check coefficients used for the overall score.