[Feature]: Site type classification - differentiate API services from content sites #49

@IISweetHeartII

Problem Statement

AX Score currently applies the same scoring criteria to all websites regardless of their type. When scanning a content site/blog (e.g., a static Astro blog), it recommends API-specific features like OpenAPI specifications, OAuth authentication, structured error codes, rate limit headers, and Retry-After headers. These recommendations are completely irrelevant for content sites and artificially deflate their scores.

For example, scanning https://log8.kr (a Korean tech blog built with Astro) returns a score of 27 with recommendations like:

  • "Provide OpenAPI specification" → The site has no public API
  • "Implement self-service authentication" → Blog content is publicly accessible
  • "Define structured error codes" → There are no API endpoints to return errors from
  • "Add rate limit headers" → No programmatic access endpoints exist
  • "Support Retry-After header" → Not applicable

This makes the score misleading. A well-optimized blog with comprehensive JSON-LD, proper meta tags, RSS feed, sitemap, and AI-crawler-friendly robots.txt gets a low score because it doesn't implement API infrastructure it doesn't need.

Proposed Solution

Introduce site type classification as the first step of the scanning process:

  1. Auto-detect site type by analyzing:

    • Presence of <article>, <main>, blog-like markup → Content site
    • Presence of /api/, /docs/, OpenAPI references → API service
    • Presence of both → Hybrid
  2. Adjust scoring criteria per type:

    | Criteria                | API Service   | Content Site    | Hybrid        |
    |-------------------------|---------------|-----------------|---------------|
    | OpenAPI spec            | High weight   | Skip (0 weight) | Medium weight |
    | Authentication          | High weight   | Skip            | Medium weight |
    | Error codes             | High weight   | Skip            | Medium weight |
    | Rate limiting           | High weight   | Skip            | Medium weight |
    | JSON-LD structured data | Medium weight | High weight     | High weight   |
    | llms.txt                | Medium weight | High weight     | High weight   |
    | RSS/Atom feed           | Low weight    | High weight     | Medium weight |
    | Sitemap XML             | Medium weight | High weight     | High weight   |
    | robots.txt AI crawlers  | Medium weight | High weight     | High weight   |
    | Semantic HTML           | Low weight    | High weight     | Medium weight |
  3. Display site type in the report header so users understand the context.
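To make steps 1 and 2 concrete, here is a minimal sketch in Python. It is not AX Score's actual code. The regex signals are a naive reading of the detection list above, the numeric mapping (High=3, Medium=2, Low=1, Skip=0) is an assumption, the criterion keys are hypothetical names, and the `fallback` used when no signal is found is also an assumption, since the proposal does not cover that case.

```python
import re
from enum import Enum


class SiteType(Enum):
    API = "api"
    CONTENT = "content"
    HYBRID = "hybrid"


# Naive signals based on the proposal; a real detector would need more care
# (for example, a blog post that merely mentions "OpenAPI" would match here).
CONTENT_PATTERN = re.compile(r"<(article|main)\b", re.IGNORECASE)
API_PATTERN = re.compile(r"/api/|/docs/|openapi", re.IGNORECASE)

# Assumed numeric values for the weight levels in the table.
SKIP, LOW, MEDIUM, HIGH = 0, 1, 2, 3


def _row(api: int, content: int, hybrid: int) -> dict:
    return {SiteType.API: api, SiteType.CONTENT: content, SiteType.HYBRID: hybrid}


# Hypothetical criterion keys, one per row of the table.
WEIGHTS = {
    "openapi_spec": _row(HIGH, SKIP, MEDIUM),
    "authentication": _row(HIGH, SKIP, MEDIUM),
    "error_codes": _row(HIGH, SKIP, MEDIUM),
    "rate_limiting": _row(HIGH, SKIP, MEDIUM),
    "json_ld": _row(MEDIUM, HIGH, HIGH),
    "llms_txt": _row(MEDIUM, HIGH, HIGH),
    "rss_atom_feed": _row(LOW, HIGH, MEDIUM),
    "sitemap_xml": _row(MEDIUM, HIGH, HIGH),
    "robots_txt_ai": _row(MEDIUM, HIGH, HIGH),
    "semantic_html": _row(LOW, HIGH, MEDIUM),
}


def classify_site(html: str, fallback: SiteType = SiteType.HYBRID) -> SiteType:
    """Classify a page from its HTML using the two signal groups."""
    has_content = bool(CONTENT_PATTERN.search(html))
    has_api = bool(API_PATTERN.search(html))
    if has_content and has_api:
        return SiteType.HYBRID
    if has_api:
        return SiteType.API
    if has_content:
        return SiteType.CONTENT
    return fallback  # assumption: no signal found, caller decides the default


def weighted_score(results: dict, site_type: SiteType) -> float:
    """Return a 0-100 score; `results` maps criterion -> value in [0, 1]."""
    total = 0.0
    earned = 0.0
    for criterion, by_type in WEIGHTS.items():
        weight = by_type[site_type]
        total += weight
        earned += weight * results.get(criterion, 0.0)
    return 100.0 * earned / total if total else 0.0
```

Under these assumed weights, a site that passes only the six content-oriented checks scores 100.0 as a content site but roughly 45.5 when scored with API-service weights, which illustrates the deflation described in the problem statement.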

Alternatives Considered

  • Manual site type selection: Users could pass a --type content|api|hybrid flag. Simpler to implement, but worse UX because users must classify their own site before every scan.
  • Separate scoring tools: Create different tools for different site types. Too fragmented.
  • Weighted composite: Keep one tool but let users override weights. Complex configuration burden.

The auto-detection approach is recommended because it provides the best UX while being technically feasible.
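The manual flag and auto-detection need not be mutually exclusive. As a rough sketch, assuming a Python CLI built with `argparse` (the program name `ax-score`, the `url` argument, and the `resolve_site_type` helper are all hypothetical, not the tool's real interface), an explicit `--type` could override whatever the detector returns:

```python
import argparse
from typing import Optional

SITE_TYPES = ["content", "api", "hybrid"]


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI shape; only the --type choices come from the issue text.
    parser = argparse.ArgumentParser(prog="ax-score")
    parser.add_argument("url", help="site to scan")
    parser.add_argument(
        "--type",
        dest="site_type",
        choices=SITE_TYPES,
        default=None,
        help="override the auto-detected site type",
    )
    return parser


def resolve_site_type(explicit: Optional[str], detected: str) -> str:
    """Prefer the user's explicit choice; otherwise use the detected type."""
    return explicit if explicit is not None else detected
```

With this shape, users who never pass the flag get auto-detection, and a misclassified site can still be corrected by hand.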

Use Case

As a blog/content site owner, I want my AX Score to reflect my site's actual AI discoverability rather than penalizing me for not implementing API infrastructure. Content sites have different AI agent interaction patterns — agents consume content via structured data, RSS, sitemaps, and crawling, not via API calls.

Additional Context

This is a foundational improvement that would make AX Score useful for the ~90% of websites that aren't pure API services. The current scoring effectively only works for developer platforms and API products, which severely limits the tool's addressable market and utility.

Reference: The AX Benchmark from Tech Stackups already differentiates between "API-centric" and "content-centric" agent experiences.
