Add current AI crawler user-agents to robots-txt checker #3

Open
federicobartoli wants to merge 1 commit into addyosmani:main from federicobartoli:feat/add-current-ai-crawler-user-agents

@federicobartoli

Adds seven currently documented AI crawler user-agents that are missing from AI_AGENTS.crawlers. Without them, the robots-txt checker can't detect sites that block or only allow these agents.

| User-agent | Vendor | Source |
| --- | --- | --- |
| Claude-User, Claude-SearchBot | Anthropic | support.claude.com/8896518 |
| OAI-SearchBot | OpenAI | platform.openai.com/docs/bots |
| Applebot-Extended | Apple | support.apple.com/en-us/119829 |
| Meta-ExternalAgent, Meta-ExternalFetcher | Meta | developers.facebook.com/docs/sharing/bot |
| Perplexity-User | Perplexity | docs.perplexity.ai/guides/bots |

Claude-Web retained for backward compatibility. docs/checkers.md updated accordingly.
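
For reviewers who want to see why the list matters, here is a minimal sketch of a lookup against such a list. It is not the checker's actual code. The array shape of `AI_AGENTS.crawlers`, the presence of `GPTBot` and `ClaudeBot` in it, the function name `classifyCrawlers`, and the three status labels are all assumptions made for illustration. The point is that an agent missing from the list never appears in the result, so a site blocking it goes unnoticed.

```javascript
// Hypothetical shape: the PR names AI_AGENTS.crawlers but does not show it.
const AI_AGENTS = {
  crawlers: [
    // Assumed to be present already (Claude-Web is retained by this PR).
    'GPTBot', 'ClaudeBot', 'Claude-Web',
    // The seven user-agents this PR adds.
    'Claude-User', 'Claude-SearchBot', 'OAI-SearchBot', 'Applebot-Extended',
    'Meta-ExternalAgent', 'Meta-ExternalFetcher', 'Perplexity-User',
  ],
};

// Illustrative helper: labels each known crawler as
// 'blocked' (its group has "Disallow: /"), 'mentioned' (named, but not
// fully blocked), or 'unmentioned' (no group names it).
function classifyCrawlers(robotsTxt, crawlers = AI_AGENTS.crawlers) {
  const rules = new Map(); // lower-cased user-agent -> { allow, disallow }
  let group = [];          // user-agents of the group currently being read
  let inRules = false;     // true once the current group has a rule line

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // drop comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === 'user-agent') {
      // A user-agent line after rule lines starts a new group.
      if (inRules) { group = []; inRules = false; }
      const ua = value.toLowerCase();
      group.push(ua);
      if (!rules.has(ua)) rules.set(ua, { allow: [], disallow: [] });
    } else if (field === 'allow' || field === 'disallow') {
      inRules = true;
      for (const ua of group) rules.get(ua)[field].push(value);
    }
  }

  const result = {};
  for (const name of crawlers) {
    const entry = rules.get(name.toLowerCase());
    if (!entry) result[name] = 'unmentioned';
    else if (entry.disallow.includes('/')) result[name] = 'blocked';
    else result[name] = 'mentioned';
  }
  return result;
}
```

The sketch deliberately ignores wildcard groups (`User-agent: *`), path-precedence rules, and `Allow` overrides, so 'blocked' here only means the agent's own group contains `Disallow: /`.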

npm test passes — fixtures only reference ClaudeBot/GPTBot.

Without these, the robots-txt checker can't tell when a site blocks
or only allows them.

- Claude-User, Claude-SearchBot — https://support.claude.com/en/articles/8896518
- OAI-SearchBot — https://platform.openai.com/docs/bots
- Applebot-Extended — https://support.apple.com/en-us/119829
- Meta-ExternalAgent, Meta-ExternalFetcher — https://developers.facebook.com/docs/sharing/bot/
- Perplexity-User — https://docs.perplexity.ai/guides/bots