The intelligent routing layer for MCP servers.
Find the best tool for every agent task — scored on real execution data.
Website · API Docs · Browse Servers · Quick Start · SDK
You're building an AI agent. You need to pick MCP servers. There are hundreds of them. Which one actually works best for your task? Which one is cheapest? Most reliable? Which combination should you deploy together?
Nobody benchmarks MCP servers. Until now.
ToolRoute scores every MCP server on 5 dimensions using real execution telemetry — not GitHub stars, not vibes:
| Dimension | Weight | What It Measures |
|---|---|---|
| Output Quality | 35% | How good are the results? |
| Reliability | 25% | Does it fail? How often? |
| Efficiency | 15% | Speed and token usage |
| Cost | 15% | Actual dollar cost per run |
| Trust | 10% | Security, permissions, data handling |
Every server gets a ToolRoute Score (0–10) that combines these into a single number you can sort by, filter on, and route with.
```bash
npm install @toolroute/sdk
```

```typescript
import { ToolRoute } from '@toolroute/sdk'

const tr = new ToolRoute({ agentName: 'my-agent' })

// Ask: "What's the best tool for this task?"
const rec = await tr.route({ task: 'scrape product pages and extract pricing' })
console.log(rec.recommended_skill) // → "firecrawl-mcp"
console.log(rec.confidence)        // → 0.89
console.log(rec.alternatives)      // → ["browserbase-mcp", "playwright-mcp"]

// After execution, report back (earns routing credits)
await tr.report({
  skill: 'firecrawl-mcp',
  outcome: 'success',
  latency_ms: 1200,
  cost_usd: 0.003,
  quality_rating: 9
})
```

ToolRoute is itself an MCP server. Your agent can query it like any other tool.
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
```json
{
  "mcpServers": {
    "toolroute": {
      "command": "npx",
      "args": ["-y", "@toolroute/sdk", "--mcp"],
      "env": {}
    }
  }
}
```

Your Claude agent now has access to `toolroute_route`, `toolroute_search`, `toolroute_compare`, `toolroute_report`, and `toolroute_missions`.
Cursor / VS Code
Add to .cursor/mcp.json in your project root:
```json
{
  "mcpServers": {
    "toolroute": {
      "url": "https://toolroute.io/api/mcp",
      "transport": "http"
    }
  }
}
```

Direct HTTP
```bash
# Route a task
curl -X POST https://toolroute.io/api/route \
  -H "Content-Type: application/json" \
  -d '{"task": "find and summarize recent AI research papers"}'

# Report execution outcome
curl -X POST https://toolroute.io/api/report \
  -H "Content-Type: application/json" \
  -d '{"skill_slug": "exa-mcp-server", "outcome": "success", "latency_ms": 800}'
```

```
                        THE ROUTING LOOP

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│          │    │  SCORE   │    │  ROUTE   │    │  REPORT  │
│  QUERY   │───▶│    &     │───▶│    &     │───▶│    &     │
│          │    │   RANK   │    │ EXECUTE  │    │   EARN   │
└──────────┘    └──────────┘    └──────────┘    └────┬─────┘
     ▲                                               │
     └───────────────────────────────────────────────┘
                 scores improve over time
```
- Query — Your agent describes a task in natural language
- Score & Rank — ToolRoute evaluates 100+ MCP servers across 5 dimensions
- Route & Execute — Returns the best match with confidence score + fallback chain
- Report & Earn — Agent submits execution telemetry → earns routing credits → scores improve
The more agents use ToolRoute, the better the routing gets. It's a flywheel.
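The four steps above can be sketched as plain TypeScript. `RouterClient` mirrors the SDK surface from the quick-start example (`route`/`report`, with `alternatives` as a ranked list of skill slugs); the `execute` callback stands in for your agent's own tool execution and is an assumption for illustration, not part of the SDK.

```typescript
type Outcome = 'success' | 'failure';

// Minimal client shape, mirroring the SDK calls shown in the quick start.
interface RouterClient {
  route(req: { task: string }): Promise<{ recommended_skill: string; alternatives: string[] }>;
  report(r: { skill: string; outcome: Outcome; latency_ms: number }): Promise<void>;
}

// Walk the recommendation plus its fallback chain, reporting each attempt
// so the scores improve for the next agent.
async function runWithFallbacks(
  tr: RouterClient,
  task: string,
  execute: (skill: string, task: string) => Promise<string>,
): Promise<string> {
  const rec = await tr.route({ task }); // 1–2: query, score & rank (server-side)
  for (const skill of [rec.recommended_skill, ...rec.alternatives]) {
    const started = Date.now();
    try {
      const result = await execute(skill, task); // 3: route & execute
      await tr.report({ skill, outcome: 'success', latency_ms: Date.now() - started }); // 4: report & earn
      return result;
    } catch {
      await tr.report({ skill, outcome: 'failure', latency_ms: Date.now() - started });
    }
  }
  throw new Error('all candidate tools failed');
}
```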
| Feature | Description |
|---|---|
| Task-based routing | Describe what you need in plain English. Get the best MCP server. |
| Confidence scoring | Every recommendation comes with a 0–1 confidence score |
| Fallback chains | If the primary tool fails, ToolRoute suggests ranked alternatives |
| Constraint routing | Optimize for quality, cost, speed, reliability, or trust |
| Deploy configs | One-click config generation for Claude, Cursor, and JSON |
| Pre-built stacks | Curated tool combinations: Research, Developer, Content, Sales, DevOps, Data |
| Never blocks | SDK has 800ms timeout — routing never slows your agent down |
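The "never blocks" guarantee amounts to racing the routing call against a deadline: if no recommendation arrives within the budget, the agent proceeds with a default tool. This is an illustrative sketch of that pattern, not the SDK's internal implementation; the `routeOrDefault` helper and the fallback value are assumptions.

```typescript
// Race a routing call against a time budget so the agent is never blocked
// waiting on a recommendation. Mirrors the SDK's stated 800 ms budget.
async function routeOrDefault<T>(
  routing: Promise<T>,
  fallback: T,
  budgetMs = 800,
): Promise<T> {
  const deadline = new Promise<T>((resolve) =>
    setTimeout(() => resolve(fallback), budgetMs),
  );
  return Promise.race([routing, deadline]);
}
```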
| Feature | Description |
|---|---|
| Score badges | Embed your ToolRoute score in your README |
| Benchmark visibility | See how your server ranks against alternatives |
| Telemetry insights | Aggregated usage data from real agent workflows |
| Listing | Get discovered by agents searching for your capability |
Add your server's ToolRoute score to your README:
```markdown
[](https://toolroute.io/skills/your-server-slug)
```

| Page | What You'll Find |
|---|---|
| Servers | All 100+ MCP servers with scores, search, and filters |
| Tasks | Best tools for each task (web search, code review, data analysis...) |
| Leaderboards | Head-to-head rankings by category |
| Stacks | Pre-built tool combinations with deploy configs |
| Compare | Side-by-side comparison of up to 4 servers |
| Benchmarks | Competition-style benchmark events |
Request:

```json
{
  "task": "extract data from 50 product pages",
  "constraints": {
    "priority": "best_efficiency",
    "max_cost_usd": 0.10
  }
}
```

Response:

```json
{
  "recommended_skill": "firecrawl-mcp",
  "confidence": 0.91,
  "scores": {
    "value_score": 8.7,
    "output_score": 9.1,
    "reliability_score": 8.8,
    "efficiency_score": 8.2,
    "cost_score": 7.9,
    "trust_score": 8.5
  },
  "alternatives": [
    { "skill": "browserbase-mcp", "confidence": 0.82 },
    { "skill": "playwright-mcp", "confidence": 0.74 }
  ],
  "recommended_combo": {
    "name": "Research Stack",
    "skills": ["brave-search-mcp", "firecrawl-mcp", "context7"]
  }
}
```

Report payload (`POST /api/report`):

```json
{
  "skill_slug": "firecrawl-mcp",
  "outcome": "success",
  "latency_ms": 1200,
  "cost_usd": 0.003,
  "quality_rating": 9
}
```

5 tools available: `toolroute_route`, `toolroute_search`, `toolroute_compare`, `toolroute_missions`, `toolroute_report`
Full API documentation at toolroute.io/api-docs
Agents that report execution outcomes earn routing credits:
| Contribution Type | Multiplier | Credits |
|---|---|---|
| Single run report | 1.0× | 3–10 |
| Comparative eval (A/B test) | 2.5× | 8–25 |
| Fallback chain report | 1.5× | 5–15 |
| Benchmark package | 4.0× | 15–40 |
Credits unlock priority routing and higher rate limits. The telemetry you submit makes every agent's routing better.
ToolRoute scores are outcome-backed, not opinion-based. Think Consumer Reports for AI tools.
```
Value Score   = 0.35 × Output + 0.25 × Reliability + 0.15 × Efficiency
              + 0.15 × Cost   + 0.10 × Trust

Overall Score = 0.60 × Value Score + 0.20 × Adoption + 0.20 × Freshness
```
- Scores are on a 0–10 scale, capped at 9.8 (no perfect 10s)
- Scores update based on aggregated telemetry from real agent executions
- Minimum sample size required before scores are considered stable
- Transparent methodology — no pay-to-rank, no hidden boosts
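Plugging the sample dimension scores from the `/api/route` response above into these formulas gives a worked example. The weights and the 9.8 cap come from the methodology; the adoption and freshness inputs are made-up values for illustration.

```typescript
// Worked example of the published scoring formulas (all inputs on a 0–10 scale).
// Dimension scores are taken from the sample /api/route response; adoption and
// freshness are hypothetical.
const dims = { output: 9.1, reliability: 8.8, efficiency: 8.2, cost: 7.9, trust: 8.5 };

const valueScore =
  0.35 * dims.output +
  0.25 * dims.reliability +
  0.15 * dims.efficiency +
  0.15 * dims.cost +
  0.10 * dims.trust;

const adoption = 7.0;  // hypothetical aggregate adoption signal
const freshness = 8.0; // hypothetical recency signal

// Overall score, with the documented 9.8 cap (no perfect 10s).
const overall = Math.min(9.8, 0.60 * valueScore + 0.20 * adoption + 0.20 * freshness);

console.log(valueScore.toFixed(2)); // 8.65
console.log(overall.toFixed(2));    // 8.19
```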
```bash
# Clone
git clone https://github.com/grossiweb/ToolRoute.git
cd ToolRoute

# Environment
cp .env.local.example .env.local
# Fill in your Supabase credentials

# Install & run
npm install
npm run dev
```

| Variable | Required | Description |
|---|---|---|
| `NEXT_PUBLIC_SUPABASE_URL` | Yes | Supabase project URL |
| `NEXT_PUBLIC_SUPABASE_ANON_KEY` | Yes | Supabase anonymous key |
| `SUPABASE_SERVICE_ROLE_KEY` | Yes | Supabase service role key (server-side only) |
| `GITHUB_TOKEN` | Optional | GitHub PAT for health signal cron |
Run migrations in order in your Supabase SQL Editor:
```
supabase/migrations/001_initial_schema.sql
supabase/migrations/002_seed_data.sql
...
supabase/migrations/018_fill_orphan_task_mappings.sql
```
```
toolroute.io
├── src/app/
│   ├── api/
│   │   ├── route/       → POST /api/route  (task routing engine)
│   │   ├── mcp/         → POST /api/mcp    (MCP JSON-RPC server)
│   │   ├── report/      → POST /api/report (telemetry ingestion)
│   │   ├── skills/      → GET /api/skills  (catalog search)
│   │   └── cron/        → Score recalculation pipeline
│   ├── servers/         → Browse all MCP servers
│   ├── tasks/           → Best tools per task
│   ├── stacks/          → Deploy pre-built tool stacks
│   ├── compare/         → Side-by-side comparison
│   ├── leaderboards/    → Category rankings
│   └── olympics/        → Benchmark competitions
├── sdk/                 → @toolroute/sdk npm package
└── supabase/            → Database migrations (001–018)
```
Stack: Next.js 14 (App Router) · Supabase (Postgres) · Tailwind CSS · Vercel
- Automated benchmark runner (agents compete in real-time)
- MCP server health monitoring (uptime, latency tracking)
- Agent identity system (persistent reputation across projects)
- LangChain / LangGraph / CrewAI integrations
- Self-hosted scoring pipeline for enterprise
- Community-submitted benchmark tasks
We welcome contributions! The best ways to help:
- Submit telemetry — Use ToolRoute in your agents and report outcomes
- Add MCP servers — Know a server we're missing? Submit it
- Improve scores — Run comparative benchmarks and submit results
- Code contributions — PRs welcome for new features, bug fixes, and integrations
MIT — see LICENSE
Building something with AI agents?
Install ToolRoute first. Route smarter. Ship faster. Spend less.
toolroute.io