Never stop coding. Save 15-95% eligible tokens with RTK+Caveman compression + auto-fallback to FREE & low-cost AI models.
The most complete open-source AI proxy — one endpoint, 160+ providers, 13 routing strategies, zero downtime. Multi-platform: Web, Desktop (Electron), Mobile (PWA + Termux). Fully extensible via MCP Server (37 tools), A2A Protocol, and Memory/Skills systems. Available in 40+ languages.
Chat Completions • Responses API • Embeddings • Image Generation • Video • Music • Audio Speech/Transcription • Reranking • Moderations • Web Search • MCP Server • A2A Protocol • 4,600+ Tests • 100% TypeScript
🔥 Limited offer: Sign up at AgentRouter and get $100 in free AI credits
Access GPT-5, Claude, Gemini, DeepSeek & 100+ models. No credit card required. Claim your credits →
🚀 Quick Start • 💡 Features • 🗜️ Compression • 💰 Pricing • 🎯 Use Cases • 🌍 Proxy • ❓ FAQ • 📖 Docs • 💬 WhatsApp
🌐 Available in: 🇺🇸 English | 🇧🇷 Português (Brasil) | 🇪🇸 Español | 🇫🇷 Français | 🇮🇹 Italiano | 🇷🇺 Русский | 🇨🇳 中文 (简体) | 🇩🇪 Deutsch | 🇮🇳 हिन्दी | 🇹🇭 ไทย | 🇺🇦 Українська | 🇸🇦 العربية | 🇯🇵 日本語 | 🇻🇳 Tiếng Việt | 🇧🇬 Български | 🇩🇰 Dansk | 🇫🇮 Suomi | 🇮🇱 עברית | 🇭🇺 Magyar | 🇮🇩 Bahasa Indonesia | 🇰🇷 한국어 | 🇲🇾 Bahasa Melayu | 🇳🇱 Nederlands | 🇳🇴 Norsk | 🇵🇹 Português (Portugal) | 🇷🇴 Română | 🇵🇱 Polski | 🇸🇰 Slovenčina | 🇸🇪 Svenska | 🇵🇭 Filipino | 🇨🇿 Čeština
Click to see dashboard screenshots
| Page | Screenshot |
|---|---|
| Providers | ![]() |
| Combos | ![]() |
| Analytics | ![]() |
| Health | ![]() |
| Translator | ![]() |
| Settings | ![]() |
| CLI Tools | ![]() |
| Usage Logs | ![]() |
| Endpoints | ![]() |
Connect any AI-powered IDE or CLI tool through OmniRoute — free API gateway for unlimited coding.
| | | | | |
|---|---|---|---|---|
| OpenClaw ⭐ 205K | NanoBot ⭐ 20.9K | PicoClaw ⭐ 14.6K | ZeroClaw ⭐ 9.9K | IronClaw ⭐ 2.1K |
| OpenCode ⭐ 106K | Codex CLI ⭐ 60.8K | Claude Code ⭐ 67.3K | Gemini CLI ⭐ 94.7K | Kilo Code ⭐ 15.5K |
📡 All agents connect via http://localhost:20128/v1 or http://cloud.omniroute.online/v1 — one config, unlimited models and quota
- 🇧🇷 Português: Guia completo do OmniRoute
- 🇺🇸 English: Complete OmniRoute walkthrough
- 🇷🇺 Русский: Полное руководство по OmniRoute
🎬 Made a video about OmniRoute? We'd love to feature it here! Open an issue or discussion with the link and we'll add it to this showcase.
Stop wasting money and tokens, and stop hitting limits:
❌ Subscription quota expires unused every month
❌ Rate limits stop you mid-coding
❌ Tool outputs (git diff, grep, ls...) burn tokens fast
❌ Expensive APIs ($20-50/month per provider)
❌ Manual switching between providers
❌ Each provider has a different API format
❌ AI providers blocked in your country
OmniRoute solves all of this:
✅ Prompt Compression — auto-compress prompts & tool outputs, save 15-95% eligible tokens per request with RTK+Caveman stacked mode
✅ Maximize subscriptions — track quota, use every bit before reset
✅ Auto fallback — Subscription → API Key → Cheap → Free, zero downtime
✅ Multi-account — round-robin between accounts per provider
✅ Format translation — OpenAI ↔ Claude ↔ Gemini ↔ Responses API, any tool works
✅ 3-level proxy — bypass geo-blocks with global, per-provider, and per-key proxies
✅ 10 multi-modal APIs — chat, images, video, music, audio, search in one endpoint
✅ MCP + A2A — 29 MCP tools + agent-to-agent protocol, production-ready
✅ Universal — works with Claude Code, Codex, Gemini CLI, Cursor, Cline, OpenClaw, any CLI tool
💬 Join our community! WhatsApp Group — Get help, share tips, and stay updated.
- Website: omniroute.online
- GitHub: github.com/diegosouzapw/OmniRoute
- Issues: github.com/diegosouzapw/OmniRoute/issues
- WhatsApp: Community Group
- Contributing: See CONTRIBUTING.md, open a PR, or pick a good first issue
- Original Project: 9router by decolua
When opening an issue, please run the system-info command and attach the generated file:
```bash
npm run system-info
```

This generates a `system-info.txt` with your Node.js version, OmniRoute version, OS details, installed CLI tools (qoder, gemini, claude, codex, antigravity, droid, etc.), Docker/PM2 status, and system packages — everything we need to reproduce your issue quickly. Attach the file directly to your GitHub issue.
OmniRoute works seamlessly with 16+ AI coding tools — one config, all tools:
| | | | | | |
|---|---|---|---|---|---|
| Claude Code<br>Anthropic | Codex CLI<br>OpenAI | Gemini CLI | Cursor<br>IDE | OpenClaw<br>CLI | Antigravity<br>VS Code |
| Cline<br>Extension | Continue<br>Extension | Kilo Code<br>Extension | Kiro<br>AWS IDE | OpenCode<br>CLI | Droid<br>CLI |
| AMP<br>CLI | Copilot<br>GitHub | Windsurf<br>IDE | Hermes<br>CLI | Qwen CLI<br>Alibaba | Custom<br>Any tool |
📖 Full setup for each tool: docs/CLI-TOOLS.md
| | | | | |
|---|---|---|---|---|
| Claude Code<br>Anthropic OAuth | Antigravity<br>Google OAuth | Codex<br>OpenAI OAuth | GitHub Copilot<br>GitHub OAuth | Cursor<br>Cursor OAuth |
| Kimi Coding<br>Moonshot OAuth | Kilo Code<br>Kilo OAuth | Cline<br>Cline OAuth | | |
| | | | |
|---|---|---|---|
| 🟢 Kiro AI<br>Claude Sonnet/Haiku<br>Unlimited FREE | 🟢 Qoder AI<br>Kimi-K2, DeepSeek-R1<br>Unlimited FREE | 🟢 Pollinations<br>GPT-5, Claude, Llama 4<br>No API key needed | 🟢 Qwen Code<br>Qwen3 Coder Plus<br>Unlimited FREE |
| 🟢 LongCat AI<br>Flash-Lite<br>50M tokens/day | 🟢 Cloudflare AI<br>50+ models<br>10K neurons/day | 🟢 Puter AI<br>GPT-4.1, Claude<br>Rate-limited free | 🟢 NVIDIA NIM<br>Llama, Mistral<br>1K req/day free |
| | | | | | |
|---|---|---|---|---|---|
| OpenAI | Anthropic | Gemini | DeepSeek | Groq | xAI (Grok) |
| Mistral | OpenRouter | GLM | Kimi | MiniMax | Fireworks |
| Together AI | Cerebras | Cohere | NVIDIA | Perplexity | SiliconFlow |
| Nebius | HuggingFace | DeepInfra | SambaNova | Vertex AI | Azure OpenAI |
| AWS Bedrock | Snowflake | Databricks | Venice.ai | AI21 Labs | Meta Llama |
...and 90+ more providers
Alibaba · Amazon Q · AssemblyAI · Baidu Qianfan · Baseten · Black Forest Labs · Blackbox · Brave Search · Bytez · CablyAI · Cartesia · ChatGPT Web · Chutes.ai · Clarifai · Codestral · CrofAI · DataRobot · Deepgram · ElevenLabs · Empower · Exa Search · Fal.ai · Featherless AI · FenayAI · FriendliAI · Galadriel · GigaChat · GitLab Duo · GLHF Chat · GoAPI · Heroku AI · Hyperbolic · IBM watsonx · Inference.net · Inworld · Jina AI · Kilo Gateway · Lambda AI · LaoZhang · Linkup Search · LlamaGate · Maritalk · Modal · Moonshot AI · Morph · Muse Spark · NanoBanana · NanoGPT · NLP Cloud · Nous Research · Novita AI · nScale · OCI · Ollama Cloud · OVHcloud · PiAPI · PlayHT · Poe · Predibase · PublicAI · Qwen Code · Recraft · Reka · Runway · SAP · Scaleway · SearchAPI · SearXNG · Serper · Stability AI · Synthetic · Tavily · TheB.AI · Topaz · Upstage · v0 (Vercel) · Vercel AI Gateway · Volcengine · Voyage AI · W&B Inference · Xiaomi MiMo · You.com · Z.AI · + OpenAI/Anthropic-compatible custom endpoints
| | | | | |
|---|---|---|---|---|
| LM Studio | Ollama | vLLM | Llamafile | Docker Model Runner |
| NVIDIA Triton | XInference | oobabooga | ComfyUI | SD WebUI |
```
┌─────────────┐
│  Your CLI   │  (Claude Code, Codex, Gemini CLI, OpenClaw, Cursor, Cline...)
│    Tool     │
└──────┬──────┘
       │ http://localhost:20128/v1
       ↓
┌──────────────────────────────────────────────────┐
│            OmniRoute (Smart Router)              │
│  • 🗜️ Prompt Compression (save 15-95% eligible)  │
│  • Format translation (OpenAI ↔ Claude ↔ Gemini) │
│  • Quota tracking + Embeddings + Images          │
│  • Auto token refresh + Rate limit management    │
└──────┬───────────────────────────────────────────┘
       │
       ├─→ [Tier 1: SUBSCRIPTION] Claude Code, Codex, Gemini CLI
       │        ↓ quota exhausted
       ├─→ [Tier 2: API KEY] DeepSeek, Groq, xAI, Mistral, NVIDIA NIM, etc.
       │        ↓ budget limit
       ├─→ [Tier 3: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
       │        ↓ budget limit
       └─→ [Tier 4: FREE] Qoder, Qwen, Kiro (unlimited)
```
Result: Never stop coding, minimal cost + 15-95% eligible token savings
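Conceptually, the tier walk is an ordered fallback loop. The sketch below is illustrative only: the tier names mirror the diagram, but the `hasQuota` callback and provider ids are invented, not OmniRoute's actual internals.

```typescript
type Tier = { name: string; providers: string[] };

// Hypothetical 4-tier chain mirroring the diagram above.
const tiers: Tier[] = [
  { name: "SUBSCRIPTION", providers: ["claude-code", "codex", "gemini-cli"] },
  { name: "API KEY", providers: ["deepseek", "groq", "xai"] },
  { name: "CHEAP", providers: ["glm", "minimax"] },
  { name: "FREE", providers: ["qoder", "qwen", "kiro"] },
];

// Walk tiers in order; fall through while providers are out of quota/budget.
function route(hasQuota: (provider: string) => boolean): string | null {
  for (const tier of tiers) {
    for (const provider of tier.providers) {
      if (hasQuota(provider)) return provider;
    }
  }
  return null; // every tier exhausted
}

// Example: only the free tier still has quota left.
console.log(route((p) => p === "kiro")); // → "kiro"
```

The real router layers circuit breakers, cooldowns, and budget tracking on top of this basic walk, but the ordering principle is the same.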
Why use many token when few token do trick? OmniRoute's built-in compression pipeline reduces token usage before requests reach the provider. It combines ideas from RTK (Rust Token Killer) and Caveman (⭐ 51K+).
Every request passes through the compression pipeline transparently — no client changes needed:
```
┌──────────────────┐     ┌─────────────────────────────┐     ┌──────────────┐
│  Client sends    │────▶│   OmniRoute Compression     │────▶│   Provider   │
│  full prompt     │     │   Pipeline (7 options)      │     │   receives   │
│  (10,000 tok)    │     │                             │     │  compressed  │
│                  │     │  🪶 Lite ........... ~15%   │     │ (~1,080 tok) │
│                  │     │  🪨 Standard ....... ~30%   │     │              │
│                  │     │  ⚡ Aggressive ..... ~50%   │     │ 💰 up to 95% │
│                  │     │  🔥 Ultra .......... ~75%   │     │              │
│                  │     │  🧰 RTK ............ 60-90% │     │              │
│                  │     │  🔗 Stacked ........ 78-95% │     │              │
└──────────────────┘     └─────────────────────────────┘     └──────────────┘
```
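To give a flavor of what the lighter passes do, here is a toy "lite"-style pass (illustrative only, not OmniRoute's actual pipeline code): collapse runs of whitespace and drop exact duplicate lines.

```typescript
// Toy "lite"-style compression: whitespace collapse + duplicate-line dedup.
// Invented for illustration; the real pipeline has many more passes.
function liteCompress(prompt: string): string {
  const seen = new Set<string>();
  return prompt
    .split("\n")
    .map((line) => line.replace(/[ \t]+/g, " ").trim()) // collapse whitespace
    .filter((line) => {
      if (line === "") return false; // drop blank lines
      if (seen.has(line)) return false; // drop exact duplicate lines
      seen.add(line);
      return true;
    })
    .join("\n");
}

const input = "run   tests\nrun   tests\n\n  build    ok  ";
console.log(liteCompress(input)); // → "run tests\nbuild ok"
```

Passes like this are safe to leave always-on because they never touch the semantic content of the prompt.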
| Mode | Savings | Technique | Best For |
|---|---|---|---|
| Off | 0% | No compression | When you need exact prompts |
| 🪶 Lite | ~15% | Whitespace collapse, dedup system prompts, image URL shortening | Always-on safe default |
| 🪨 Standard (Caveman) | ~30% | 30+ regex rules: filler removal, context condensation, structural compression, multi-turn dedup | Daily coding with Claude/Codex |
| ⚡ Aggressive | ~50% | All standard + progressive message aging + tool result summarization + LLM-based compression | Long sessions with many tool calls |
| 🔥 Ultra | ~75% | All aggressive + heuristic token pruning + stopword removal + score-based filtering | Maximum savings when tokens are scarce |
| 🧰 RTK | 60-90% | 49 command-aware filters, RTK-style JSON DSL, verify gate, trust-gated custom filters | Shell/test/build/git output in agents |
| 🔗 Stacked | 78-95% | RTK first, then Caveman input condensation; ~89% with upstream average math | Mixed prompts with tool logs + prose |
These numbers are based on the upstream project READMEs under _references/_outros:
| Source | Upstream claim used by OmniRoute docs |
|---|---|
| Caveman | ~75% fewer output tokens; benchmark average 65% output savings, range 22-87%; ~46% input compression tool |
| RTK | 60-90% command-output token savings; sample session ~118,000 -> ~23,900 tokens, which is 79.7% saved (~80%) |
For the default stacked compression combo, OmniRoute runs `RTK -> Caveman`. When both engines can act on the same tool/context payload, the savings compound:

```
combined = 1 - (1 - RTK savings) * (1 - Caveman input savings)
average  = 1 - (1 - 0.80) * (1 - 0.46) = 89.2%
range    = 1 - (1 - 0.60..0.90) * (1 - 0.46) = 78.4-94.6%
```

Caveman output mode is separate from prompt compression. When enabled for responses, use Caveman's own upstream output numbers: 65% average, ~75% headline, 22-87% observed range. Total bill savings depend on the prompt/output mix, but coding-agent sessions are often tool-context heavy, so the `RTK -> Caveman` combo is the best default for maximum context savings.
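The compounding arithmetic above can be checked in a few lines; this is a direct transcription of the formula using the upstream averages, nothing more.

```typescript
// Compound savings when two compression stages act on the same payload:
// combined = 1 - (1 - first) * (1 - second)
function stackedSavings(first: number, second: number): number {
  return 1 - (1 - first) * (1 - second);
}

// Upstream averages: RTK ~80%, Caveman input compression ~46%.
const avg = stackedSavings(0.8, 0.46);
console.log((avg * 100).toFixed(1) + "%"); // → "89.2%"

// RTK's 60-90% range yields the stacked 78.4-94.6% range.
console.log((stackedSavings(0.6, 0.46) * 100).toFixed(1) + "%"); // → "78.4%"
console.log((stackedSavings(0.9, 0.46) * 100).toFixed(1) + "%"); // → "94.6%"
```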
🗣️ Before compression (69 tokens):
"The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle. When you pass an inline object as a prop, React's shallow comparison sees it as a different object every time, which triggers a re-render. I would recommend using useMemo to memoize the object."
🪨 After compression (19 tokens):
"New object ref each render. Inline object prop = new ref = re-render. Wrap in useMemo."
Same answer. 72% fewer tokens. Zero accuracy loss.
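In the same spirit, a Caveman-style pass is just an ordered list of regex rewrites. The rules below are invented for illustration (the real engine ships 30+ tuned rules plus code/URL/JSON preservation):

```typescript
// Toy Caveman-style pass: a few invented filler-removal regex rules,
// applied in order. Illustrative only.
const fillerRules: [RegExp, string][] = [
  [/\b(the reason|is likely because|I would recommend)\b/gi, ""], // drop filler
  [/\bin order to\b/gi, "to"], // shorten common phrases
  [/\s{2,}/g, " "], // collapse whitespace left behind
];

function cavemanish(text: string): string {
  let out = text;
  for (const [pattern, replacement] of fillerRules) {
    out = out.replace(pattern, replacement);
  }
  return out.trim();
}

console.log(cavemanish("I would recommend using useMemo in order to memoize it."));
// → "using useMemo to memoize it."
```

Rule order matters: whitespace collapse runs last so it cleans up gaps the earlier deletions leave behind.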
```
Request Body
 │
 ├─ strategySelector.ts ─── Picks mode (config / combo override / auto-trigger)
 │
 ├─ lite.ts ─────────────── Whitespace, dedup, image URLs, redundant content
 ├─ caveman.ts ──────────── 30+ regex rules via cavemanRules.ts
 │   └─ preservation.ts ─── Protects code blocks, URLs, JSON from compression
 ├─ engines/rtk/ ────────── Command detection + JSON DSL filters + raw-output recovery
 ├─ engines/registry.ts ─── Shared engine registry for caveman, RTK, and stacked
 ├─ aggressive.ts ───────── Summarizer + tool result compressor + progressive aging
 │   ├─ summarizer.ts ───── Rule-based message summarization
 │   ├─ toolResultCompressor.ts ── file/grep/shell/JSON/error compression
 │   └─ progressiveAging.ts ────── Older messages → shorter summaries
 └─ ultra.ts ────────────── Heuristic token scoring + pruning
     └─ ultraHeuristic.ts ─ Stopword detection, score thresholds, force-preserve
```
Dashboard → Context & Cache → Caveman / RTK / Compression Combos
Or per-combo override:
```json
{
  "comboOverrides": {
    "my-coding-combo": "standard",
    "my-cheap-combo": "ultra"
  }
}
```

Auto-trigger: set `autoTriggerTokens` to automatically enable compression when a request exceeds a token threshold.
Compression combos can also assign a named compression pipeline to routing combos, so a coding combo can use RTK + Caveman while a paid subscription combo stays on lite mode.
🪨 Fun fact: The standard/caveman mode is inspired by Caveman — the viral project that reports 65% average output-token savings while keeping technical accuracy. OmniRoute takes this further with a 7-option pipeline and a default `RTK -> Caveman` combo that can reach ~89% average savings on eligible tool/context payloads.
📖 Full compression documentation: docs/COMPRESSION_GUIDE.md • docs/RTK_COMPRESSION.md • docs/COMPRESSION_ENGINES.md • docs/COMPRESSION_RULES_FORMAT.md • docs/COMPRESSION_LANGUAGE_PACKS.md
Every developer using AI tools faces these problems daily. OmniRoute solves them all.
| # | Problem | OmniRoute Solution |
|---|---|---|
| 💸 | Subscription quota expires mid-coding | Smart 4-Tier Fallback — auto-routes Subscription → API Key → Cheap → Free |
| 🔌 | Each provider has a different API format | Format Translation — unified endpoint translates OpenAI ↔ Claude ↔ Gemini ↔ Responses |
| 🌐 | AI providers block my country/region | 3-Level Proxy — global, per-provider, and per-key proxy with TLS fingerprint spoofing |
| 🆓 | Can't afford AI subscriptions | 11 Free Providers — Kiro, Qoder, Pollinations, LongCat, Cloudflare AI, NVIDIA NIM... |
| 🔒 | Gateway is exposed without protection | API Key Management — scoping, rotation, IP filtering, rate limiting, prompt injection guard |
| 🛑 | Provider went down, lost coding flow | Circuit Breakers — auto-failover with cooldown, retry, anti-thundering herd |
| 🔧 | Configuring each CLI tool is tedious | CLI Tools Dashboard — one-click setup for Claude Code, Codex, Cursor, OpenClaw, Kilo |
| 🔑 | Managing OAuth tokens is hell | Auto Token Refresh — OAuth PKCE for 8 providers, multi-account, LAN/remote fix |
| 📊 | Don't know how much I'm spending | Cost Analytics — per-token tracking, budget limits, usage stats per API key |
| 🐛 | Can't diagnose errors in AI calls | Unified Logs — 4-tab dashboard (request, proxy, audit, console) + p50/p95/p99 telemetry |
📖 See all 31 problems OmniRoute solves
| # | Problem | Solution |
|---|---|---|
| 11 | Deploying/maintaining is complex | npm global, Docker multi-arch, Electron, Termux — deploy anywhere |
| 12 | Interface is English-only | 40+ languages with RTL support |
| 13 | Need more than chat (images, audio, video) | 10 multi-modal APIs: embeddings, images, video, music, TTS, STT, moderation, rerank, search, batch |
| 14 | No way to test/compare models | LLM Evals, Translator Playground, Chat Tester, Live Monitor |
| 15 | Need to scale without losing performance | Semantic cache, request dedup, rate limit detection, queue & pacing |
| 16 | Want to control model behavior globally | System prompt injection, thinking budget, wildcard routing |
| 17 | Need MCP tools as first-class features | 29 MCP tools, 3 transports (stdio/SSE/HTTP), 10 scopes, audit trail |
| 18 | Need A2A orchestration | JSON-RPC 2.0 + SSE streaming, task lifecycle, sync + stream paths |
| 19 | Need real MCP process health | Runtime heartbeat, PID tracking, UI status cards |
| 20 | Need auditable MCP execution | SQLite-backed audit with filters, pagination, stats |
| 21 | Need scoped MCP permissions | 10 granular scopes per integration |
| 22 | Need operational controls without redeploying | Combo switches, resilience tuning, breaker resets from dashboard |
| 23 | Need A2A task lifecycle visibility | Task listing/filtering, drill-down, cancellation |
| 24 | Need active stream metrics | Active stream counters, per-state counts, A2A dashboard cards |
| 25 | Need standard agent discovery | Agent Card at /.well-known/agent.json |
| 26 | Need protocol discoverability | Consolidated Endpoints page with Proxy, MCP, A2A, API tabs |
| 27 | Need E2E protocol validation | Real MCP SDK + A2A client flows in test:protocols:e2e |
| 28 | Need unified observability | Health + audit + telemetry across OpenAI, MCP, and A2A layers |
| 29 | Need one runtime for proxy + tools + agents | OpenAI proxy + MCP + A2A in one stack with shared auth/resilience |
| 30 | Need agentic workflows without glue-code | Unified endpoint, protocol UIs, production-ready foundations |
| 31 | Long sessions crash with context limits | Proactive context compression, structural integrity guards, multi-layer dropping |
📖 Deep dives: Resilience Guide • Proxy Guide • Setup Guide • Compression Guide
Set up AI coding in minutes at $0/month. Connect these free accounts and use the built-in Free Stack combo.
| Step | Action | Providers Unlocked |
|---|---|---|
| 1 | Connect Kiro (AWS Builder ID OAuth) | Claude Sonnet 4.5, Haiku 4.5 — unlimited |
| 2 | Connect Qoder (Google OAuth) | kimi-k2-thinking, qwen3-coder-plus, deepseek-r1... — unlimited |
| 3 | Connect Qwen (Device Code) | qwen3-coder-plus, qwen3-coder-flash... — unlimited |
| 4 | Connect Gemini CLI (Google OAuth) | gemini-3-flash, gemini-2.5-pro — 180K/mo free |
| 5 | `/dashboard/combos` → Free Stack ($0) template | Round-robin all free providers automatically |
Point any IDE/CLI to: http://localhost:20128/v1 · API Key: any-string · Done.
Optional extra coverage (also free): Groq API key (30 RPM free), NVIDIA NIM (40 RPM free, 70+ models), Cerebras (1M tok/day), LongCat API key (50M tokens/day!), Cloudflare Workers AI (10K Neurons/day, 50+ models).
```bash
npm install -g omniroute
omniroute
```

Dashboard opens at http://localhost:20128 · API at http://localhost:20128/v1.
- Dashboard → Providers → connect at least one provider (OAuth or API key)
- Dashboard → Endpoints → create an API key
- Dashboard → Combos → set your fallback chain (optional)
```
Base URL: http://localhost:20128/v1
API Key:  [copy from Endpoint page]
Model:    if/kimi-k2-thinking (or any provider/model)
```

Works with Claude Code, Codex CLI, Gemini CLI, Cursor, Cline, OpenClaw, OpenCode, and any OpenAI-compatible tool.
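If you prefer raw HTTP over a CLI tool, the same endpoint accepts standard Chat Completions payloads. A minimal sketch follows: the base URL, `any-string` key, and `if/` model prefix come from the steps above, and the actual network call is commented out because it needs a running instance.

```typescript
// Standard OpenAI-compatible Chat Completions payload aimed at OmniRoute.
const baseUrl = "http://localhost:20128/v1";

const body = {
  model: "if/kimi-k2-thinking", // any provider/model prefix works
  messages: [{ role: "user", content: "Explain useMemo in one line." }],
};

// Uncomment to send for real (requires a running OmniRoute instance):
// const res = await fetch(`${baseUrl}/chat/completions`, {
//   method: "POST",
//   headers: {
//     "Content-Type": "application/json",
//     Authorization: "Bearer any-string", // or a key from the Endpoints page
//   },
//   body: JSON.stringify(body),
// });
// console.log((await res.json()).choices[0].message.content);

console.log(`POST ${baseUrl}/chat/completions -> ${body.model}`);
```

Because the gateway speaks the OpenAI wire format, any SDK that accepts a custom base URL works the same way.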
📦 More install methods (Docker, source, Arch, Void, pnpm)
Docker:

```bash
docker run -d --name omniroute --restart unless-stopped -p 20128:20128 -v omniroute-data:/app/data diegosouzapw/omniroute:latest
```

From source:

```bash
cp .env.example .env && npm install
PORT=20128 DASHBOARD_PORT=20129 NEXT_PUBLIC_BASE_URL=http://localhost:20129 npm run dev
```

pnpm: `pnpm install -g omniroute && pnpm approve-builds -g && omniroute`
Arch Linux (AUR): `yay -S omniroute-bin && systemctl --user enable --now omniroute.service`
MCP: omniroute --mcp (stdio transport)
CLI options: omniroute --port 3000, omniroute --no-open, omniroute --help
Split-port mode: PORT=20128 DASHBOARD_PORT=20129 omniroute
Uninstall: npm run uninstall (keeps data) or npm run uninstall:full (removes everything)
📖 Full details: Setup Guide · Docker · Void Linux template
OmniRoute is available as a public Docker image on Docker Hub.
Quick run:
```bash
docker run -d \
  --name omniroute \
  --restart unless-stopped \
  --stop-timeout 40 \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest
```

With environment file:
```bash
# Copy and edit .env first
cp .env.example .env

docker run -d \
  --name omniroute \
  --restart unless-stopped \
  --stop-timeout 40 \
  --env-file .env \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest
```

Using Docker Compose:
```bash
# Base profile (no CLI tools)
docker compose --profile base up -d

# CLI profile (Claude Code, Codex, OpenClaw built-in)
docker compose --profile cli up -d
```

Dashboard support for Docker deployments now includes a one-click Cloudflare Quick Tunnel on Dashboard → Endpoints. The first enable downloads cloudflared only when needed, starts a temporary tunnel to your current /v1 endpoint, and shows the generated https://*.trycloudflare.com/v1 URL directly below your normal public URL. Endpoint tunnel panels, including Cloudflare, Tailscale, and ngrok, can be shown or hidden from Settings → Appearance without changing active tunnel state.
Notes:
- Quick Tunnel URLs are temporary and change after every restart.
- Quick Tunnels are not auto-restored after an OmniRoute or container restart. Re-enable them from the dashboard when needed.
- Managed install currently supports Linux, macOS, and Windows on x64/arm64.
- Managed Quick Tunnels default to HTTP/2 transport to avoid noisy QUIC UDP buffer warnings in constrained container environments. Set `CLOUDFLARED_PROTOCOL=quic` or `auto` if you want a different transport.
- Docker images bundle system CA roots and pass them to managed `cloudflared`, which avoids TLS trust failures when the tunnel bootstraps inside the container.
- SQLite runs in WAL mode. `docker stop` should be allowed to finish so OmniRoute can checkpoint the latest changes back into `storage.sqlite`.
- The bundled Compose files already set a 40s stop grace period. If you run the image directly, keep `--stop-timeout 40` (or similar) so manual stops do not cut off shutdown cleanup.
- Set `CLOUDFLARED_BIN=/absolute/path/to/cloudflared` if you want OmniRoute to use an existing binary instead of downloading one.
Using Docker Compose with Caddy (HTTPS Auto-TLS):
OmniRoute can be securely exposed using Caddy's automatic SSL provisioning. Ensure your domain's DNS A record points to your server's IP.
```yaml
services:
  omniroute:
    image: diegosouzapw/omniroute:latest
    container_name: omniroute
    restart: unless-stopped
    volumes:
      - omniroute-data:/app/data
    environment:
      - PORT=20128
      - NEXT_PUBLIC_BASE_URL=https://your-domain.com

  caddy:
    image: caddy:latest
    container_name: caddy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    command: caddy reverse-proxy --from https://your-domain.com --to http://omniroute:20128

volumes:
  omniroute-data:
```

| Image | Tag | Size | Description |
|---|---|---|---|
| `diegosouzapw/omniroute` | `latest` | ~250MB | Latest stable release |
| `diegosouzapw/omniroute` | `3.7.8` | ~250MB | Current version |
📖 Full Docker documentation: docs/DOCKER_GUIDE.md — Compose profiles, Caddy HTTPS, Cloudflare tunnels, and more.
OmniRoute runs on Web, Desktop (Electron), Android (Termux), and as a Progressive Web App (PWA).
| Platform | Install | Highlights |
|---|---|---|
| 🖥️ Desktop | `npm run electron:build` | Native window, system tray, auto-start, offline mode — Windows/macOS/Linux |
| 📱 Android | `pkg install nodejs-lts && npx -y omniroute` | ARM native, no root, 24/7 via Termux:Boot — your phone is an AI server |
| 📲 PWA | "Add to Home Screen" in browser | Fullscreen, offline page, service worker caching — Android/iOS/Desktop |
🖥️ Desktop App details
- Native Electron app with system tray, auto-start, native notifications
- One-click install: NSIS (Windows), DMG (macOS), AppImage (Linux)
- Dev: `npm run electron:dev` · Build: `npm run electron:build`
- 📖 Full docs: electron/README.md
📱 Android (Termux) details
```bash
pkg update && pkg install nodejs-lts python build-essential git
npx -y omniroute@latest
```

Access from any device on the same network: http://PHONE_IP:20128/v1

- 📖 Full guide: docs/TERMUX_GUIDE.md
📲 PWA details
- Android (Chrome): ⋮ → "Add to Home screen"
- iOS (Safari): Share → "Add to Home Screen"
- Desktop (Chrome/Edge): Install icon in address bar
- 📖 Full docs: docs/PWA_GUIDE.md
🇷🇺 🇨🇳 🇮🇷 🇨🇺 🇹🇷 In Russia, China, Iran, or any blocked region? OmniRoute's 3-level proxy system solves this completely.
| Level | Badge | Configure In | Use Case |
|---|---|---|---|
| Global | 🟢 | Settings → Proxy | All traffic through one proxy |
| Per-Provider | 🟡 | Provider → Proxy | Only specific providers proxied |
| Per-Connection | 🔵 | Connection → Proxy | Each API key uses its own proxy |
What gets proxied: API requests ✅ • OAuth flows ✅ • Connection tests ✅ • Token refresh ✅ • Model sync ✅
Protocols: HTTP/HTTPS, SOCKS5 (ENABLE_SOCKS5_PROXY=true), Authenticated proxies
No proxy? Use the built-in 1proxy integration for hundreds of free, validated proxies worldwide:
- One-click sync (up to 500 proxies) • Quality scores (0-100) • Country filter • Auto-rotation (quality/random/sequential) • Auto-degradation • Circuit breaker
- 🔒 TLS Fingerprint Spoofing — browser-like TLS via `wreq-js`
- 🔏 CLI Fingerprint Matching — matches native CLI binary signatures
- 🏠 Proxy IP Preservation — stealth + IP masking simultaneously
📖 Full proxy documentation: docs/PROXY_GUIDE.md
| Tier | Provider | Cost | Quota Reset | Best For |
|---|---|---|---|---|
| 💳 SUBSCRIPTION | Claude Code (Pro) | $20/mo | 5h + weekly | Already subscribed |
| | Codex (Plus/Pro) | $20-200/mo | 5h + weekly | OpenAI users |
| | Gemini CLI | FREE | 180K/mo + 1K/day | Everyone! |
| | GitHub Copilot | $10-19/mo | Monthly | GitHub users |
| 🔑 API KEY | NVIDIA NIM | FREE (dev forever) | ~40 RPM | 70+ open models |
| | Cerebras | FREE (1M tok/day) | 60K TPM / 30 RPM | World's fastest |
| | Groq | FREE (30 RPM) | 14.4K RPD | Ultra-fast Llama/Gemma |
| | DeepSeek V3.2 | $0.27/$1.10 per 1M | None | Best price/quality reasoning |
| | xAI Grok-4 Fast | $0.20/$0.50 per 1M 🆕 | None | Fastest + tool calling, ultralow |
| | xAI Grok-4 (standard) | $0.20/$1.50 per 1M 🆕 | None | Reasoning flagship from xAI |
| | Mistral | Free trial + paid | Rate limited | European AI |
| | OpenRouter | Pay-per-use | None | 100+ models aggr. |
| | AgentRouter 🆕 | Pay-per-use | None | $200 free credits at signup |
| 💰 CHEAP | GLM-5 (via Z.AI) 🆕 | $0.5/1M | Daily 10AM | 128K output, newest flagship |
| | GLM-4.7 | $0.6/1M | Daily 10AM | Budget backup |
| | MiniMax M2.5 🆕 | $0.3/1M input | 5-hour rolling | Reasoning + agentic tasks |
| | MiniMax M2.1 | $0.2/1M | 5-hour rolling | Cheapest option |
| | Kimi K2.5 (Moonshot API) 🆕 | Pay-per-use | None | Direct Moonshot API access |
| | Kimi K2 | $9/mo flat | 10M tokens/mo | Predictable cost |
| 🆓 FREE | Qoder | $0 | Unlimited | 5 models unlimited |
| | Qwen | $0 | Unlimited | 4 models unlimited |
| | Kiro | $0 | Unlimited | Claude Sonnet/Haiku (AWS Builder) |
| | LongCat Flash-Lite 🆕 | $0 (50M tok/day 🔥) | 1 RPS | Largest free quota on Earth |
| | Pollinations AI 🆕 | $0 (no key needed) | 1 req/15s | GPT-5, Claude, DeepSeek, Llama 4 |
| | Cloudflare Workers AI 🆕 | $0 (10K Neurons/day) | ~150 resp/day | 50+ models, global edge |
| | Scaleway AI 🆕 | $0 (1M tokens total) | Rate limited | EU/GDPR, Qwen3 235B, Llama 70B |
🆕 New models added (Mar 2026): Grok-4 Fast family at $0.20/$0.50/M (benchmarked at 1143ms — 30% faster than Gemini 2.5 Flash), GLM-5 via Z.AI with 128K output, MiniMax M2.5 reasoning, DeepSeek V3.2 updated pricing, Kimi K2.5 via Moonshot direct API.
💡 See the full $0 Free Stack (11 providers) below.
💡 Understanding Dashboard Costs:
The "cost" displayed in the Usage Analytics page is for tracking and comparison purposes only. OmniRoute itself never charges you anything — it's free, open-source software running on your machine. If your dashboard shows "$290 total cost" while using free models, that's how much you saved compared to paid API pricing. Think of it as a savings tracker, not a bill.
Combine all free providers into one unbreakable combo — OmniRoute auto-routes between them when quota runs out.
| Provider | Prefix | Free Models | Quota |
|---|---|---|---|
| Kiro | `kr/` | Claude Sonnet 4.5, Haiku 4.5, Opus 4.6 | 50 CREDITS per month |
| Qoder | `if/` | kimi-k2-thinking, qwen3-coder-plus, deepseek-r1, minimax-m2.1 | ♾️ Unlimited |
| Qwen | `qw/` | qwen3-coder-plus, qwen3-coder-flash, qwen3-coder-next | ♾️ Unlimited |
| Pollinations | `pol/` | GPT-5, Claude, Gemini, DeepSeek, Llama 4, Mistral | No key needed |
| LongCat | `lc/` | LongCat-Flash-Lite | 50M tokens/day 🔥 |
| Gemini CLI | `gc/` | gemini-3-flash, gemini-2.5-pro | 180K tok/mo |
| Cloudflare AI | `cf/` | 50+ models (Llama, Gemma, Mistral, Whisper) | 10K Neurons/day |
| Groq | `groq/` | Llama 3.3 70B, Qwen3 32B, Kimi K2 | 14.4K RPD |
| NVIDIA NIM | `nvidia/` | 129 models (DeepSeek, Llama, GLM, Kimi) | ~40 RPM |
| Cerebras | `cerebras/` | Qwen3 235B, GPT-OSS 120B, Llama 3.1 | 1M tok/day |
| Scaleway | `scw/` | Qwen3 235B, Llama 70B, DeepSeek V3 | 1M tokens (EU) |
📖 25+ more free providers — Groq, Cerebras, Mistral, GitHub Models, OpenRouter, and more
Also free (API Key required):
Mistral (1B tok/month) · OpenRouter (35+ :free models) · GitHub Models (GPT-5, 45+ models) ·
Cohere (1K calls/month) · Z.AI/GLM (permanent free Flash models) · SiliconFlow (1K RPM, 50K TPM) ·
Kilo Code (~200 req/hr auto-router) · HuggingFace ($0.10/mo credits) · Ollama Cloud (400+ models) ·
LLM7.io (30+ models) · Kluster AI · IBM watsonx (300K tok/month) · OpenCode Zen · Vercel AI Gateway ($5/mo)
Trial credits (one-time): Baseten ($30) · NLP Cloud ($15) · AI21 ($10) · Upstage ($10) · SambaNova ($5) · Modal ($5/mo) · Fireworks ($1) · Nebius ($1) · Inference.net ($1 + $25 survey) · Hyperbolic ($1) · Novita ($0.50)
China-based (free tiers): ModelScope · Tencent Hunyuan · Volcengine · ChatAnywhere · InternAI · Bigmodel
Combined capacity: ~31,000+ RPD · ~32B+ tokens/month · 500+ models · $0
📖 Complete free provider directory: docs/FREE_TIERS.md — 25+ providers, quotas, base URLs, model tables, and OmniRoute combo setup.
Transcribe any audio/video for $0 — Deepgram leads with $200 free, AssemblyAI $50 fallback, Groq Whisper as unlimited emergency backup.
| Provider | Free Credits | Best Model | Rate Limit |
|---|---|---|---|
| 🟢 Deepgram | $200 free (signup) | `nova-3` — best accuracy, 30+ languages | No RPM limit on free credits |
| 🔵 AssemblyAI | $50 free (signup) | `universal-3-pro` — chapters, sentiment, PII | No RPM limit on free credits |
| 🔴 Groq | Free forever | `whisper-large-v3` — OpenAI Whisper | 30 RPM (rate limited) |
Suggested combo in /dashboard/combos:
```
Name: free-transcription
Strategy: Priority
Nodes:
  [1] deepgram/nova-3            → uses $200 free first
  [2] assemblyai/universal-3-pro → fallback when Deepgram credits run out
  [3] groq/whisper-large-v3      → free forever, emergency fallback
```
Then in /dashboard/media → Transcription tab: upload any audio or video file → select your combo endpoint → get transcription in supported formats.
4,690+ automated tests across 517 test files. Not just a relay — a full operational platform.
| Feature | Why It Matters |
|---|---|
| 🧠 Smart 4-Tier Fallback — Subscription → API → Cheap → Free | Never stop coding, zero downtime |
| 🔄 Format Translation — OpenAI ↔ Claude ↔ Gemini ↔ Responses API | Works with ANY CLI tool |
| 🗜️ Prompt Compression — 7 options including Caveman, RTK, and stacked pipelines | Save 15-95% eligible tokens |
| 🤖 MCP Server — 37 tools, 3 transports (stdio/SSE/HTTP), 10 scopes | IDE/agent tool integration |
| 🛡️ Resilience Engine — circuit breakers, cooldowns, TLS spoofing, anti-thundering herd | Auto-recovery from any failure |
| 🎵 10 Multi-Modal APIs — chat, embed, images, video, music, TTS, STT, moderation, rerank, search | One endpoint for everything |
| 🌍 3-Level Proxy — global, per-provider, per-key + 1proxy free marketplace | Access AI from any country |
| 📊 Full Observability — unified logs, p50/p95/p99 telemetry, cost tracking, budget controls | Know exactly what's happening |
📋 Complete feature list — 30+ capabilities
Routing & Intelligence
- 13 balancing strategies (priority, weighted, round-robin, P2C, cost-optimized, context-relay...)
- Task-aware smart routing (coding/vision/analysis) · Context relay session handoffs
- Thinking budget controls (passthrough/auto/custom) · Wildcard routing · System prompt injection
Translation & Compatibility
- Auto token refresh (OAuth PKCE for 8 providers) · Multi-account round-robin
- Responses API — full `/v1/responses` for Codex · Batch API with Files API
- OpenAPI 3.0 live spec + Try-It UI
Protocols
- A2A Server — JSON-RPC 2.0, SSE streaming, task lifecycle, skills
- ACP — CLI agent discovery (14 agents + custom)
Platform
- Desktop (Electron) · Android (Termux) · PWA · Docker (AMD64 + ARM64)
- Cloudflare / Tailscale / ngrok tunnels · 40+ languages with RTL
- Semantic + signature cache (two-tier) · Request idempotency + deduplication
Observability
- Health dashboard — uptime, breakers, cache, lockouts
- Evaluation framework — golden set testing · Webhooks · Compliance audit
v3.6+ Highlights: V1 WebSocket Bridge · Sync Tokens & Config Bundle · GLM Thinking (glmt) · Hybrid Token Counting · Safe Outbound Fetch · Wait For Cooldown · Runtime Env Validation · Vision Bridge · Grok-4 Fast · GLM-5 via Z.AI · MiniMax M2.5 · toolCalling flag · Multilingual Intent Detection · Benchmark-Driven Fallbacks · Request Deduplication
Architecture Examples:
Combo: "my-coding-stack"
1. cc/claude-opus-4-7
2. nvidia/llama-3.3-70b
3. glm/glm-4.7
4. if/kimi-k2-thinking

Format Translation:
CLI → OpenAI format
OmniRoute → translates
Provider → native format

📖 MCP Server README · A2A Server README · Resilience Guide · Features Gallery
Problem: Quota expires unused, rate limits during heavy coding sessions.
Combo: "maximize-claude"
1. cc/claude-opus-4-7 (use subscription fully)
2. glm/glm-5.1 (cheap backup when quota out — $0.5/1M)
3. kr/claude-sonnet-4.5 (free emergency fallback via Kiro)
Compression: standard (caveman) — saves 30% tokens = stretch quota further
Monthly cost: $20 (subscription) + ~$3 (backup) = $23 total
vs. $20 + hitting limits + lost productivity = frustration
Problem: Can't afford subscriptions, need reliable AI for coding.
Combo: "free-forever"
1. kr/claude-sonnet-4.5 (Claude 4.5 free unlimited via Kiro)
2. if/kimi-k2-thinking (reasoning model free via Qoder)
3. pol/gpt-5 (GPT-5 free via Pollinations — no key)
4. lc/longcat-flash-lite (50M tokens/day free backup)
Compression: aggressive — saves 50% tokens = double your free quota
Monthly cost: $0
Quality: Production-ready models + 50% token savings
Problem: Deadlines, can't afford any downtime.
Combo: "always-on"
1. cc/claude-opus-4-7 (best quality — subscription)
2. cx/gpt-5.5 (second subscription — OpenAI)
3. glm/glm-5.1 (cheap, resets daily — $0.5/1M)
4. minimax/MiniMax-M2.5 (cheapest paid — $0.3/1M)
5. kr/claude-sonnet-4.5 (free unlimited — never fails)
Compression: lite — saves 15% tokens passively, zero risk
Result: 5 layers of fallback = zero downtime
Monthly cost: $20-200 (subscriptions) + $5-10 (backup)
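The layered-fallback idea above can be sketched as a simple sequential loop. This is an illustrative sketch only (the `ProviderCaller` abstraction and function names are hypothetical, not OmniRoute's internal API):

```typescript
// Illustrative sketch of a fallback combo: try each provider in priority
// order and return the first successful response. `ProviderCaller` is a
// hypothetical abstraction, not OmniRoute's real internals.
type ProviderCaller = (prompt: string) => Promise<string>;

async function callWithFallback(
  providers: { name: string; call: ProviderCaller }[],
  prompt: string
): Promise<{ provider: string; output: string }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      return { provider: p.name, output: await p.call(prompt) };
    } catch (err) {
      // Record the failure (quota, 429, outage) and fall through to the next layer.
      errors.push(`${p.name}: ${(err as Error).message}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

With five layers configured, a request only fails if every layer fails, which is why the free-tier anchor at the bottom of the chain matters.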
Problem: AI providers block my country, VPNs are slow.
Combo: "unblocked-ai"
1. kr/claude-sonnet-4.5 (free via Kiro + proxy)
2. pol/deepseek-r1 (Pollinations — no geo-block)
3. groq/llama-3.3-70b (Groq + proxy)
Proxy: Global proxy set in Settings → Proxy, or per-provider proxy override
Result: Access ALL providers from ANY country
Monthly cost: $0 (free providers) + $0 (1proxy free marketplace)
Problem: Token costs are eating my budget, need to squeeze every token.
Combo: "ultra-saver"
1. cc/claude-opus-4-7 (subscription — best quality)
2. glm/glm-5.1 (cheap backup)
Compression: ultra — saves 75% tokens
Result: 10K token prompt → 2.5K tokens sent
Monthly savings: ~$150-300 in token costs for heavy users
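The arithmetic behind these estimates is straightforward. A minimal sketch, using the illustrative figures from above (the price is the example $0.5/1M rate, not live pricing):

```typescript
// Estimate tokens actually sent and dollars saved for a given compression
// ratio. pricePerMTok is the provider's input price per 1M tokens.
function compressionSavings(promptTokens: number, savedRatio: number, pricePerMTok: number) {
  const sentTokens = Math.round(promptTokens * (1 - savedRatio));
  const savedTokens = promptTokens - sentTokens;
  const savedUsd = (savedTokens / 1_000_000) * pricePerMTok;
  return { sentTokens, savedTokens, savedUsd };
}

// Ultra mode (~75%) on a 10K-token prompt: 10,000 tokens in, 2,500 sent.
const ultra = compressionSavings(10_000, 0.75, 0.5);
```

Multiply the per-request saving by request volume to see where the monthly figures come from.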
OmniRoute includes a built-in evaluation framework to test LLM response quality against a golden set. Access it via Analytics → Evals in the dashboard.
The pre-loaded "OmniRoute Golden Set" contains test cases for:
- Greetings, math, geography, code generation
- JSON format compliance, translation, markdown generation
- Safety refusal (harmful content), counting, boolean logic
| Strategy | Description | Example |
|---|---|---|
| `exact` | Output must match exactly | `"4"` |
| `contains` | Output must contain substring (case-insensitive) | `"Paris"` |
| `regex` | Output must match regex pattern | `"1.*2.*3"` |
| `custom` | Custom JS function returns `true`/`false` | `(output) => output.length > 10` |
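The four strategies map naturally onto a small matcher. A sketch of how such checks could be implemented (illustrative only, not OmniRoute's internal evaluator):

```typescript
// One variant per evaluation strategy from the table above.
type Strategy =
  | { kind: "exact"; expected: string }
  | { kind: "contains"; expected: string }
  | { kind: "regex"; pattern: string }
  | { kind: "custom"; fn: (output: string) => boolean };

// Returns true when the model output passes the given check.
function evaluate(output: string, s: Strategy): boolean {
  switch (s.kind) {
    case "exact":
      return output === s.expected;
    case "contains":
      // Case-insensitive, matching the table's description.
      return output.toLowerCase().includes(s.expected.toLowerCase());
    case "regex":
      return new RegExp(s.pattern).test(output);
    case "custom":
      return s.fn(output);
  }
}
```

For example, `evaluate("The capital is Paris.", { kind: "contains", expected: "paris" })` passes, while an `exact` check against `"4"` only passes on the literal string `"4"`.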
Point any OpenAI-compatible tool to OmniRoute:
Base URL: http://localhost:20128/v1
API Key: [from Dashboard → Endpoints]

| Tool | Config Location |
|---|---|
| Claude Code | claude mcp add-server omniroute --type http --url http://localhost:20128/api/mcp/stream |
| Codex CLI | OPENAI_BASE_URL=http://localhost:20128/v1 OPENAI_API_KEY=your-key codex |
| Cursor | Settings → Models → Add Model → Override Base URL |
| Cline | Extension settings → Custom API Base URL |
| OpenClaw | OPENAI_BASE_URL=http://localhost:20128/v1 openclaw |
| Gemini CLI | Uses native OAuth via OmniRoute — connect in Providers |
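Any OpenAI-compatible client follows the same pattern: point the base URL at OmniRoute and send a standard Chat Completions body. A sketch of the request shape (the helper name is hypothetical; the path `/chat/completions` under the base URL is the standard OpenAI-style endpoint):

```typescript
// Build a standard OpenAI-style Chat Completions request targeting OmniRoute.
// Returns the URL and JSON body; actually sending it (fetch, axios, an SDK)
// is left to the caller.
function buildChatRequest(baseUrl: string, model: string, userPrompt: string) {
  return {
    url: `${baseUrl.replace(/\/$/, "")}/chat/completions`,
    body: {
      model, // a combo name or provider/model alias, e.g. "my-coding-stack"
      messages: [{ role: "user", content: userPrompt }],
      stream: false,
    },
  };
}

const req = buildChatRequest("http://localhost:20128/v1", "my-coding-stack", "Hello");
```

Because the request shape is unchanged, swapping a tool from a provider's API to OmniRoute is only a base-URL and key change.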
# MCP (stdio transport)
omniroute --mcp
# A2A (JSON-RPC 2.0)
curl http://localhost:20128/.well-known/agent.json

| Variable | Default | Purpose |
|---|---|---|
| `PORT` | `20128` | API and dashboard port |
| `DASHBOARD_PORT` | — | Separate dashboard port (split-port mode) |
| `REQUIRE_API_KEY` | `false` | Require API key for all requests |
| `DATA_DIR` | `~/.omniroute` | Database and config storage |
| `REQUEST_TIMEOUT_MS` | `600000` | Upstream response timeout |
📖 Full Setup Guide — All CLI tools, protocols, and environment variables
📖 Complete documentation:
- User Guide — Providers, combos, CLI integration
- API Reference — All endpoints with examples
- MCP Server — 37 tools, IDE configs
- A2A Server — JSON-RPC, skills, streaming
- Environment Config — Complete `.env` reference
- VM Deployment — VM + nginx + Cloudflare
📊 Why does my dashboard show high costs if I'm using free models?
The dashboard tracks your token usage and displays estimated costs as if you were using paid APIs directly. This is not actual billing — it's a reference to show how much you're saving.
Example:
- Dashboard shows: "$290 total cost"
- Reality: You're using Kiro + Qoder (FREE unlimited)
- Your actual cost: $0.00
- What $290 means: Amount you saved by using free models instead of paid APIs!
The cost display is a "savings tracker" to help you understand your usage patterns and optimization opportunities.
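In other words, the dashboard figure is token volume multiplied by a reference list price. A sketch of that math (the $2.9/1M rate is a hypothetical reference price chosen only to reproduce the $290 example; it is not any provider's actual pricing):

```typescript
// "Savings tracker" math: what the same traffic would have cost at paid
// list prices, versus what was actually billed (zero on free providers).
function estimatedSavings(totalTokens: number, referenceRatePerMTok: number, actualCostUsd: number) {
  const referenceCost = (totalTokens / 1_000_000) * referenceRatePerMTok;
  return { referenceCost, saved: referenceCost - actualCostUsd };
}

// 100M tokens at a hypothetical $2.9/1M reference rate, billed $0 on free tiers:
const s = estimatedSavings(100_000_000, 2.9, 0);
```

Here `s.referenceCost` is roughly $290 while the actual spend is $0, which is exactly the gap the dashboard surfaces.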
💳 Will I be charged by OmniRoute?
No. OmniRoute is free, open-source software that runs on your own computer. It never charges you anything.
You only pay:
- ✅ Subscription providers (Claude Code $20/mo, Codex $20-200/mo) → Pay them directly on their websites
- ✅ API key providers (DeepSeek, xAI, etc.) → Pay them directly, OmniRoute just routes your requests
- ❌ OmniRoute itself → Never charges anything, ever
OmniRoute is a local proxy/router. It doesn't have your credit card, can't send invoices, and has no billing system. It's completely free software.
🆓 Are FREE providers really unlimited?
Yes! The current FREE providers are genuinely free with no hidden charges:
- Kiro AI: Free unlimited Claude Sonnet/Haiku via AWS Builder ID / Google / GitHub OAuth
- Qoder: Free unlimited kimi-k2-thinking, qwen3-coder-plus, deepseek-r1 via PAT token
- Pollinations AI: No API key needed — GPT-5, Claude, DeepSeek, Llama 4
- LongCat Flash-Lite: 50M tokens/day — largest free quota available
- Cloudflare Workers AI: 10K Neurons/day — 50+ models at the edge
OmniRoute just routes your requests to them — there's no "catch" or future billing.
💰 How do I minimize my actual AI costs?
Free-First Strategy:
1. Start with a 100% free combo:
   1. kr/claude-sonnet-4.5 (Kiro — unlimited free)
   2. if/kimi-k2-thinking (Qoder — unlimited free)
   3. pol/gpt-5 (Pollinations — no key needed)
   Cost: $0/month
2. Enable Prompt Compression — even `lite` mode saves ~15% passively
3. Add a cheap backup only if you need it:
   4. glm/glm-5.1 ($0.5/1M tokens)
   Additional cost: only pay for what you actually use
4. Use subscription providers last — only if you already have them. OmniRoute helps maximize their value through quota tracking.
Result: Most users can operate at $0/month using only free tiers!
🗜️ Will compression affect response quality?
No. Compression only affects the input (your prompt), not the model's response. Each mode has been designed to preserve technical accuracy:
- Lite (~15%): Only whitespace/formatting — zero semantic change
- Standard (~30%): Removes filler words ("please", "I think", "basically") — same meaning
- Aggressive (~50%): Summarizes old messages + compresses tool outputs — core context preserved
- Ultra (~75%): Heuristic pruning — use only when token budget is critical
Code blocks, URLs, JSON, and structured data are always protected from compression via the preservation engine.
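As a toy illustration of the preservation idea, here is a `lite`-style pass that collapses whitespace runs while leaving fenced code blocks byte-for-byte intact. This is a sketch of the concept only; OmniRoute's actual engine is far more involved:

```typescript
// Toy "lite"-mode compressor: collapse whitespace outside ``` fences,
// preserving code blocks untouched (the preservation-engine idea).
function liteCompress(prompt: string): string {
  // Splitting with a capture group keeps the fences in the result:
  // even indices are prose, odd indices are code blocks.
  const parts = prompt.split(/(```[\s\S]*?```)/);
  return parts
    .map((part, i) =>
      i % 2 === 1
        ? part // code fence: leave exactly as-is
        : part.replace(/[ \t]+/g, " ").replace(/\n{3,}/g, "\n\n")
    )
    .join("");
}
```

Prose whitespace shrinks, but anything inside triple backticks survives unchanged, which is why `lite` mode carries no semantic risk.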
🌍 Does OmniRoute work in countries where AI is blocked?
Yes! OmniRoute has a 3-level proxy system:
- Global proxy — all requests go through your proxy
- Per-provider proxy — different proxy per provider
- Per-API-key proxy — different proxy per key
Plus the 1proxy free marketplace for community-shared proxies. Users in Russia, China, Iran, and other restricted regions can access all 160+ providers through OmniRoute's proxy infrastructure.
See the Proxy Guide for setup instructions.
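The three levels resolve from most to least specific. Conceptually (a sketch with hypothetical option names; see the Proxy Guide for the real configuration):

```typescript
// Most-specific-wins proxy resolution: a per-API-key proxy overrides a
// per-provider proxy, which overrides the global proxy. Field names here
// are illustrative, not OmniRoute's actual config schema.
interface ProxyConfig {
  global?: string;
  perProvider?: Record<string, string>;
  perKey?: Record<string, string>;
}

function resolveProxy(cfg: ProxyConfig, provider: string, apiKeyId: string): string | undefined {
  return cfg.perKey?.[apiKeyId] ?? cfg.perProvider?.[provider] ?? cfg.global;
}
```

So a single restricted provider can route through a dedicated proxy while everything else uses the global one (or no proxy at all).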
| Problem | Quick Fix |
|---|---|
| "Language model did not provide messages" | Provider quota exhausted → check quota tracker, use combo fallback |
| Rate limiting (429) | Add fallback combo: cc/claude → glm/glm-4.7 → if/kimi-k2-thinking |
| OAuth token expired | Auto-refreshed by OmniRoute. If stuck: delete + re-auth in Providers |
| `unsupported_country_region_territory` | Configure proxy in Settings → Proxy (see Proxy Guide) |
| Docker SQLite locks | Use --stop-timeout 40 for clean WAL checkpoint on shutdown |
| Node.js runtime errors | Use Node.js >=20.20.2 <21, >=22.22.2 <23, or >=24.0.0 <25 (24 LTS recommended) |
| `system-info` for bug reports | Run `npm run system-info` and attach `system-info.txt` to your issue |
📖 Full troubleshooting guide: docs/TROUBLESHOOTING.md
Click to expand tech stack details
- Runtime: Node.js 20.20.2+, 22.22.2+, or 24.x LTS (24 LTS recommended)
- Language: TypeScript 5.9 — 100% TypeScript across `src/` and `open-sse/` (zero `any` in core modules since v2.0)
- Framework: Next.js 16 + React 19 + Tailwind CSS 4
- Database: better-sqlite3 (SQLite) + LowDB (JSON legacy) — domain state, proxy logs, MCP audit, routing decisions, memory, skills
- Schemas: Zod (MCP tool I/O validation, API contracts)
- Protocols: MCP (stdio/HTTP) + A2A v0.3 (JSON-RPC 2.0 + SSE)
- Streaming: Server-Sent Events (SSE) + WebSocket bridge (`/v1/ws`)
- Auth: OAuth 2.0 (PKCE) + JWT + API Keys + MCP Scoped Authorization
- Testing: Node.js test runner + Vitest (4,690+ test cases across 517 files — unit, integration, E2E, security, ecosystem)
- Platforms: Desktop (Electron), Android (Termux), PWA (any browser)
- CI/CD: GitHub Actions (auto npm publish + Docker Hub on release)
- Website: omniroute.online
- Package: npmjs.com/package/omniroute
- Docker: hub.docker.com/r/diegosouzapw/omniroute
- Resilience: Circuit breaker, exponential backoff, anti-thundering herd, TLS spoofing, auto-combo self-healing
| Document | Description |
|---|---|
| User Guide | Providers, combos, CLI integration, deployment |
| Setup Guide | Full install methods, CLI tool configs, protocol setup, timeout tuning |
| CLI Tools Guide | Per-tool setup for Claude Code, Codex, Cursor, Cline, OpenClaw, Kilo, Copilot |
| Quick Start | 3-step install → connect → configure |
| Document | Description |
|---|---|
| Docker Guide | Docker run, Compose profiles, Caddy HTTPS, tunnels, image tags |
| VM Deployment | Complete guide: VM + nginx + Cloudflare setup |
| Fly.io Deployment | Deploy to Fly.io with persistent storage |
| Termux Guide | Run OmniRoute on Android via Termux |
| PWA Guide | Progressive Web App install, caching, architecture |
| Uninstall Guide | Clean removal for all install methods |
| Environment Config | Complete .env variables and references |
| Document | Description |
|---|---|
| Architecture | System architecture, data flow, and internals |
| Compression Guide | 7-option pipeline: off / lite / standard / aggressive / ultra / RTK / stacked |
| RTK Compression | Command-output compression, filters, trust, verify, raw-output recovery |
| Compression Engines | Caveman, RTK, stacked pipelines, dashboard/API/MCP surfaces |
| Compression Rules Format | JSON rule-pack schemas for Caveman and RTK filters |
| Compression Language Packs | Language detection and Caveman rule-pack authoring |
| Resilience Guide | Circuit breakers, cooldowns, queue, anti-thundering herd, TLS spoofing |
| Auto-Combo Engine | 6-factor scoring, mode packs, self-healing |
| Proxy Guide | 3-level proxy system, 1proxy marketplace, registry CRUD |
| Free Tiers | 25+ free API providers consolidated directory |
| Features Gallery | Visual dashboard tour with screenshots |
| Codebase Documentation | Beginner-friendly codebase walkthrough |
| Document | Description |
|---|---|
| API Reference | All endpoints with examples |
| OpenAPI Spec | OpenAPI 3.0 specification |
| MCP Server | 37 MCP tools, IDE configs, Python/TS/Go clients |
| MCP Server Guide | MCP installation, transports, and tool reference |
| A2A Server | JSON-RPC 2.0 protocol, skills, streaming, task mgmt |
| A2A Server Guide | A2A agent card, tasks, skills, and streaming |
| Document | Description |
|---|---|
| Contributing | Development setup and guidelines |
| Security Policy | Vulnerability reporting and security practices |
| i18n Guide | 40+ language support, translation workflow, RTL |
| Release Checklist | Pre-release validation steps |
| Coverage Plan | Test coverage strategy and 4,690+ test suite |
OmniRoute is shaped by a passionate open-source community. These individuals have made exceptional contributions that directly impact the quality, stability, and reach of the project. Thank you.
![]() oyi77 🥇 190 commits • +72K lines Analytics engine, SQL aggregations, proxy marketplace, test coverage |
![]() Chris Staley 🥈 72 commits • +5.7K lines SSE stream hardening, Responses API, Gemini pagination, test regression fixes |
![]() zenobit 🥉 62 commits • +24K lines CI/CD pipeline, i18n for 33 languages, Void Linux package, platform fixes |
![]() R.D. & Randi 🏅 107 commits • +28K lines Endpoints page, tunnel integrations, Docker workflows, A2A status, compression UI |
![]() benzntech 🏅 20 commits • +7.5K lines Electron desktop app, auto-updater, release build workflows, cross-platform CI |
🙏 These contributors' features, bug fixes, and infrastructure improvements are a core part of what makes OmniRoute reliable and feature-rich. Every pull request, every test case, and every i18n translation file matters. Open source is built by people like them.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
# Create a release — npm publish happens automatically
gh release create v2.0.0 --title "v2.0.0" --generate-notes

Special thanks to 9router by decolua — the original project that inspired this fork. OmniRoute builds upon that incredible foundation with additional features, multi-modal APIs, and a full TypeScript rewrite.
Special thanks to CLIProxyAPI by router-for-me — the original Go implementation that inspired this JavaScript port.
Special thanks to Caveman by JuliusBrussee (⭐ 51K+) — the viral "why use many token when few token do trick" project whose caveman-speak compression philosophy inspired OmniRoute's standard compression mode and 30+ filler/condensation regex rules.
Special thanks to RTK - Rust Token Killer by RTK AI — the high-performance command-output compression project whose terminal, build, test, git, and tool-output filtering model inspired OmniRoute's RTK engine, JSON filter DSL, raw-output recovery, and stacked RTK → Caveman compression pipeline.
MIT License - see LICENSE for details.
omniroute.online