A drop-in proxy that sits between Claude Code and AI providers, giving you cost control, provider redundancy, and unified routing — without changing a line of your Claude Code workflow.
Claude Code communicates with the Anthropic Messages API. This gateway speaks the same protocol, so you point Claude Code at the gateway instead of directly at Anthropic. From there, the gateway can:
- Reduce costs — route lower-priority requests to cheaper providers (GPT-4o, Gemini) while keeping critical work on Claude
- Eliminate rate-limit disruptions — automatically fail over to a secondary provider when Anthropic returns 429s
- Cache repeated calls — response and token-count caches absorb duplicate requests at zero API cost
- Unify audit logging — every request flowing through Claude Code lands in one structured log, regardless of provider
- Enforce budgets — set per-client rate limits so a runaway agent doesn't burn your quota
Supported upstream providers: Anthropic, OpenAI, OpenRouter, Google Gemini, AWS Bedrock, Azure OpenAI, Ollama (local).
pip install -e .# Create a starter config
claude-code-model-gateway config init
# Add your providers — pick the ones you want
claude-code-model-gateway provider add anthropic # reads ANTHROPIC_API_KEY
claude-code-model-gateway provider add openai # reads OPENAI_API_KEY
claude-code-model-gateway provider add openrouter # reads OPENROUTER_API_KEY
claude-code-model-gateway provider add gemini # reads GOOGLE_API_KEY
# Set Anthropic as default, OpenRouter as fallback
claude-code-model-gateway provider set-default anthropic# All claude-* models go to Anthropic; everything else to OpenAI
claude-code-model-gateway route add "claude-*" --provider anthropic
claude-code-model-gateway route add "*" --provider openai# Listens on 127.0.0.1:8080 by default
claude-code-model-gateway gateway --response-cache --token-count-cacheSet the ANTHROPIC_BASE_URL environment variable before launching Claude Code:
export ANTHROPIC_BASE_URL=http://127.0.0.1:8080
claude # Claude Code now routes through the gatewayOr add it to your shell profile so it's always active:
echo 'export ANTHROPIC_BASE_URL=http://127.0.0.1:8080' >> ~/.zshrcClaude Code's requests arrive at the gateway in Anthropic API format. The gateway resolves the target provider from your routing rules, translates the request, forwards it, translates the response back, and returns it to Claude Code — all transparently.
claude-code-model-gateway route add "claude-opus-*" --provider anthropic
claude-code-model-gateway route add "claude-sonnet-*" --provider anthropic
claude-code-model-gateway route add "gpt-4o-mini" --provider openai
claude-code-model-gateway route add "*" --provider geminiOpenRouter aggregates 200+ models from Anthropic, OpenAI, Google, Meta, Mistral, and others behind a single OpenAI-compatible endpoint. Point the gateway at OpenRouter and let it handle the upstream routing:
# Add OpenRouter as a provider (set OPENROUTER_API_KEY first)
claude-code-model-gateway provider add openrouter
# Route cheap summarisation tasks to a free-tier model, everything else to Claude
claude-code-model-gateway route add "mistralai/mistral-7b-instruct:free" --provider openrouter
claude-code-model-gateway route add "*" --provider anthropicOr use OpenRouter exclusively and choose models by capability:
# All Claude Code requests go to OpenRouter; model selection happens there
claude-code-model-gateway provider add openrouter
claude-code-model-gateway provider set-default openrouter
claude-code-model-gateway gateway --response-cacheThen point Claude Code at the gateway:
export ANTHROPIC_BASE_URL=http://127.0.0.1:8080
export OPENROUTER_API_KEY=sk-or-...
claude # Claude Code → gateway → OpenRouter → the model you choseOpenRouter uses an OpenAI-compatible API, so the gateway's OpenAI translator handles all request/response conversion automatically.
The gateway retries on 429/502/503/504 before surfacing an error:
claude-code-model-gateway gateway --max-retries 3 --retry-delay 1.0claude-code-model-gateway gateway \
--response-cache --cache-ttl 600 --cache-maxsize 512 \
--token-count-cacheclaude-code-model-gateway gateway \
--log-format json --log-file /var/log/claude-gateway.log --verbose# Install as a systemd / launchd / OpenRC service
sudo bash scripts/install.sh
# Or run in Docker
docker-compose up -d| Variable | Required for | Description |
|---|---|---|
ANTHROPIC_BASE_URL |
Claude Code | Point Claude Code at the gateway (http://127.0.0.1:8080) |
ANTHROPIC_API_KEY |
Anthropic provider | Anthropic API key |
OPENAI_API_KEY |
OpenAI / Azure provider | OpenAI API key |
OPENROUTER_API_KEY |
OpenRouter provider | OpenRouter API key (get one at openrouter.ai) |
GOOGLE_API_KEY |
Gemini provider | Google AI API key |
AWS_ACCESS_KEY_ID |
Bedrock provider | AWS access key |
AWS_SECRET_ACCESS_KEY |
Bedrock provider | AWS secret key |
AWS_DEFAULT_REGION |
Bedrock provider | AWS region (e.g. us-east-1) |
GATEWAY_DEFAULT_PROVIDER |
Gateway runtime | Override default provider |
GATEWAY_LOG_LEVEL |
Gateway runtime | Override log level |
GATEWAY_TIMEOUT |
Gateway runtime | Override request timeout (seconds) |
Run claude-code-model-gateway config env-check to verify all required variables are set for your configured providers.
default_provider: anthropic
log_level: info
timeout: 120
max_retries: 3
providers:
anthropic:
name: anthropic
api_base: https://api.anthropic.com/v1
api_key_env_var: ANTHROPIC_API_KEY
enabled: true
openai:
name: openai
api_base: https://api.openai.com/v1
api_key_env_var: OPENAI_API_KEY
enabled: true
openrouter:
name: openrouter
api_base: https://openrouter.ai/api/v1
api_key_env_var: OPENROUTER_API_KEY
enabled: true
google:
name: google
api_base: https://generativelanguage.googleapis.com/v1beta
api_key_env_var: GOOGLE_API_KEY
enabled: truePass the config file to any command with --config gateway.yaml.
Starts an HTTP server that accepts Anthropic Messages API requests and routes them to the configured upstream providers.
# Start on default host/port (127.0.0.1:8080)
claude-code-model-gateway gateway
# Custom host and port
claude-code-model-gateway gateway --host 0.0.0.0 --port 9000
# With API key and pass-through mode (forwards all /v1/* paths)
claude-code-model-gateway gateway --api-key sk-ant-... --passthrough-mode
# Enable response caching (5-minute TTL, 256 entries)
claude-code-model-gateway gateway --response-cache --cache-ttl 300 --cache-maxsize 256
# Enable token count caching (1-hour TTL)
claude-code-model-gateway gateway --token-count-cache
# With connection pooling and retries
claude-code-model-gateway gateway --pool-size 20 --max-retries 3 --retry-delay 0.5
# With verbose JSON logging to a file
claude-code-model-gateway gateway --verbose --log-format json --log-file gateway.logKey options:
| Option | Default | Description |
|---|---|---|
--host / -H |
127.0.0.1 |
Bind address |
--port / -p |
8080 |
Listen port |
--timeout / -t |
300 |
Upstream timeout (seconds) |
--api-key |
$ANTHROPIC_API_KEY |
Anthropic API key |
--anthropic-version |
2023-06-01 |
API version header |
--passthrough-mode |
off | Forward all /v1/* paths |
--max-request-size |
10485760 |
Max request body (bytes) |
--response-cache |
off | Cache GET/HEAD responses |
--cache-ttl |
300 |
Response cache TTL (seconds) |
--token-count-cache |
off | Cache token count responses |
--pool-size |
10 |
Connection pool size |
--max-retries |
2 |
Retry count (non-streaming) |
--log-format |
standard |
standard, detailed, json, colored, minimal |
Internal endpoints (answered by the gateway itself, not forwarded):
GET /health— returns{"status": "ok"}GET /status— returns gateway statistics and provider counts
Starts a simple HTTP forward proxy that relays requests to upstream servers with optional retry logic.
# Start on default host/port (127.0.0.1:3000)
claude-code-model-gateway proxy
# Custom host and port with retries
claude-code-model-gateway proxy --host 0.0.0.0 --port 8888 --max-retries 3 --retry-delay 0.5claude-code-model-gateway config init # Create a default config file
claude-code-model-gateway config show # Display current config
claude-code-model-gateway config validate # Validate config file
claude-code-model-gateway config set <key> <value> # Set a top-level config value
claude-code-model-gateway config schema # Show full config schema
claude-code-model-gateway config schema --field providers # Show schema for a specific field
claude-code-model-gateway config lint # Lint config for issues
claude-code-model-gateway config doctor # Run comprehensive health check
claude-code-model-gateway config export --format json # Export config to JSON
claude-code-model-gateway config env-check # Check required environment variablesclaude-code-model-gateway provider list # List configured providers
claude-code-model-gateway provider list --builtins # Show built-in provider templates
claude-code-model-gateway provider show anthropic # Show provider details
claude-code-model-gateway provider add openai # Add a provider from built-in template
claude-code-model-gateway provider remove openai # Remove a provider
claude-code-model-gateway provider set-default anthropic # Set default provider
claude-code-model-gateway provider enable anthropic # Enable a provider
claude-code-model-gateway provider disable openai # Disable a provider
claude-code-model-gateway provider update anthropic --priority 1 # Update provider settingsSupported built-in providers: anthropic, openai, azure, openrouter, google (alias: gemini), bedrock, local (Ollama).
claude-code-model-gateway route list # List all routing rules
claude-code-model-gateway route list --format json
claude-code-model-gateway route resolve --model claude-sonnet-4-20250514 # Resolve a model to a provider
claude-code-model-gateway route add "claude-*" --provider anthropic # Add a routing rule
claude-code-model-gateway route remove "claude-*" # Remove a rule
claude-code-model-gateway route test --model gpt-4o # Test routing
claude-code-model-gateway route stats # Show routing statistics
claude-code-model-gateway route serve # Start route-aware gatewayclaude-code-model-gateway cache stats # Show cache hit/miss statistics
claude-code-model-gateway cache stats --format json # Output as JSON
claude-code-model-gateway cache clear # Clear all caches
claude-code-model-gateway cache purge # Purge expired cache entries
claude-code-model-gateway cache warmup # Pre-warm the cache
claude-code-model-gateway cache info # Show cache configuration
claude-code-model-gateway cache response-stats # Response cache statistics
claude-code-model-gateway cache token-count-stats # Token count cache statisticsclaude-code-model-gateway health status # Overall system health
claude-code-model-gateway health status --format json
claude-code-model-gateway health errors # Recent error log
claude-code-model-gateway health categories # Errors grouped by category
claude-code-model-gateway health circuit-breakers # Circuit breaker states
claude-code-model-gateway health reset # Reset error counters
claude-code-model-gateway health retry-policies # Show retry policy configurationclaude-code-model-gateway logging status # Show logging configuration
claude-code-model-gateway logging status --format json
claude-code-model-gateway logging test # Emit a test log message
claude-code-model-gateway logging formats # List available log formats
claude-code-model-gateway logging levels # List available log levels
claude-code-model-gateway logging metrics # Show logging metrics
claude-code-model-gateway logging health # Check logging system health
claude-code-model-gateway logging test-redaction # Test log redaction of sensitive data
claude-code-model-gateway logging set-level debug # Change log level at runtime
claude-code-model-gateway logging configure --format json --level info # Reconfigure logging
claude-code-model-gateway logging files # List active log files# Validate a config file directly
claude-code-model-gateway validate-config --config gateway.yaml
# Test a config (dry run, validate + connectivity checks)
claude-code-model-gateway test-config --config gateway.yaml
# Diff two config files
claude-code-model-gateway config-diff config-a.yaml config-b.yaml# Install in development mode (recommended for local use)
pip install -e .
# Install with development/test dependencies
pip install -e ".[dev]"
# Or install runtime dependencies only
pip install -r requirements.txtAlready covered in Quick Start? Steps 1–4 of Quick Start with Claude Code walk through a complete first-time install. This section is a condensed reference.
# Build and start with Docker Compose
docker-compose up
# Or build the image directly
docker build -t claude-code-model-gateway .
docker run -e ANTHROPIC_API_KEY=sk-ant-... -p 8080:8080 claude-code-model-gateway gateway --host 0.0.0.0Pre-built service files are included for the four most common init systems:
| Init system | Platform | File location |
|---|---|---|
| systemd | Linux (Debian, Ubuntu, Fedora, Arch…) | service/systemd/ |
| init.d | Linux (SysV / older distros) | service/initd/ |
| OpenRC | Linux (Alpine, Gentoo…) | service/openrc/ |
| launchd | macOS | service/launchd/ |
# Detect the init system and install automatically
sudo bash scripts/install.sh
# Uninstall
sudo bash scripts/uninstall.shVerify the environment variable is exported in the same shell where you launch Claude Code:
echo $ANTHROPIC_BASE_URL # should print http://127.0.0.1:8080
curl http://127.0.0.1:8080/health # should return {"status":"ok"}If ANTHROPIC_BASE_URL is empty, re-export it or add it to your shell profile (see Quick Start step 5).
The gateway process is not running. Start it with:
claude-code-model-gateway gatewayOr check that the port is not already in use:
lsof -i :8080The API key for the target provider is missing or incorrect. Verify it is set:
claude-code-model-gateway config env-checkCaching must be explicitly enabled when starting the gateway:
claude-code-model-gateway gateway --response-cache --token-count-cacheCheck cache hit rates with claude-code-model-gateway cache stats.
Increase the retry count and ensure a fallback provider is configured:
claude-code-model-gateway gateway --max-retries 5 --retry-delay 2.0
claude-code-model-gateway route add "*" --provider openai # fallbackBy default the gateway looks for gateway.yaml in the current directory. Pass an explicit path with --config:
claude-code-model-gateway gateway --config /path/to/gateway.yamlOr generate a starter config in the current directory:
claude-code-model-gateway config init# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run tests with coverage
pytest --cov=src
# Format code
black src tests
# Lint code
ruff check src tests
# Auto-fix lint issues
ruff check --fix src testsclaude-code-model-gateway/
├── src/
│ ├── __init__.py # Package init with version
│ ├── main.py # Entry point
│ ├── cli.py # All CLI commands
│ ├── gateway.py # Multi-provider routing gateway
│ ├── proxy.py # HTTP forward proxy
│ ├── anthropic_passthrough.py # Anthropic API pass-through
│ ├── router.py # Request routing engine
│ ├── interceptor.py # Request/response interception
│ ├── providers.py # Built-in provider registry
│ ├── models.py # Data models (ProviderConfig, etc.)
│ ├── cache.py # In-memory caching layer
│ ├── response_cache.py # Response caching
│ ├── token_count_cache.py # Token count caching
│ ├── retry.py # Retry logic with backoff
│ ├── retry_budget.py # Per-client retry budgets
│ ├── error_handling.py # Error handling & health tracking
│ ├── error_recovery_strategies.py # Recovery strategies
│ ├── errors.py # Custom exceptions
│ ├── logging_config.py # Structured logging configuration
│ ├── service.py # Daemon/service entry point
│ ├── config/ # Configuration management
│ │ ├── loader.py # YAML config loader
│ │ ├── manager.py # Config state manager
│ │ ├── validator.py # Config validator
│ │ ├── schema.py # Config schema definitions
│ │ └── testing.py # Test helpers
│ ├── translators/ # Per-provider API translators
│ │ ├── base.py # BaseTranslator ABC
│ │ ├── registry.py # Translator registry / singleton
│ │ ├── types.py # Canonical request/response types
│ │ ├── anthropic.py # Anthropic Messages API translator
│ │ ├── openai.py # OpenAI / Azure / OpenRouter translator
│ │ ├── gemini.py # Google Gemini translator
│ │ └── bedrock.py # AWS Bedrock translator
│ └── validation/ # Request validation utilities
│ ├── validator.py
│ └── testing.py
├── tests/ # pytest test suite
├── service/ # Service files (systemd, initd, openrc, launchd)
├── scripts/ # Install/update/healthcheck scripts
├── examples/ # Usage examples
├── Dockerfile
├── docker-compose.yaml
├── pyproject.toml
├── requirements.txt
└── README.md