FastAPI backend that proxies streaming chat completions from OpenRouter with integrated Model Context Protocol (MCP) tools. Responses stream over Server-Sent Events (SSE) for real-time rendering.
- Streaming chat via OpenRouter API with HTTP keep-alive for fast response times
- Configurable LLM planning — Optional AI-driven tool selection for optimized context
- Persistent chat history with GCS-backed attachment storage and automatic cleanup
- MCP tool aggregation — Google Calendar, Gmail, Drive, PDF extraction, Monarch Money, and custom utilities
- Speech-to-text — Mint short-lived Deepgram tokens for browser-based voice input
- Presets — Save and restore complete chat configurations (model, tools, prompts)
- OAuth integrations — Google, Monarch Money, and Spotify authentication flows
- Python 3.13+
- uv for dependency and environment management
- Google Cloud credentials for attachment storage (service account JSON)
- Environment variables in `.env` (see Setup below)
- Install uv if you haven't already:

      curl -LsSf https://astral.sh/uv/install.sh | sh

- Clone and set up:

      uv sync  # Creates .venv/ and installs all dependencies

- Configure environment: create `.env` in the project root:

      # Required
      OPENROUTER_API_KEY=sk-or-v1-...

      # Google Cloud Storage (for attachments)
      GCS_BUCKET_NAME=your-bucket-name
      GCP_PROJECT_ID=your-project-id
      GOOGLE_APPLICATION_CREDENTIALS=credentials/googlecloud/sa.json

      # Optional defaults
      OPENROUTER_DEFAULT_MODEL=openai/gpt-4
      OPENROUTER_SYSTEM_PROMPT="You are a helpful assistant."
      ATTACHMENTS_MAX_SIZE_BYTES=10485760  # 10MB
      ATTACHMENTS_RETENTION_DAYS=7

      # Frontend URL (for OAuth redirects)
      FRONTEND_URL=http://localhost:5173
      GOOGLE_OAUTH_REDIRECT_URI=http://localhost:8000/api/google-auth/callback

      # Optional: Deepgram for speech-to-text
      DEEPGRAM_API_KEY=...
      DEEPGRAM_TOKEN_TTL=30

- Add Google credentials:
  - Place the service account JSON at `credentials/googlecloud/sa.json`
  - For OAuth flows, add client credentials to `credentials/`
- Run the server:

      uv run uvicorn backend.app:create_app --factory --reload --app-dir src

  Alternative commands:

      uv run backend  # CLI wrapper with defaults (host 0.0.0.0, port 8000)
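
Before starting the server it can help to confirm that the variables from the `.env` step are actually present. The following is a minimal sketch, not the project's own configuration code (which uses Pydantic settings in `config.py`); the function name `missing_settings` and the split into "required" and "attachments" groups are assumptions made for illustration, based only on the comments in the sample `.env`.

```python
import os
from typing import Mapping

# Grouping is an assumption derived from the comments in the sample .env.
REQUIRED_VARS = ("OPENROUTER_API_KEY",)
ATTACHMENT_VARS = (
    "GCS_BUCKET_NAME",
    "GCP_PROJECT_ID",
    "GOOGLE_APPLICATION_CREDENTIALS",
)


def missing_settings(env: Mapping[str, str]) -> dict[str, list[str]]:
    """Return the names of unset or blank variables, grouped by purpose.

    Hypothetical helper: it only inspects the mapping it is given and does
    not load .env files itself.
    """

    def absent(names: tuple[str, ...]) -> list[str]:
        return [name for name in names if not env.get(name, "").strip()]

    return {
        "required": absent(REQUIRED_VARS),
        "attachments": absent(ATTACHMENT_VARS),
    }


if __name__ == "__main__":
    # os.environ only contains .env values if something has loaded them.
    for group, names in missing_settings(os.environ).items():
        if names:
            print(f"{group}: missing {', '.join(names)}")
```

Passing a plain mapping rather than reading `os.environ` inside the function keeps the check easy to exercise in tests.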

| Method & Path | Description |
|---|---|
| `GET /health` | Health check with active model info |
| `POST /api/chat/stream` | Stream chat completions via SSE |
| `GET /api/chat/test-stream` | Test SSE stream for debugging |
| `GET /api/chat/generation/{id}` | Get generation usage/cost details |
| `DELETE /api/chat/session/{id}` | Clear conversation history |
| `DELETE /api/chat/session/{id}/messages/{msg_id}` | Delete a single message |
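
A client of the streaming endpoint listed above has to decode SSE framing before it can use the payloads. The sketch below shows one way to do that with the standard library only. The framing rules (`data:` fields, blank-line event terminators, `:` comment lines) follow the SSE format itself; the payload shape is an assumption: it presumes OpenAI-style chunks with `choices[].delta.content` and a `[DONE]` sentinel, which this backend may or may not pass through unchanged. Both function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # Comment line, often used as a keep-alive.
        field, _, value = line.partition(":")
        if field == "data":
            # One leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An unterminated trailing event is dropped, as the SSE format specifies.


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate streamed text deltas (assumed OpenAI-style chunk shape)."""
    parts: list[str] = []
    for data in iter_sse_data(lines):
        if data == "[DONE]":  # Assumed end-of-stream sentinel.
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

Any iterable of decoded text lines works as input, for example the line iterator of a streaming HTTP response.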

| Method & Path | Description |
|---|---|
| `GET /api/models` | List available OpenRouter models |
| `GET /api/models/metadata` | Get model filtering metadata/facets |
| `GET /api/settings/model` | Get current model settings |
| `PUT /api/settings/model` | Update model configuration |
| `GET /api/settings/model/active-provider` | Get active provider info |
| `GET /api/settings/system-prompt` | Get system prompt |
| `PUT /api/settings/system-prompt` | Update system prompt |

| Method & Path | Description |
|---|---|
| `POST /api/uploads` | Upload attachment, returns signed GCS URL |

| Method & Path | Description |
|---|---|
| `GET /api/mcp/servers` | List MCP server configurations |
| `PUT /api/mcp/servers` | Replace all MCP server configs |
| `PATCH /api/mcp/servers/{id}` | Update a single MCP server |
| `POST /api/mcp/servers/refresh` | Hot-reload MCP tools |

| Method & Path | Description |
|---|---|
| `GET /api/presets/` | List saved presets |
| `GET /api/presets/default` | Get the default preset |
| `GET /api/presets/{name}` | Get a specific preset |
| `POST /api/presets/` | Create new preset from current state |
| `PUT /api/presets/{name}` | Save snapshot to existing preset |
| `DELETE /api/presets/{name}` | Delete a preset |
| `POST /api/presets/{name}/set-default` | Mark preset as default |
| `POST /api/presets/{name}/apply` | Apply saved preset |

| Method & Path | Description |
|---|---|
| `GET /api/suggestions` | Get quick prompt suggestions |
| `POST /api/suggestions` | Add a new suggestion |
| `PUT /api/suggestions` | Replace all suggestions |
| `DELETE /api/suggestions/{index}` | Delete a suggestion by index |

| Method & Path | Description |
|---|---|
| `POST /api/stt/deepgram/token` | Mint browser STT token |

| Method & Path | Description |
|---|---|
| `GET /api/google-auth/status` | Check Google OAuth status |
| `POST /api/google-auth/authorize` | Start Google OAuth flow |
| `GET /api/google-auth/callback` | Google OAuth callback |
| `GET /api/monarch-auth/status` | Check Monarch Money auth status |
| `POST /api/monarch-auth/login` | Login to Monarch Money |
| `GET /api/spotify-auth/status` | Check Spotify auth status |
| `POST /api/spotify-auth/authorize` | Start Spotify OAuth flow |
| `GET /api/spotify-auth/callback` | Spotify OAuth callback |

    curl -N \
      -H "Content-Type: application/json" \
      -X POST \
      -d '{
        "model": "openrouter/auto",
        "messages": [{"role": "user", "content": "Hello!"}]
      }' \
      http://localhost:8000/api/chat/stream

    src/backend/
      __init__.py            # Package exports create_app
      app.py                 # FastAPI factory and lifespan
      main.py                # CLI entrypoint (uvicorn wrapper)
      config.py              # Settings via Pydantic
      repository.py          # SQLite data layer
      openrouter.py          # OpenRouter API client
      logging_handlers.py    # Custom log handlers
      logging_settings.py    # Log configuration parser
      routers/               # API endpoint modules
      schemas/               # Pydantic request/response models
      services/              # Business logic layer
      tasks/                 # Background task definitions
      utils/                 # Shared utilities
      chat/
        orchestrator.py      # Main chat coordination
        mcp_client.py        # MCP client wrapper
        mcp_registry.py      # MCP tool aggregator
        tool_utils.py        # Tool handling utilities
        streaming/           # Streaming pipeline modules
      mcp_servers/           # Bundled MCP integrations
    data/                    # Runtime state (gitignored)
      chat_sessions.db       # SQLite database
      model_settings.json    # Active model config
      presets.json           # Saved presets
      suggestions.json       # Quick prompt suggestions
      mcp_servers.json       # MCP configurations
      tokens/                # OAuth tokens
    tests/                   # Pytest suite
    frontend/                # Svelte + TypeScript UI
The Svelte UI is in `frontend/` and proxies API requests during development:

    cd frontend
    npm install
    npm run dev  # Defaults to http://localhost:5173

Configure the backend URL in `frontend/.env`:

    VITE_API_BASE_URL=http://localhost:8000

All uploads (images, PDFs) are stored in private Google Cloud Storage:
- Metadata stored in SQLite
- Signed URLs with configurable expiration (default 7 days)
- Automatic URL refresh when serving chat history
- Background cleanup job removes expired attachments
Supported types: image/png, image/jpeg, image/webp, image/gif, application/pdf
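
The upload rules above (an allow-list of MIME types, a size cap, and a retention window) can be sketched as two small checks. This is an illustrative sketch, not the backend's implementation: the function names are hypothetical, and the defaults simply mirror `ATTACHMENTS_MAX_SIZE_BYTES=10485760` and `ATTACHMENTS_RETENTION_DAYS=7` from the sample `.env`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

ALLOWED_TYPES = frozenset(
    {"image/png", "image/jpeg", "image/webp", "image/gif", "application/pdf"}
)
DEFAULT_MAX_BYTES = 10_485_760  # Mirrors ATTACHMENTS_MAX_SIZE_BYTES in the sample .env
DEFAULT_RETENTION_DAYS = 7      # Mirrors ATTACHMENTS_RETENTION_DAYS in the sample .env


def validate_attachment(
    content_type: str, size_bytes: int, max_bytes: int = DEFAULT_MAX_BYTES
) -> None:
    """Raise ValueError if the upload has a disallowed type or is too large."""
    # Drop parameters such as "; charset=..." and normalize case.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported content type: {media_type!r}")
    if size_bytes > max_bytes:
        raise ValueError(f"attachment is {size_bytes} bytes, limit is {max_bytes}")


def is_expired(
    uploaded_at: datetime,
    now: Optional[datetime] = None,
    retention_days: int = DEFAULT_RETENTION_DAYS,
) -> bool:
    """Return True once the retention window has fully elapsed.

    Expects timezone-aware datetimes; `now` defaults to the current UTC time.
    """
    current = now if now is not None else datetime.now(timezone.utc)
    return current - uploaded_at >= timedelta(days=retention_days)
```

A cleanup job along these lines would select rows where `is_expired` is true and delete both the stored object and its metadata.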
MCP servers are configured in data/mcp_servers.json and hot-reloaded via API. Built-in integrations:
- Google Calendar — create/search events
- Gmail — read/send messages, manage drafts
- Google Drive — search/read/create files
- PDF tools — extract text and metadata
- Monarch Money — personal finance data and transactions
- Calculator & utilities — housekeeping helpers
- The canonical list of bundled servers lives in `src/backend/mcp_servers/__init__.py` (`BUILTIN_MCP_SERVER_DEFINITIONS`). The FastAPI factory consumes that list to generate default entries with the same enable/disable defaults, so updating the list keeps both the module exports and the fallback config in sync.
Each server's tools are prefixed (e.g., custom-gmail__gmail_create_draft) to avoid naming conflicts.
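
The prefixing scheme can be illustrated with a pair of helpers. This sketch infers the format from the single example above: a server ID, a double-underscore separator, then the tool name. The function names are hypothetical, and the registry's real code may differ.

```python
SEPARATOR = "__"  # Inferred from the example "custom-gmail__gmail_create_draft"


def qualify_tool_name(server_id: str, tool_name: str) -> str:
    """Build the prefixed name that is exposed to the model."""
    if SEPARATOR in server_id:
        # Otherwise the name could not be split back unambiguously.
        raise ValueError(f"server id must not contain {SEPARATOR!r}: {server_id!r}")
    return f"{server_id}{SEPARATOR}{tool_name}"


def split_tool_name(qualified: str) -> tuple[str, str]:
    """Split a prefixed name back into (server_id, tool_name)."""
    # partition() splits at the first separator, so tool names that
    # themselves contain "__" stay intact.
    server_id, sep, tool_name = qualified.partition(SEPARATOR)
    if not sep or not server_id or not tool_name:
        raise ValueError(f"not a prefixed tool name: {qualified!r}")
    return server_id, tool_name
```

Splitting at the first separator is what lets two servers expose tools with the same local name without colliding.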
Save complete configurations (model, tools, prompt, parameters) and restore them later. Presets capture:
- Active model ID
- Provider/parameter overrides
- System prompt
- Enabled MCP servers
- Model filter settings
Use the UI or the /api/presets/ endpoints. You can set a default preset to auto-load on startup.
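
As a rough picture of what a preset holds, the captured fields listed above could be modeled as a small record that round-trips through JSON. Every field name here is an assumption chosen to match that list; the actual schema in `data/presets.json` is not shown in this document and may be organized differently.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PresetSnapshot:
    """Hypothetical shape of a saved preset; field names are illustrative."""

    name: str
    model_id: str
    system_prompt: str = ""
    parameter_overrides: dict = field(default_factory=dict)
    enabled_mcp_servers: list = field(default_factory=list)
    model_filters: dict = field(default_factory=dict)
    is_default: bool = False

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "PresetSnapshot":
        return cls(**json.loads(raw))


if __name__ == "__main__":
    preset = PresetSnapshot(
        name="research",
        model_id="openrouter/auto",
        system_prompt="You are a helpful assistant.",
        enabled_mcp_servers=["custom-gmail"],
    )
    print(preset.to_json())
```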
    uv run pytest                            # Run all tests
    uv run pytest tests/test_attachments.py  # Specific test file
    uv run pytest -v                         # Verbose output

Tests use isolated SQLite databases in `tests/data/` and clean up automatically.
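
The idea of an isolated, self-cleaning test database can be sketched with the standard library alone. This is not the suite's actual fixture: it uses a temporary directory rather than `tests/data/`, and the `isolated_db` name and the schema in the usage example are hypothetical.

```python
import sqlite3
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def isolated_db(schema: str) -> Iterator[sqlite3.Connection]:
    """Yield a connection to a fresh on-disk SQLite database.

    The file lives in a temporary directory that is deleted on exit,
    so no state leaks between tests.
    """
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(Path(tmp) / "test.db")
        try:
            conn.executescript(schema)
            yield conn
        finally:
            # Close before the directory is removed.
            conn.close()


if __name__ == "__main__":
    # Hypothetical schema, for illustration only.
    with isolated_db("CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT);") as db:
        db.execute("INSERT INTO messages (body) VALUES (?)", ("hello",))
        print(db.execute("SELECT COUNT(*) FROM messages").fetchone()[0])
```

In a pytest suite the same pattern is usually wrapped in a fixture so each test receives its own connection.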
- REFERENCE.md — Operations guide, system details, troubleshooting
- GCS_STORAGE.md — GCS attachment storage implementation
- Code style: PEP 8, type hints required, `ruff` for linting
- Async first: Use `async`/`await` for all I/O operations
- Error handling: Fail fast with clear errors, catch broad exceptions only at boundaries
- Tests: `pytest` + `pytest-asyncio`, one test file per module
- Dependencies: Manage via `uv`, sync with `uv sync`
- Secrets: Never commit credentials, use `.env` only
See .github/copilot-instructions.md for AI agent guidelines.
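
The async-first and error-handling guidelines can be illustrated together. In the sketch below, inner coroutines let exceptions propagate (fail fast), and a single outer wrapper is the only place that catches `Exception`. All names are hypothetical and `asyncio.sleep` stands in for real I/O; this shows the pattern, not code from this repository.

```python
import asyncio
from typing import Any, Awaitable


class UpstreamError(RuntimeError):
    """Raised when a (simulated) upstream call fails."""


async def check_server(name: str, fail: bool = False) -> str:
    # Stand-in for real async I/O such as an HTTP request.
    await asyncio.sleep(0.01)
    if fail:
        raise UpstreamError(f"{name} is unreachable")
    return f"{name}: ok"


async def check_all(names: list[str]) -> list[str]:
    # Run the checks concurrently; the first failure propagates to the caller.
    return await asyncio.gather(*(check_server(name) for name in names))


async def at_boundary(work: Awaitable[Any]) -> dict[str, Any]:
    """Outermost layer: the only place a broad except is used."""
    try:
        return {"ok": True, "result": await work}
    except Exception as exc:
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


if __name__ == "__main__":
    print(asyncio.run(at_boundary(check_all(["gmail", "drive"]))))
    print(asyncio.run(at_boundary(check_server("calendar", fail=True))))
```

Keeping the broad handler at the edge means inner code never has to guess how an error should be reported.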
Import errors after adding dependencies:

    uv sync  # Regenerate .venv

Attachments not uploading:
- Verify the GCS bucket exists and the service account has the `storage.objects.create` permission
- Check that `GOOGLE_APPLICATION_CREDENTIALS` points to valid JSON

MCP tools not appearing:
- Check that `data/mcp_servers.json` has enabled servers
- Verify required env vars (e.g., Google OAuth credentials) are set
- Use `POST /api/mcp/servers/refresh` to hot-reload

Tests failing:
- Run `uv sync` to ensure dependencies are current
- Check `tests/data/` for stale SQLite files (usually auto-cleaned)
See LICENSE file for details.