Make your AI agents and scripts speak. One command, six TTS providers, zero lock-in.
Hear narrate speak in three commands — no API keys, no signup:
```sh
brew install felores/narrate/narrate
brew services start narrate
narrate "Hello, narrate"
```

That's it. Uses your built-in macOS voice. Want studio-quality voices? Add an API key — it's optional.
Linux?
```sh
curl -fsSL https://raw.githubusercontent.com/felores/narrate/main/install.sh | bash
sudo apt install espeak-ng
narrate-server &
narrate "hello"
```
## Table of contents
- Why narrate
- Add an API key — for premium voices
- Use it from your AI tool — Claude Code, Cursor, OpenCode, etc.
- Providers
- Install — other methods
- Where things live
- Configure
- Quickstart by interface
- Provider setup detail
- Voicebox deep dive — local voice cloning
- voices.json — voice presets
- CLI reference
- HTTP API reference
- MCP tools reference
- Configuration precedence
- Run as a service
- Logging and observability
- Architecture
- Project layout
- narrate vs voicebox
- Roadmap
- Troubleshooting
- Contributing
- License
## Add an API key — for premium voices

Optional. The default macOS voice works fine for notifications, but premium providers sound dramatically better. Pick one (or several):
| Provider | Where to get the key | Cost |
|---|---|---|
| ElevenLabs | elevenlabs.io | free tier, premium voices |
| OpenAI | platform.openai.com/api-keys | pay-per-use, very cheap |
| Google Gemini | aistudio.google.com/apikey | free tier |
| xAI | console.x.ai | pay-per-use |
Then add the key(s) to ~/.env and switch the default provider:
```sh
echo 'OPENAI_API_KEY=sk-...' >> ~/.env    # any subset works
echo 'ELEVENLABS_API_KEY=...' >> ~/.env
mkdir -p ~/.config/narrate
echo '{"default_provider":"openai","default_voice":"nova"}' > ~/.config/narrate/config.json
brew services restart narrate
narrate "Now I sound much better"
```

`narrate verify` shows you which providers are configured. See Provider setup detail for per-provider voice IDs.
**Why `~/.env`, not `~/.zshrc`?** Background services (brew services, LaunchAgent, systemd) don't run shell init. `~/.env` is the only path that works for both the CLI and the server-as-service.

## Why narrate

Every coding harness reinvents voice. ElevenLabs has a UI, OpenAI has an API, Cartesia has another API, Voicebox has its own MCP server — and each agent (Claude Code, OpenCode, Pi, Cursor, Cline) has its own way of plugging in. The result: shell scripts that hardcode one provider, hooks that break when you change agents, and no shared concept of "voice".
narrate collapses the matrix:
- One server, one set of API keys, one set of voice presets.
- Three interfaces: HTTP for anything, CLI for shells, MCP for agents that speak the protocol.
- Six providers behind a uniform `Provider` interface — including a proxy to Voicebox for fully local voice cloning.
- Voice presets that abstract over providers (`narrate --voice researcher` works whether `researcher` is OpenAI Nova or Voicebox Morgan).
- Drop into any harness: hook scripts, plugins, MCP — pick whichever your tool supports.
## Providers

| Provider | Type | Auth | Notes |
|---|---|---|---|
| ElevenLabs | Cloud | `ELEVENLABS_API_KEY` | High quality, premium voices |
| OpenAI TTS | Cloud | `OPENAI_API_KEY` | alloy, echo, fable, onyx, nova, shimmer |
| Google Gemini TTS | Cloud | `GEMINI_API_KEY` | Multilingual, requires ffmpeg for PCM→WAV |
| xAI Grok TTS | Cloud | `XAI_API_KEY` | eve, ara, rex, sal, leo |
| Voicebox | Local proxy | none | Auto-detects on :17493 — voice cloning, 7 local engines, 23 languages |
| System (say / espeak) | Local | none | Zero-dep fallback, works offline |
Add any subset. narrate uses what you've configured and reports the rest as ⚪ not configured in narrate verify.
## Install — other methods

### Homebrew

```sh
brew install felores/narrate/narrate
brew services start narrate   # auto-start at login
```

That's everything. Bun is pulled in as a dependency. After this you can run `narrate "hello"` and you'll hear it.
### curl install

Requires bun first (`curl -fsSL https://bun.sh/install | bash`).

```sh
curl -fsSL https://raw.githubusercontent.com/felores/narrate/main/install.sh -o /tmp/narrate-install.sh
bash /tmp/narrate-install.sh
"$HOME/.local/share/narrate/service/launchd/install.sh"   # macOS
"$HOME/.local/share/narrate/service/systemd/install.sh"   # Linux
```

Clones to `~/.local/share/narrate`, writes wrappers to `~/.local/bin/{narrate,narrate-server}`, then installs the auto-start service. Override paths via `NARRATE_DIR`, `BIN_DIR`, `NARRATE_REF`.
### git clone (dev)

```sh
git clone https://github.com/felores/narrate.git ~/Documents/GitHub/narrate
cd ~/Documents/GitHub/narrate
bun install
bun run src/server.ts &
bun run src/cli.ts verify
```

## Where things live

Once installed, the repo + scripts are at one of these paths depending on the method you used:
| Install method | `$NARRATE_DIR` | Logs |
|---|---|---|
| Homebrew | `$(brew --prefix narrate)/libexec` | `$NARRATE_DIR/logs/narrate.log` |
| curl install | `~/.local/share/narrate` | `$NARRATE_DIR/logs/narrate.log` |
| git clone (dev) | wherever you cloned (e.g. `~/Documents/GitHub/narrate`) | `$NARRATE_DIR/logs/narrate.log` |
Set it once in your shell init so the recipes below work copy-paste:

```sh
# pick the line that matches how you installed
export NARRATE_DIR="$(brew --prefix narrate)/libexec"      # brew
export NARRATE_DIR="$HOME/.local/share/narrate"            # curl
export NARRATE_DIR="$HOME/Documents/GitHub/narrate"        # git clone
```

The running server reports its own location at `GET /health` (`repo_dir`, `logs_dir`) — useful for plugins and tooling that need to self-locate.
## Configure

You can skip this entirely if the Add an API key section above covered your needs. This section is for named voice presets and per-provider tweaks.

### Voice presets

Map a friendly name to a (provider, voice_id) pair so you can swap providers without touching agent code:

```sh
mkdir -p ~/.config/narrate
cp "$NARRATE_DIR/voices.json.example" ~/.config/narrate/voices.json
narrate --voice researcher "Findings ready"   # uses the preset from voices.json
```

Edit `~/.config/narrate/voices.json` to add your own presets. Full schema in voices.json — voice presets.
### config.json

```sh
cat > ~/.config/narrate/config.json <<EOF
{
  "default_provider": "openai",
  "default_voice": "researcher",
  "port": 8888
}
EOF
brew services restart narrate
```

See Configuration precedence for the full resolution chain.
## Quickstart by interface

narrate exposes three interfaces. Pick whichever your tool supports.
### CLI

Best for shells, hooks, scripts, cron, terminal one-offs.

```sh
narrate "Build complete"
narrate --voice engineer "Tests passed"
narrate --provider system --id Samantha "Local fallback"
echo "Long output" | narrate --quiet
narrate verify          # doctor-style health snapshot
narrate verify --test   # also play one sample per configured provider (1 API call each)
```

### HTTP

Best for plugin code, webhooks, anything that can fetch.
```sh
curl -X POST http://localhost:8888/notify \
  -H 'Content-Type: application/json' \
  -H 'X-Narrate-Client-Id: my-app' \
  -d '{"message":"Build green","voice":"engineer"}'
```

### MCP

Best for AI agents with native tool calling. The agent itself decides when to speak.
```sh
# Claude Code one-liner
claude mcp add narrate \
  --transport http \
  --url http://localhost:8888/mcp \
  --header "X-Narrate-Client-Id: claude-code"
```

Or via `.mcp.json` in any HTTP MCP client (Cursor, Windsurf, VS Code, Cline):
```json
{
  "mcpServers": {
    "narrate": {
      "url": "http://localhost:8888/mcp",
      "headers": { "X-Narrate-Client-Id": "cursor" }
    }
  }
}
```

The agent now sees `narrate.speak`, `narrate.list_voices`, and `narrate.list_providers` as tools.
Per-harness recipes live under `integrations/`. Summary:

| Harness | Method | Recipe |
|---|---|---|
| Claude Code | MCP (recommended) or Stop hook | `integrations/claude-code/` |
| Cursor / Windsurf / Cline | MCP | `integrations/cursor/` |
| OpenCode | Plugin (`@opencode-ai/plugin`) | `integrations/opencode/` |
| Pi (pi-mono) | `agent.subscribe('turn_end')` | `integrations/pi/` |
| ChatGPT Codex CLI | Wrapper script | `integrations/codex/` |
| Shell scripts / cron / CI | Direct CLI | `integrations/shell/` |
## Provider setup detail

### ElevenLabs

1. Sign up at elevenlabs.io → API Keys → create a key.
2. Add it to `~/.env`:
   ```sh
   echo 'ELEVENLABS_API_KEY=your_key' >> ~/.env
   ```
3. Voice IDs: find them at elevenlabs.io/voice-lab (each voice's URL ends in its ID).
4. Add to `voices.json`: `"rachel": { "provider": "elevenlabs", "voice_id": "21m00Tcm4TlvDq8ikWAM" }`
### OpenAI

1. Get a key at platform.openai.com/api-keys.
2. Add it to `~/.env`:
   ```sh
   echo 'OPENAI_API_KEY=sk-...' >> ~/.env
   ```
3. Six built-in voices (no IDs to look up): `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`.
4. Optional `providerConfig`: `{ "model": "tts-1-hd", "speed": 1.2 }` for higher quality / faster speech. Preset example: `"narrator": { "provider": "openai", "voice_id": "fable", "providerConfig": { "model": "tts-1-hd" } }`
### Google Gemini

1. Get a key at aistudio.google.com/apikey.
2. Add it to `~/.env`:
   ```sh
   echo 'GEMINI_API_KEY=...' >> ~/.env
   ```
3. Install `ffmpeg` (Gemini returns raw PCM that narrate converts to WAV):
   ```sh
   brew install ffmpeg       # macOS
   sudo apt install ffmpeg   # Linux
   ```
4. Voice names: `Kore`, `Puck`, `Charon`, `Fenrir`, `Aoede` (and others — see the Gemini docs).
### xAI Grok

1. Get a key at console.x.ai.
2. Add it to `~/.env`:
   ```sh
   echo 'XAI_API_KEY=...' >> ~/.env
   ```
3. Voice IDs: `eve`, `ara`, `rex`, `sal`, `leo`.
4. Optional env vars: `XAI_LANGUAGE=auto` (default); `XAI_VOICE_ID=ara` sets the default voice.
### Voicebox

See the Voicebox deep dive. TL;DR:

```sh
"$NARRATE_DIR/examples/voicebox-install-macos.sh"
open /Applications/Voicebox.app
# wait for the Kokoro model download via Settings → Engines (or pick another engine)
"$NARRATE_DIR/examples/voicebox-create-profile.sh"   # creates "Bella" profile
narrate --provider voicebox --id Bella "Local voice"
```

### System (say / espeak)

Zero config on macOS — `say` is built in. On Linux, install espeak-ng:

```sh
sudo apt install espeak-ng   # Debian/Ubuntu
sudo dnf install espeak-ng   # Fedora
```

Voice names: any voice your system speaks. On macOS:

```sh
say -v '?'   # list all installed voices
narrate --provider system --id Samantha "macOS Samantha"
narrate --provider system --id "Daniel" "British Daniel"
```

## Voicebox deep dive — local voice cloning

Voicebox is a local-first desktop app that runs TTS engines on your GPU. narrate uses it as a provider — your agent calls `narrate.speak`, narrate proxies to voicebox, and voicebox plays the audio.
"$NARRATE_DIR/examples/voicebox-install-macos.sh"(Or download manually from voicebox.sh and drag to /Applications.)
Voicebox has two concepts:

- Engine = the underlying TTS model (Kokoro, Qwen, Chatterbox, TADA, LuxTTS). Each engine ships preset voices.
- Profile = a usable voice instance, either created from a preset or cloned from audio.

`/speak` only accepts profile names — preset voices have to be promoted to profiles first. Do it via the UI, or with the helper:

```sh
"$NARRATE_DIR/examples/voicebox-create-profile.sh"                        # creates "Bella" from kokoro/af_bella
"$NARRATE_DIR/examples/voicebox-create-profile.sh" Adam kokoro am_adam en
"$NARRATE_DIR/examples/voicebox-create-profile.sh" Dora kokoro ef_dora es
"$NARRATE_DIR/examples/voicebox-create-profile.sh" George kokoro bm_george en
```

Kokoro voices are flexible: the same profile can speak any of Kokoro's 8 languages depending on what language you pass to `/speak`. Voices are style vectors at the model level — they describe a timbre, not a language. Pointing them at a different language is supported.
- A `kokoro/ef_dora`-backed profile created with `language: "es"` speaks natural Spanish.
- The same Dora profile asked to speak `language: "en"` speaks English with a Spanish accent (her trained timbre + English phonetics).
- A `kokoro/af_bella`-backed profile (en-trained) asked to speak `language: "es"` speaks Spanish with Bella's American timbre but proper Spanish phonetics — this is the way to make Bella speak Spanish naturally.
- narrate's voicebox provider resolves `profile.language` automatically (cached 60 s) as the default. Override per call with `--language es` (CLI), `providerConfig.language: "es"` (POST body or voices.json), or pin a preset:
"bella_es": {
"provider": "voicebox",
"voice_id": "Bella",
"providerConfig": { "language": "es" }
}50 presets total. Some highlights:
| Preset | Name | Language / accent |
|---|---|---|
| `af_bella`, `af_nova`, `af_sky`, `af_nicole` | various | en-female (US) |
| `am_adam`, `am_onyx`, `am_echo` | Adam, Onyx, Echo | en-male (US) |
| `bf_emma`, `bf_alice` | Emma, Alice | en-female (UK) |
| `bm_george`, `bm_daniel` | George, Daniel | en-male (UK) |
| `ef_dora`, `em_alex` | Dora, Alex | es female / male |
| `ff_siwis` | Siwis | fr female |
| `hf_alpha`, `hm_omega` | various | hi female / male |
| `jf_alpha`, `jm_kumo` | various | ja female / male |
| `zf_xiaoxiao`, others | various | zh female |
Full list: curl http://127.0.0.1:17493/profiles/presets/kokoro.
## voices.json — voice presets

Map a friendly name to a (provider, voice_id, options) triple so you can swap providers without touching agent code.
```json
{
  "default_voice": "fred",
  "default_rate": 175,
  "voices": {
    "fred":       { "provider": "elevenlabs", "voice_id": "s3TPKV1kjDlVtZbl4Ksh" },
    "researcher": { "provider": "openai", "voice_id": "nova" },
    "engineer":   { "provider": "openai", "voice_id": "alloy" },
    "narrator":   { "provider": "openai", "voice_id": "fable",
                    "providerConfig": { "model": "tts-1-hd" } },
    "ara":        { "provider": "xai", "voice_id": "ara" },
    "kore":       { "provider": "gemini", "voice_id": "Kore" },
    "bella":      { "provider": "voicebox", "voice_id": "Bella" },
    "dora":       { "provider": "voicebox", "voice_id": "Dora" },
    "samantha":   { "provider": "system", "voice_id": "Samantha" }
  }
}
```

Use it with the preset name: `narrate --voice dora "Hola"`.
If your voices.json only has `voice_name` per entry (no `provider` field), narrate assumes `provider: "system"` (the v1 schema was macOS-`say`-only). You'll see a one-line warning at startup.
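That fallback can be sketched as follows — a simplified model of the loader's v1→v2 normalization, not narrate's exact code (the helper name and types are illustrative):

```typescript
// v1 entries look like { voice_name: "Samantha" }; v2 entries carry an explicit
// provider + voice_id. Normalize by defaulting a missing provider to "system".
type RawEntry = { provider?: string; voice_id?: string; voice_name?: string };
type Entry = { provider: string; voice_id: string };

function normalizeEntry(raw: RawEntry): Entry {
  return {
    provider: raw.provider ?? "system",             // v1 schema implied macOS `say`
    voice_id: raw.voice_id ?? raw.voice_name ?? "", // voice_name is the legacy key
  };
}

const v1 = normalizeEntry({ voice_name: "Samantha" });
const v2 = normalizeEntry({ provider: "openai", voice_id: "nova" });
```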
Each provider accepts extra options under providerConfig:
| Provider | Useful keys |
|---|---|
| ElevenLabs | model_id, voice_settings: {stability, similarity_boost, style, use_speaker_boost} |
| OpenAI | model (tts-1 / tts-1-hd), speed (0.25–4.0) |
| Gemini | model |
| xAI | language, sample_rate, bit_rate, codec |
| Voicebox | language, instruct (Qwen CustomVoice natural-language delivery), personality (boolean), return_audio (use /generate instead of /speak) |
| System | rate |
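Per-call `providerConfig` keys (from the POST body or CLI flags) take priority over the preset's. A sketch of the likely merge semantics, assuming a shallow object spread (not narrate's exact code):

```typescript
// Shallow merge: keys supplied per call win over keys baked into the preset.
type Cfg = Record<string, unknown>;

function mergeProviderConfig(preset: Cfg | undefined, perCall: Cfg | undefined): Cfg {
  return { ...(preset ?? {}), ...(perCall ?? {}) };
}

const merged = mergeProviderConfig(
  { model: "tts-1-hd", speed: 1.0 }, // from voices.json preset
  { speed: 1.2 },                    // from the request body
);
```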
## CLI reference

```text
narrate [options] "text to speak"
narrate verify [--test]
echo "text" | narrate [options]

Options:
  -v, --voice NAME       Voice preset from voices.json (e.g. fred, researcher)
  -i, --id ID            Raw provider voice id (bypasses preset registry)
  -p, --provider NAME    elevenlabs | openai | gemini | xai | voicebox | system
  -l, --language LANG    Force generation language (e.g. es, en, ja, fr).
                         Useful with cross-language voices: a Kokoro Bella
                         (en-trained) speaks proper Spanish phonetics with
                         --language es, since Kokoro is multilingual at the
                         model level.
  --instruct TEXT        Natural-language delivery hint (Qwen CustomVoice
                         only). E.g. "warm conversational tone",
                         "broadcast news quality", "speak slowly with
                         emphasis". Other engines ignore this flag.
  -u, --url URL          Server URL (default http://localhost:8888)
  -q, --quiet            Suppress output
  -h, --help             Show help

Subcommands:
  verify         Health snapshot — server status, provider matrix, voices
  verify --test  Also play one sample per configured provider (1 API call each)

Env:
  NARRATE_URL    Override default server URL
  NARRATE_VOICE  Default preset (fallback for omitted --voice)
```
`--language` and `--instruct` forward as `providerConfig.{language,instruct}` and override both the preset's `providerConfig` and the voicebox provider's auto-resolved profile defaults.
```sh
# Bella is en-trained, but Kokoro can aim her at Spanish phonetics:
narrate --provider voicebox --id Bella --language es "Hola, soy Bella en español"

# Qwen Ryan with delivery direction:
narrate --provider voicebox --id Ryan --instruct "broadcast news quality" "Headlines tonight"
```

## HTTP API reference

### POST /notify

Speak text. Returns immediately; audio plays asynchronously.
Body:

| Field | Type | Required | Notes |
|---|---|---|---|
| `message` | string | yes | Up to 5000 chars, no control characters |
| `voice` | string | no | Preset name from voices.json |
| `voice_id` | string | no | Raw provider voice id (bypasses presets) |
| `voice_name` | string | no | Legacy alias for `voice_id` |
| `provider` | string | no | Override default provider |
| `voice_enabled` | boolean | no (default `true`) | If false, returns `{status: "ok", message: "voice_enabled=false; nothing to do"}` |
| `providerConfig` | object | no | Per-provider passthrough config (see provider table above) |
Headers:

| Header | Purpose |
|---|---|
| `X-Narrate-Client-Id` | Client identifier (logged + future per-client routing) |
Response (200):

```json
{ "status": "success", "provider": "openai", "voice": "alloy", "format": "mp3", "delegated": false }
```

`delegated: true` means the provider played the audio itself (voicebox, system) and narrate skipped local playback.
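From TypeScript, building the call and reading the `delegated` flag can be sketched like this (the helper names and the `my-app` client id are illustrative; assumes the default port):

```typescript
// Build fetch arguments for POST /notify as pure data, so the payload can be
// inspected without a running server; then interpret the response shape above.
type NotifyResponse = {
  status: string; provider: string; voice: string; format: string; delegated: boolean;
};

function buildNotifyRequest(message: string, voice?: string, base = "http://localhost:8888") {
  return {
    url: `${base}/notify`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-Narrate-Client-Id": "my-app", // shows up in logs/narrate.log
      },
      body: JSON.stringify({ message, voice }),
    },
  };
}

function playedLocally(res: NotifyResponse): boolean {
  // delegated=true -> the provider (voicebox, system) already played the audio
  return res.status === "success" && !res.delegated;
}

// Usage: const { url, init } = buildNotifyRequest("Build green", "engineer"); await fetch(url, init);
const req = buildNotifyRequest("Build green", "engineer");
const local = playedLocally({ status: "success", provider: "openai", voice: "nova", format: "mp3", delegated: false });
```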
### POST /pai

Legacy alias for /notify (PAI Voice compatibility).
### GET /health

Server + provider snapshot.
```json
{
  "status": "healthy",
  "port": 8888,
  "default_provider": "xai",
  "default_voice": "ara",
  "voices_path": "/Users/you/.config/narrate/voices.json",
  "voices": ["fred", "researcher", "engineer", ...],
  "providers": {
    "elevenlabs": { "configured": true },
    "openai": { "configured": true },
    "gemini": { "configured": true },
    "xai": { "configured": true },
    "voicebox": { "configured": true },
    "system": { "configured": true }
  }
}
```

### GET /voices

Full voices.json contents.
```json
{
  "default_voice": "fred",
  "default_rate": 175,
  "voices": { "fred": { ... }, "researcher": { ... } }
}
```

### POST /mcp

MCP Streamable HTTP endpoint. JSON-RPC 2.0. See MCP tools reference.
## MCP tools reference

Three tools available via the MCP server at `/mcp`:

```text
narrate.speak({
  text: string,       // required, max 5000
  voice?: string,     // preset name from voices.json
  voice_id?: string,  // raw provider voice id
  provider?: "elevenlabs" | "openai" | "gemini" | "xai" | "voicebox" | "system"
}) -> "Spoken via <provider> (voice=<voice>, format=<fmt>, delegated playback)"
```

```text
narrate.list_voices() -> Array<{ name, provider, voice_id, description }>
```

Returns all voice presets from voices.json.

```text
narrate.list_providers() -> Array<{ name, label, configured, reason? }>
```

Returns the provider health matrix — the same data as `GET /health`'s `providers` field.
```sh
# tools/list
curl -X POST http://localhost:8888/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'

# tools/call
curl -X POST http://localhost:8888/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"speak","arguments":{"text":"Hello","voice":"researcher"}}}'
```

## Configuration precedence

Higher rows win. narrate reads each layer at startup; mid-flight changes need a server restart.
| # | Layer | Used for |
|---|---|---|
| 1 | CLI flags / POST body / MCP tool args | per-call provider, voice, providerConfig |
| 2 | `~/.config/narrate/config.json` | `default_provider`, `default_voice`, `port`, `voices_path` |
| 3 | `NARRATE_*` env vars | `NARRATE_PORT`, `NARRATE_PROVIDER`, `NARRATE_VOICE`, `NARRATE_VOICES_PATH`, `NARRATE_URL` (CLI only) |
| 4 | `~/.claude/settings.json` (legacy compat) | `TTS_PROVIDER` and `DA_VOICE_ID`/`NARRATE_VOICE_ID` are read for backward compat |
| 5 | `~/.env` | API keys (`ELEVENLABS_API_KEY`, etc.) auto-loaded if present |
| 6 | Built-in defaults | `port: 8888`, `default_provider: "elevenlabs"`, `default_rate: 175` |
API keys come from process.env (loaded from your shell or auto-loaded from ~/.env). Never put them in config.json or voices.json.
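The chain above amounts to "first defined value wins, walking the layers top to bottom" — sketched here as a generic resolver (illustrative, not narrate's actual code; the concrete values are placeholders):

```typescript
// Return the first layer that actually defines a value.
function resolve<T>(...layers: (T | undefined)[]): T | undefined {
  return layers.find((v) => v !== undefined);
}

const provider = resolve<string>(
  undefined,     // 1. per-call flag / POST body / MCP arg (none given)
  "openai",      // 2. config.json default_provider
  undefined,     // 3. NARRATE_PROVIDER env var (unset here)
  undefined,     // 4. legacy ~/.claude/settings.json shim
  "elevenlabs",  // 6. built-in default
);
```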
## Run as a service

### macOS (launchd)

```sh
brew services start narrate                 # if installed via Homebrew
"$NARRATE_DIR/service/launchd/install.sh"   # if installed via curl/git
```

The installer:

1. Renders `com.narrate.server.plist` from a template (`$HOME` and `$NARRATE_DIR` substituted at install time, with a static `PATH` of `<bun_dir>:/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin`).
2. Drops it at `~/Library/LaunchAgents/`.
3. Loads it with `launchctl`.
4. Verifies it's running.

To remove:

```sh
brew services stop narrate
"$NARRATE_DIR/service/launchd/uninstall.sh"
```

### Linux (systemd)

```sh
"$NARRATE_DIR/service/systemd/install.sh"
```

Installs as a user service (`~/.config/systemd/user/narrate.service`) and runs `systemctl --user enable --now`.
To remove:

```sh
"$NARRATE_DIR/service/systemd/uninstall.sh"
```

## Logging and observability

| File | What |
|---|---|
| `logs/narrate.log` | All requests, with timestamp, provider, voice, latency, client id |
| `logs/narrate-error.log` | Errors |
| `logs/launchd-stdout.log` | Pre-init startup output (small, only grows on crashes) |
| `logs/launchd-stderr.log` | Same for stderr |
```sh
# follow live request log (resolve the path via /health if you don't know it)
LOGS_DIR="$(curl -s localhost:8888/health | python3 -c 'import sys,json;print(json.load(sys.stdin)["logs_dir"])')"
tail -f "$LOGS_DIR/narrate.log"

# or if you set $NARRATE_DIR per "Where things live":
tail -f "$NARRATE_DIR/logs/narrate.log"

# example lines
2026-04-27T23:44:36.733Z [/notify] → provider=voicebox voice=Dora bytes=42 from=localhost client=- ua=Bun/1.2.10
2026-04-27T23:44:36.755Z [/notify] ✅ 25ms provider=voicebox voice=Dora format=mp3 delegated=true
```

In-process rotation. Defaults: 10 MiB per file, keep the last 5 (`narrate.log` → `narrate.log.1` → ... → `narrate.log.5`).
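The suffix shift described above can be sketched as a standalone helper (hypothetical code for illustration, not narrate's actual logger):

```typescript
// When the log reaches maxBytes: drop the oldest suffix, shift .N -> .N+1,
// then move the live file to .1 so the caller can reopen a fresh log.
import * as fs from "node:fs";
import * as path from "node:path";
import * as os from "node:os";

function rotateIfNeeded(logPath: string, maxBytes: number, keep: number): boolean {
  if (!fs.existsSync(logPath) || fs.statSync(logPath).size < maxBytes) return false;
  const oldest = `${logPath}.${keep}`;
  if (fs.existsSync(oldest)) fs.unlinkSync(oldest); // falls off the end
  for (let i = keep - 1; i >= 1; i--) {
    const from = `${logPath}.${i}`;
    if (fs.existsSync(from)) fs.renameSync(from, `${logPath}.${i + 1}`);
  }
  fs.renameSync(logPath, `${logPath}.1`);
  return true; // caller reopens a fresh logPath
}

// demo in a temp dir: a 1 KiB file with a 512-byte limit must rotate
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "narrate-demo-"));
const log = path.join(dir, "narrate.log");
fs.writeFileSync(log, "x".repeat(1024));
const rotated = rotateIfNeeded(log, 512, 5);
```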
```sh
# tune via env (read once at server start)
NARRATE_LOG_MAX_BYTES=20971520 NARRATE_LOG_KEEP=10 narrate-server

# disable entirely (use raw stdout/stderr — useful for `bun run` dev mode)
NARRATE_LOG_DISABLED=1 narrate-server
```

### narrate verify

```sh
narrate verify
narrate verify --test   # also play 1 sample per configured provider
```

Prints server health, default provider/voice, voices file path, preset list, and per-provider configured/reason status.
## Architecture

```text
┌────────────────────────────────────────────────────────────┐
│  narrate (Bun process)                                     │
│                                                            │
│  HTTP server (port 8888)                                   │
│   ├─ POST /notify    POST /pai (legacy)                    │
│   ├─ GET /health     GET /voices                           │
│   └─ POST /mcp (MCP Streamable HTTP)                       │
│                                                            │
│        │                                                   │
│        ▼                                                   │
│   handleNotify()                                           │
│        │                                                   │
│        ▼                                                   │
│   Provider registry (ALL_PROVIDERS)                        │
│   ┌──────────────┬──────────────┬────────────┐             │
│   │ ElevenLabs   │ OpenAI       │ Gemini     │ cloud       │
│   ├──────────────┼──────────────┼────────────┤             │
│   │ xAI          │ Voicebox     │ System     │ cloud/local │
│   └──────────────┴──────────────┴────────────┘             │
│        │                                                   │
│        ▼                                                   │
│   ArrayBuffer (or delegated=true)                          │
│        │                                                   │
│        ▼                                                   │
│   playback.ts → afplay (macOS) / ffplay (Linux)            │
└────────────────────────────────────────────────────────────┘
```
Each `Provider` (in `src/providers/`) implements a small interface:
```typescript
interface Provider {
  name: string;
  label: string;
  health(): Promise<ProviderHealth>;
  generateSpeech(text: string, voice: string, opts?: ProviderOptions): Promise<AudioResult>;
  listVoices?(): Promise<VoiceInfo[]>;
}
```

Provider implementations talk to their respective APIs (or local services like voicebox on :17493). The result is either an `ArrayBuffer` (cloud — narrate plays it locally via playback.ts) or `delegated: true` (voicebox, system — they handled playback themselves).
The MCP server is a thin wrapper: it registers narrate.speak, narrate.list_voices, narrate.list_providers as tools, and the speak tool calls the same handleNotify function as the HTTP handler. One code path, three interfaces.
## Project layout

```text
narrate/
├── src/
│   ├── providers/
│   │   ├── base.ts          # Provider interface, types
│   │   ├── elevenlabs.ts
│   │   ├── openai.ts
│   │   ├── gemini.ts
│   │   ├── xai.ts
│   │   ├── voicebox.ts
│   │   ├── system.ts
│   │   └── index.ts         # registry
│   ├── voices.ts            # voices.json loader (v1 → v2 compat)
│   ├── config.ts            # XDG config + env vars + ~/.claude/settings.json shim
│   ├── playback.ts          # afplay / ffplay
│   ├── logger.ts            # rotating file logger
│   ├── mcp.ts               # MCP server (Streamable HTTP)
│   ├── server.ts            # HTTP server
│   └── cli.ts               # narrate CLI
├── integrations/            # one folder per harness with real refs
│   ├── claude-code/
│   ├── opencode/
│   ├── pi/
│   ├── codex/
│   ├── cursor/
│   └── shell/
├── service/
│   ├── launchd/             # macOS install + plist template
│   └── systemd/             # Linux install + unit template
├── examples/
│   ├── config.example.json
│   ├── voicebox-install-macos.sh
│   └── voicebox-create-profile.sh
├── voices.json.example
├── install.sh               # curl install entry point
├── package.json
├── tsconfig.json
├── README.md
├── CHANGELOG.md
├── LICENSE
└── .github/workflows/       # CI (TBD)
```
## narrate vs voicebox

Voicebox is a full local-first TTS studio with on-device inference, voice cloning, dictation, an MCP server, and 7 local engines. It's a desktop app.

narrate is a thin gateway. They compose — voicebox is one of narrate's providers.

| | narrate | voicebox |
|---|---|---|
| Form factor | CLI + HTTP server + MCP | Desktop app (Tauri) |
| Engines | Cloud + voicebox proxy + system | 7 local engines (MLX/CUDA) |
| Voice cloning | No (uses provider voices) | Yes (zero-shot) |
| Dictation (STT) | No | Yes (Whisper hotkey) |
| MCP server | Yes (`/mcp`) | Yes (`/mcp` on :17493) |
| Footprint | < 1 MB + bun | GB of models |
| Best for | Drop into any agent or shell | Privacy-first studio workflows |
Use narrate when you want one command that any harness or shell can call, mixing cloud and local providers. Use voicebox when you want fully local, GPU-accelerated voice. Use both when you want voicebox's quality plus narrate's harness-agnostic gateway.
## Roadmap

| Status | Item |
|---|---|
| ✅ v0.1.0 | 6 providers, CLI, HTTP server, voices.json v2, launchd + systemd |
| ✅ v0.2.0 | Per-request observability, narrate verify, real OpenCode + Pi integrations, voicebox install helper |
| ✅ v0.3.0 | MCP server (/mcp), curl install script, Homebrew tap, voicebox profile helper, multi-language fix |
| ✅ v0.3.1 | In-process log rotation |
| ✅ v0.3.2 | Voicebox instruct passthrough (Qwen natural-language delivery) |
| ✅ v0.3.3 | CLI --language and --instruct flags |
| ✅ v0.3.4 | SwiftBar / xbar menubar plugin |
| ✅ v0.3.5 | Portability fixes — /health exposes repo_dir/logs_dir, plugin auto-locates, SwiftBar Login Items autostart, plist drops $PATH snapshot |
| ✅ v0.3.6 | First-run UX: default provider is system so fresh installs work without API keys. README rewritten for non-technical users with a 3-command quickstart at the top. |
| Planned v0.4 | Pre-built single-binary releases (bun build --compile per platform) |
| Planned v0.5 | More providers (Cartesia, Hume EVI, Azure TTS) |
| Planned v0.6 | --direct CLI mode (skip server, call providers directly) |
| Planned v0.7 | Streaming TTS over WebSocket |
| Planned v0.8 | Auth tokens for /notify and /mcp (currently localhost-only) |
| Planned v1.0 | Test suite, GitHub Actions CI, npm publish |
## Troubleshooting

**A provider shows ⚪ not configured in `narrate verify`:**

- Cloud provider: API key env var not set. Check with `grep <PROVIDER>_API_KEY ~/.env`. Restart the server after adding a key (`brew services restart narrate`, or relaunch the LaunchAgent).
- Voicebox: app not running, or running on a non-default port. Open `/Applications/Voicebox.app`. If it's on a different port, set `VOICEBOX_URL=http://127.0.0.1:NNNNN`.
- System on Linux: install `espeak-ng`.
**`--id Samantha` goes to the wrong provider:** the default provider is whatever `~/.claude/settings.json` says (or `default_provider` in config.json). When you pass `--id Samantha` without `--provider system`, narrate uses the default provider — which doesn't know about Samantha. Either:

- `narrate --provider system --id Samantha "..."` (explicit provider)
- `narrate --voice samantha "..."` (preset that bundles provider + voice id)
**Voicebox speaks the wrong language:** solved in v0.3.0 (aede995) — voicebox's `/speak` doesn't auto-pull language from the profile; it defaults to "en". narrate now resolves and passes `profile.language` automatically. If it's still wrong, force it via `providerConfig.language`:

```json
"dora_es": {
  "provider": "voicebox", "voice_id": "Dora",
  "providerConfig": { "language": "es" }
}
```

**Two narrate binaries on PATH:** if you both `brew install`ed narrate AND ran the curl install, you have `/opt/homebrew/bin/narrate` and `~/.local/bin/narrate`. Both work; PATH order decides which wins. Pick one and remove the other.
Tune rotation:

```sh
# in your shell init or LaunchAgent EnvironmentVariables
NARRATE_LOG_MAX_BYTES=2097152   # 2 MiB
NARRATE_LOG_KEEP=3
```

Or disable entirely:

```sh
NARRATE_LOG_DISABLED=1
```

Already fixed in v0.3.0 (a5aaa14). If you see this error, your local install is pre-fix — pull main and reload.
## Contributing

```sh
git clone https://github.com/felores/narrate.git
cd narrate
bun install
bun run --watch src/server.ts     # hot-reload dev mode
./node_modules/.bin/tsc --noEmit  # typecheck
```

To add a new TTS provider:

1. Create `src/providers/<name>.ts` implementing the `Provider` interface from `src/providers/base.ts`.
2. Register it in `src/providers/index.ts`.
3. Add an integration test to `narrate verify --test` (the `sampleVoiceFor` map).
4. Document it in this README's Provider setup detail.
PRs welcome. Issues: https://github.com/felores/narrate/issues
## License

MIT — see LICENSE.