
narrate

Make your AI agents and scripts speak. One command, six TTS providers, zero lock-in.

60-second quickstart (macOS)

Hear narrate speak in three commands — no API keys, no signup:

brew install felores/narrate/narrate
brew services start narrate
narrate "Hello, narrate"

That's it. Uses your built-in macOS voice. Want studio-quality voices? Add an API key — it's optional.

Linux? curl -fsSL https://raw.githubusercontent.com/felores/narrate/main/install.sh | bash, then sudo apt install espeak-ng, then narrate-server & and narrate "hello".



Add an API key

Optional. The default macOS voice works fine for notifications, but premium providers sound dramatically better. Pick one (or several):

| Provider | Where to get the key | Cost |
| --- | --- | --- |
| ElevenLabs | elevenlabs.io | free tier, premium voices |
| OpenAI | platform.openai.com/api-keys | pay-per-use, very cheap |
| Google Gemini | aistudio.google.com/apikey | free tier |
| xAI | console.x.ai | pay-per-use |

Then add the key(s) to ~/.env and switch the default provider:

echo 'OPENAI_API_KEY=sk-...' >> ~/.env       # any subset works
echo 'ELEVENLABS_API_KEY=...' >> ~/.env

mkdir -p ~/.config/narrate
echo '{"default_provider":"openai","default_voice":"nova"}' > ~/.config/narrate/config.json

brew services restart narrate
narrate "Now I sound much better"

narrate verify shows you which providers are configured. See Provider setup detail for per-provider voice IDs.

Why ~/.env, not ~/.zshrc? Background services (brew services, LaunchAgent, systemd) don't run shell init. ~/.env is the only path that works for both CLI and the server-as-service.
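
To see why a plain `KEY=VALUE` file works where shell init doesn't, here's a minimal sketch of how any process (shell or not) can load such a file line by line. This illustrates the idea only; narrate's actual loader may differ.

```shell
# Sketch: load a KEY=VALUE file without running shell init.
# Uses a temp file as a stand-in for ~/.env in this demo.
ENV_FILE="$(mktemp)"
printf 'OPENAI_API_KEY=sk-demo\n# a comment\n' > "$ENV_FILE"

while IFS='=' read -r key value; do
  case "$key" in
    ''|\#*) continue ;;            # skip blank lines and comments
  esac
  export "$key=$value"             # plain KEY=VALUE lines only
done < "$ENV_FILE"

echo "$OPENAI_API_KEY"             # the process now sees the key
rm -f "$ENV_FILE"
```

Because the file is just data, the CLI, a LaunchAgent, and a systemd unit can all read it the same way — no shell required.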


Why narrate

Every coding harness reinvents voice. ElevenLabs has a UI, OpenAI has an API, Cartesia has another API, Voicebox has its own MCP server — and each agent (Claude Code, OpenCode, Pi, Cursor, Cline) has its own way of plugging in. The result: shell scripts that hardcode one provider, hooks that break when you change agents, no shared concept of "voice".

narrate collapses the matrix:

  • One server, one set of API keys, one set of voice presets.
  • Three interfaces: HTTP for anything, CLI for shells, MCP for agents that speak the protocol.
  • Six providers behind a uniform Provider interface — including a proxy to Voicebox for fully local voice cloning.
  • Voice presets that abstract over providers (narrate --voice researcher works whether researcher is OpenAI Nova or Voicebox Morgan).
  • Drop into any harness: hook scripts, plugins, MCP — pick whichever your tool supports.

Providers

| Provider | Type | Auth | Notes |
| --- | --- | --- | --- |
| ElevenLabs | Cloud | ELEVENLABS_API_KEY | High quality, premium voices |
| OpenAI TTS | Cloud | OPENAI_API_KEY | alloy, echo, fable, onyx, nova, shimmer |
| Google Gemini TTS | Cloud | GEMINI_API_KEY | Multilingual, requires ffmpeg for PCM→WAV |
| xAI Grok TTS | Cloud | XAI_API_KEY | eve, ara, rex, sal, leo |
| Voicebox | Local proxy | none | Auto-detects on :17493 — voice cloning, 7 local engines, 23 languages |
| System (say / espeak) | Local | none | Zero-dep fallback, works offline |

Add any subset. narrate uses what you've configured and reports the rest as ⚪ not configured in narrate verify.

Install

macOS — Homebrew (recommended, one command)

brew install felores/narrate/narrate
brew services start narrate          # auto-start at login

That's everything. Bun is pulled in as a dependency. After this you can run narrate "hello" and you'll hear it.

Linux / macOS — curl install

Requires bun first (curl -fsSL https://bun.sh/install | bash).

curl -fsSL https://raw.githubusercontent.com/felores/narrate/main/install.sh -o /tmp/narrate-install.sh
bash /tmp/narrate-install.sh
"$HOME/.local/share/narrate/service/launchd/install.sh"   # macOS
"$HOME/.local/share/narrate/service/systemd/install.sh"   # Linux

Clones to ~/.local/share/narrate, writes wrappers to ~/.local/bin/{narrate,narrate-server}, then installs the auto-start service. Override paths via NARRATE_DIR, BIN_DIR, NARRATE_REF.

Development — git clone

git clone https://github.com/felores/narrate.git ~/Documents/GitHub/narrate
cd ~/Documents/GitHub/narrate
bun install
bun run src/server.ts &
bun run src/cli.ts verify

Where things live

Once installed, the repo + scripts are at one of these paths depending on the method you used:

| Install method | $NARRATE_DIR | Logs |
| --- | --- | --- |
| Homebrew | $(brew --prefix narrate)/libexec | $NARRATE_DIR/logs/narrate.log |
| curl install | ~/.local/share/narrate | $NARRATE_DIR/logs/narrate.log |
| git clone (dev) | wherever you cloned (e.g. ~/Documents/GitHub/narrate) | $NARRATE_DIR/logs/narrate.log |

Set it once in your shell init so the recipes below work copy-paste:

# pick the line that matches how you installed
export NARRATE_DIR="$(brew --prefix narrate)/libexec"   # brew
export NARRATE_DIR="$HOME/.local/share/narrate"         # curl
export NARRATE_DIR="$HOME/Documents/GitHub/narrate"     # git clone

The running server reports its own location at GET /health (repo_dir, logs_dir) — useful for plugins and tooling that need to self-locate.
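
A tool can self-locate by parsing that response. The sketch below uses a canned JSON string in place of a live `curl -s localhost:8888/health` call; the `repo_dir`/`logs_dir` field names come from this section, but the path values are hypothetical examples.

```shell
# Canned stand-in for: curl -s localhost:8888/health
HEALTH_JSON='{"status":"healthy","repo_dir":"/opt/homebrew/opt/narrate/libexec","logs_dir":"/opt/homebrew/opt/narrate/libexec/logs"}'

# Extract the two self-location fields with python3 (no jq needed)
REPO_DIR="$(printf '%s' "$HEALTH_JSON" | python3 -c 'import sys,json;print(json.load(sys.stdin)["repo_dir"])')"
LOGS_DIR="$(printf '%s' "$HEALTH_JSON" | python3 -c 'import sys,json;print(json.load(sys.stdin)["logs_dir"])')"

echo "repo: $REPO_DIR"
echo "logs: $LOGS_DIR"
```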

Configure

You can skip this entirely if the Add an API key section above covered your needs. This section is for named voice presets and per-provider tweaks.

Voice presets (voices.json)

Map a friendly name to a (provider, voice_id) pair so you can swap providers without touching agent code:

mkdir -p ~/.config/narrate
cp "$NARRATE_DIR/voices.json.example" ~/.config/narrate/voices.json
narrate --voice researcher "Findings ready"   # uses the preset from voices.json

Edit ~/.config/narrate/voices.json to add your own presets. Full schema in voices.json — voice presets.

Custom defaults (config.json)

cat > ~/.config/narrate/config.json <<EOF
{
  "default_provider": "openai",
  "default_voice": "researcher",
  "port": 8888
}
EOF
brew services restart narrate

See Configuration precedence for the full resolution chain.

Quickstart by interface

narrate exposes three interfaces. Pick whichever your tool supports.

CLI — narrate "..."

Best for shells, hooks, scripts, cron, terminal one-offs.

narrate "Build complete"
narrate --voice engineer "Tests passed"
narrate --provider system --id Samantha "Local fallback"
echo "Long output" | narrate --quiet
narrate verify              # doctor-style health snapshot
narrate verify --test       # also play one sample per configured provider (1 API call each)

HTTP — POST localhost:8888/notify

Best for plugin code, webhooks, anything that can fetch.

curl -X POST http://localhost:8888/notify \
  -H 'Content-Type: application/json' \
  -H 'X-Narrate-Client-Id: my-app' \
  -d '{"message":"Build green","voice":"engineer"}'

MCP — narrate.speak(...)

Best for AI agents with native tool calling. The agent itself decides when to speak.

# Claude Code one-liner
claude mcp add narrate \
  --transport http \
  --url http://localhost:8888/mcp \
  --header "X-Narrate-Client-Id: claude-code"

Or via .mcp.json in any HTTP MCP client (Cursor, Windsurf, VS Code, Cline):

{
  "mcpServers": {
    "narrate": {
      "url": "http://localhost:8888/mcp",
      "headers": { "X-Narrate-Client-Id": "cursor" }
    }
  }
}

The agent now sees narrate.speak, narrate.list_voices, and narrate.list_providers as tools.

Use it from each harness

Per-harness recipes live under integrations/. Summary:

| Harness | Method | Recipe |
| --- | --- | --- |
| Claude Code | MCP (recommended) or Stop hook | integrations/claude-code/ |
| Cursor / Windsurf / Cline | MCP | integrations/cursor/ |
| OpenCode | Plugin (@opencode-ai/plugin) | integrations/opencode/ |
| Pi (pi-mono) | agent.subscribe('turn_end') | integrations/pi/ |
| ChatGPT Codex CLI | Wrapper script | integrations/codex/ |
| Shell scripts / cron / CI | Direct CLI | integrations/shell/ |
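
For the shell/CI row, a wrapper that degrades gracefully when the server is down is a common pattern. This sketch uses the documented `/notify` endpoint and client-id header; the fallback-to-log behavior and function name are this example's own choices, and the inline JSON quoting is naive (fine for simple messages only).

```shell
# Hedged sketch: speak via narrate if reachable, otherwise log a line.
speak_or_log() {
  msg="$1"
  # note: naive JSON quoting — avoid quotes/backslashes in $msg
  if curl -sf --max-time 2 -X POST http://localhost:8888/notify \
       -H 'Content-Type: application/json' \
       -H 'X-Narrate-Client-Id: ci' \
       -d "{\"message\":\"$msg\"}" > /dev/null 2>&1; then
    echo "spoken: $msg"
  else
    echo "[narrate offline] $msg"
  fi
}

speak_or_log "Build green"
```

Because the CLI and HTTP interface are the same server, this works identically from cron, CI, or a git hook.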

Provider setup detail

ElevenLabs

  1. Sign up at elevenlabs.io → API Keys → create a key.
  2. echo 'ELEVENLABS_API_KEY=your_key' >> ~/.env
  3. Voice IDs: find them at elevenlabs.io/voice-lab (each voice's URL ends in its ID).
  4. Add to voices.json:
    "rachel": { "provider": "elevenlabs", "voice_id": "21m00Tcm4TlvDq8ikWAM" }

OpenAI TTS

  1. Get a key at platform.openai.com/api-keys.
  2. echo 'OPENAI_API_KEY=sk-...' >> ~/.env
  3. Six built-in voices (no IDs to look up): alloy, echo, fable, onyx, nova, shimmer.
  4. Optional providerConfig: { "model": "tts-1-hd", "speed": 1.2 } for higher quality / faster speech.
    "narrator": {
      "provider": "openai",
      "voice_id": "fable",
      "providerConfig": { "model": "tts-1-hd" }
    }

Google Gemini TTS

  1. Get a key at aistudio.google.com/apikey.
  2. echo 'GEMINI_API_KEY=...' >> ~/.env
  3. Install ffmpeg (Gemini returns raw PCM that we convert to WAV):
    brew install ffmpeg                     # macOS
    sudo apt install ffmpeg                 # Linux
  4. Voice names: Kore, Puck, Charon, Fenrir, Aoede (and others — see Gemini docs).

xAI Grok TTS

  1. Get a key at console.x.ai.
  2. echo 'XAI_API_KEY=...' >> ~/.env
  3. Voice IDs: eve, ara, rex, sal, leo.
  4. Optional: XAI_LANGUAGE=auto (default), XAI_VOICE_ID=ara set as default voice.

Voicebox (local)

See Voicebox deep dive. TLDR:

"$NARRATE_DIR/examples/voicebox-install-macos.sh"
open /Applications/Voicebox.app
# wait for Kokoro model download via Settings → Engines (or another engine)
"$NARRATE_DIR/examples/voicebox-create-profile.sh"     # creates "Bella" profile
narrate --provider voicebox --id Bella "Local voice"

System (say / espeak)

Zero config on macOS — say is built in. On Linux, install espeak-ng:

sudo apt install espeak-ng     # Debian/Ubuntu
sudo dnf install espeak-ng     # Fedora

Voice names: any voice your system speaks. On macOS:

say -v '?'                      # list all installed voices
narrate --provider system --id Samantha "macOS Samantha"
narrate --provider system --id "Daniel" "British Daniel"

Voicebox deep dive

Voicebox is a local-first desktop app that runs TTS engines on your GPU. narrate uses it as a provider — your agent calls narrate.speak, narrate proxies to voicebox, voicebox plays the audio.

Install

"$NARRATE_DIR/examples/voicebox-install-macos.sh"

(Or download manually from voicebox.sh and drag to /Applications.)

Engine vs profile (gotcha)

Voicebox has two concepts:

  • Engine = the underlying TTS model (Kokoro, Qwen, Chatterbox, TADA, LuxTTS). Each engine ships preset voices.
  • Profile = a usable voice instance, either created from a preset or cloned from audio.

/speak only accepts profile names — preset voices have to be promoted to profiles first. Do it via UI, or with the helper:

"$NARRATE_DIR/examples/voicebox-create-profile.sh"                          # creates "Bella" from kokoro/af_bella
"$NARRATE_DIR/examples/voicebox-create-profile.sh" Adam kokoro am_adam en
"$NARRATE_DIR/examples/voicebox-create-profile.sh" Dora kokoro ef_dora es
"$NARRATE_DIR/examples/voicebox-create-profile.sh" George kokoro bm_george en

Multi-language behavior

Kokoro voices are flexible: the same profile can speak any of Kokoro's 8 languages depending on what language you pass to /speak. Voices are style vectors at the model level — they describe a timbre, not a language. Pointing them at a different language is supported.

  • A kokoro/ef_dora-backed profile created with language: "es" speaks natural Spanish.
  • The same Dora profile asked to speak language: "en" speaks English with a Spanish accent (her trained timbre + English phonetics).
  • A kokoro/af_bella-backed profile (en-trained) asked to speak language: "es" speaks Spanish with Bella's American voice timbre but proper Spanish phonetics — this is the way to make Bella speak Spanish naturally.
  • narrate's voicebox provider resolves profile.language automatically (cached 60s) as the default. Override per-call with --language es (CLI), providerConfig.language: "es" (POST body or voices.json), or pin a preset:
"bella_es": {
  "provider": "voicebox",
  "voice_id": "Bella",
  "providerConfig": { "language": "es" }
}

Available Kokoro presets at a glance

50 presets total. Some highlights:

| Preset | Name | Language / accent |
| --- | --- | --- |
| af_bella, af_nova, af_sky, af_nicole | various | en-female (US) |
| am_adam, am_onyx, am_echo | Adam, Onyx, Echo | en-male (US) |
| bf_emma, bf_alice | Emma, Alice | en-female (UK) |
| bm_george, bm_daniel | George, Daniel | en-male (UK) |
| ef_dora, em_alex | Dora, Alex | es female / male |
| ff_siwis | Siwis | fr female |
| hf_alpha, hm_omega | various | hi female / male |
| jf_alpha, jm_kumo | various | ja female / male |
| zf_xiaoxiao, others | various | zh female |

Full list: curl http://127.0.0.1:17493/profiles/presets/kokoro.
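
The preset ids above follow a visible naming convention: the first letter encodes language/accent (a = US English, b = British English, e = Spanish, f = French, h = Hindi, j = Japanese, z = Chinese) and the second encodes gender (f/m). The decoder below is inferred from the highlights table — treat it as a convention, not an official Voicebox spec.

```shell
# Decode a Kokoro preset id from its two-letter prefix (inferred mapping).
kokoro_decode() {
  id="$1"
  prefix="${id%%_*}"                 # e.g. "af" from "af_bella"
  case "${prefix%?}" in              # first letter: language/accent
    a) lang="en-US" ;;
    b) lang="en-GB" ;;
    e) lang="es" ;;
    f) lang="fr" ;;
    h) lang="hi" ;;
    j) lang="ja" ;;
    z) lang="zh" ;;
    *) lang="unknown" ;;
  esac
  case "$prefix" in                  # second letter: gender
    ?f) gender="female" ;;
    ?m) gender="male" ;;
    *)  gender="unknown" ;;
  esac
  echo "$lang $gender"
}

kokoro_decode af_bella    # en-US female
kokoro_decode bm_george   # en-GB male
```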

voices.json — voice presets

Map a friendly name to a (provider, voice_id, options) triple so you can swap providers without touching agent code.

v2 schema (current)

{
  "default_voice": "fred",
  "default_rate": 175,
  "voices": {
    "fred":      { "provider": "elevenlabs", "voice_id": "s3TPKV1kjDlVtZbl4Ksh" },
    "researcher":{ "provider": "openai",     "voice_id": "nova"     },
    "engineer":  { "provider": "openai",     "voice_id": "alloy"    },
    "narrator":  { "provider": "openai",     "voice_id": "fable",
                   "providerConfig": { "model": "tts-1-hd" } },
    "ara":       { "provider": "xai",        "voice_id": "ara"      },
    "kore":      { "provider": "gemini",     "voice_id": "Kore"     },
    "bella":     { "provider": "voicebox",   "voice_id": "Bella"    },
    "dora":      { "provider": "voicebox",   "voice_id": "Dora"     },
    "samantha":  { "provider": "system",     "voice_id": "Samantha" }
  }
}

Use it with the preset name: narrate --voice dora "Hola".

v1 backward-compat

If your voices.json only has voice_name per entry (no provider field), narrate auto-assumes provider: "system" (the v1 schema was for macOS say). You'll see a one-line warning at startup.
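
If you'd rather upgrade the file than rely on the compat shim, a one-shot conversion mirrors what narrate assumes: any entry without a provider field becomes provider: "system". The v1 shape here (entries keyed by name with a voice_name field) is taken from the paragraph above; any other v1 fields are an assumption of this sketch.

```shell
# Sketch: convert a v1 voices.json to the v2 shape (assumes v1 entries
# carry only voice_name; narrate's own compat path may handle more).
V1="$(mktemp)"; V2="$(mktemp)"
cat > "$V1" <<'EOF'
{ "voices": { "samantha": { "voice_name": "Samantha" } } }
EOF

python3 - "$V1" "$V2" <<'EOF'
import json, sys
src, dst = sys.argv[1], sys.argv[2]
data = json.load(open(src))
for name, entry in data.get("voices", {}).items():
    if "provider" not in entry:
        entry["provider"] = "system"                  # v1 was macOS `say`
        entry["voice_id"] = entry.pop("voice_name", name)
json.dump(data, open(dst, "w"), indent=2)
EOF

cat "$V2"
```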

Per-preset providerConfig

Each provider accepts extra options under providerConfig:

| Provider | Useful keys |
| --- | --- |
| ElevenLabs | model_id, voice_settings: {stability, similarity_boost, style, use_speaker_boost} |
| OpenAI | model (tts-1 / tts-1-hd), speed (0.25–4.0) |
| Gemini | model |
| xAI | language, sample_rate, bit_rate, codec |
| Voicebox | language, instruct (Qwen CustomVoice natural-language delivery), personality (boolean), return_audio (use /generate instead of /speak) |
| System | rate |

CLI reference

narrate [options] "text to speak"
narrate verify [--test]
echo "text" | narrate [options]

Options:
  -v, --voice NAME      Voice preset from voices.json (e.g. fred, researcher)
  -i, --id ID           Raw provider voice id (bypasses preset registry)
  -p, --provider NAME   elevenlabs | openai | gemini | xai | voicebox | system
  -l, --language LANG   Force generation language (e.g. es, en, ja, fr).
                        Useful with cross-language voices: a Kokoro Bella
                        (en-trained) speaks proper Spanish phonetics with
                        --language es, since Kokoro is multilingual at the
                        model level.
  --instruct TEXT       Natural-language delivery hint (Qwen CustomVoice
                        only). E.g. "warm conversational tone",
                        "broadcast news quality", "speak slowly with
                        emphasis". Other engines ignore this flag.
  -u, --url URL         Server URL (default http://localhost:8888)
  -q, --quiet           Suppress output
  -h, --help            Show help

Subcommands:
  verify                Health snapshot — server status, provider matrix, voices
  verify --test         Also play one sample per configured provider (1 API call each)

Env:
  NARRATE_URL           Override default server URL
  NARRATE_VOICE         Default preset (fallback for omitted --voice)

--language and --instruct forward as providerConfig.{language,instruct} and override both preset providerConfig and the voicebox provider's auto-resolved profile defaults.

# Bella is en-trained, but Kokoro can aim her at Spanish phonetics:
narrate --provider voicebox --id Bella --language es "Hola, soy Bella en español"

# Qwen Ryan with delivery direction:
narrate --provider voicebox --id Ryan --instruct "broadcast news quality" "Headlines tonight"

HTTP API reference

POST /notify

Speak text. Returns immediately; audio plays asynchronously.

Body:

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| message | string | yes | Up to 5000 chars, no control characters |
| voice | string | no | Preset name from voices.json |
| voice_id | string | no | Raw provider voice id (bypasses presets) |
| voice_name | string | no | Legacy alias for voice_id |
| provider | string | no | Override default provider |
| voice_enabled | boolean | no (default true) | If false, returns {status: "ok", message: "voice_enabled=false; nothing to do"} |
| providerConfig | object | no | Per-provider passthrough config (see provider table above) |

Headers:

| Header | Purpose |
| --- | --- |
| X-Narrate-Client-Id | Client identifier (logged + future per-client routing) |

Response (200):

{ "status": "success", "provider": "openai", "voice": "alloy", "format": "mp3", "delegated": false }

delegated: true means the provider played the audio itself (voicebox, system) and narrate skipped local playback.

POST /pai

Legacy alias for /notify (PAI Voice compatibility).

GET /health

Server + provider snapshot.

{
  "status": "healthy",
  "port": 8888,
  "default_provider": "xai",
  "default_voice": "ara",
  "voices_path": "/Users/you/.config/narrate/voices.json",
  "voices": ["fred", "researcher", "engineer", ...],
  "providers": {
    "elevenlabs": { "configured": true },
    "openai": { "configured": true },
    "gemini": { "configured": true },
    "xai": { "configured": true },
    "voicebox": { "configured": true },
    "system": { "configured": true }
  }
}

GET /voices

Full voices.json contents.

{
  "default_voice": "fred",
  "default_rate": 175,
  "voices": { "fred": { ... }, "researcher": { ... } }
}

POST /mcp

MCP Streamable HTTP endpoint. JSON-RPC 2.0. See MCP tools reference.

MCP tools reference

Three tools available via the MCP server at /mcp:

speak

narrate.speak({
  text: string,                  // required, max 5000
  voice?: string,                // preset name from voices.json
  voice_id?: string,             // raw provider voice id
  provider?: "elevenlabs" | "openai" | "gemini" | "xai" | "voicebox" | "system"
}) -> "Spoken via <provider> (voice=<voice>, format=<fmt>, delegated playback)"

list_voices

narrate.list_voices() -> Array<{ name, provider, voice_id, description }>

Returns all voice presets from voices.json.

list_providers

narrate.list_providers() -> Array<{ name, label, configured, reason? }>

Returns the provider health matrix — same data as GET /health's providers field.

Discover via JSON-RPC

# tools/list
curl -X POST http://localhost:8888/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'

# tools/call
curl -X POST http://localhost:8888/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"speak","arguments":{"text":"Hello","voice":"researcher"}}}'

Configuration precedence

Higher rows win. narrate reads each layer at startup; mid-flight changes need a server restart.

| # | Layer | Used for |
| --- | --- | --- |
| 1 | CLI flags / POST body / MCP tool args | per-call provider, voice, providerConfig |
| 2 | ~/.config/narrate/config.json | default_provider, default_voice, port, voices_path |
| 3 | NARRATE_* env vars | NARRATE_PORT, NARRATE_PROVIDER, NARRATE_VOICE, NARRATE_VOICES_PATH, NARRATE_URL (CLI only) |
| 4 | ~/.claude/settings.json (legacy compat) | TTS_PROVIDER and DA_VOICE_ID/NARRATE_VOICE_ID are read for backward-compat |
| 5 | ~/.env | API keys (ELEVENLABS_API_KEY, etc.), auto-loaded if present |
| 6 | Built-in defaults | port: 8888, default_provider: "elevenlabs", default_rate: 175 |

API keys come from process.env (loaded from your shell or auto-loaded from ~/.env). Never put them in config.json or voices.json.
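
The precedence chain boils down to "first non-empty value wins, top to bottom". An illustrative resolver for default_provider — not narrate's actual code, just the table's logic made concrete:

```shell
# First non-empty of: per-call flag, config.json, NARRATE_PROVIDER env,
# built-in default ("elevenlabs", per the table's bottom row).
resolve_provider() {
  flag="$1" config="$2" envvar="$3"
  for candidate in "$flag" "$config" "$envvar" "elevenlabs"; do
    if [ -n "$candidate" ]; then
      echo "$candidate"
      return
    fi
  done
}

resolve_provider ""       "openai" "xai"   # config.json beats env var
resolve_provider "system" "openai" "xai"   # per-call flag beats everything
```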

Run as a service

macOS (launchd)

brew services start narrate              # if installed via Homebrew
"$NARRATE_DIR/service/launchd/install.sh" # if installed via curl/git

The installer:

  1. Renders com.narrate.server.plist from a template ($HOME and $NARRATE_DIR substituted at install time, with a static PATH of <bun_dir>:/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin).
  2. Drops it at ~/Library/LaunchAgents/.
  3. Loads it with launchctl.
  4. Verifies it's running.

To remove:

brew services stop narrate
"$NARRATE_DIR/service/launchd/uninstall.sh"

Linux (systemd)

"$NARRATE_DIR/service/systemd/install.sh"

Installs as a user service (~/.config/systemd/user/narrate.service) and runs systemctl --user enable --now.

To remove:

"$NARRATE_DIR/service/systemd/uninstall.sh"

Logging and observability

Live logs

| File | What |
| --- | --- |
| logs/narrate.log | All requests, with timestamp, provider, voice, latency, client id |
| logs/narrate-error.log | Errors |
| logs/launchd-stdout.log | Pre-init startup output (small, only grows on crashes) |
| logs/launchd-stderr.log | Same for stderr |

# follow live request log (resolve the path via /health if you don't know it)
LOGS_DIR="$(curl -s localhost:8888/health | python3 -c 'import sys,json;print(json.load(sys.stdin)["logs_dir"])')"
tail -f "$LOGS_DIR/narrate.log"

# or if you set $NARRATE_DIR per "Where things live":
tail -f "$NARRATE_DIR/logs/narrate.log"

# example lines
2026-04-27T23:44:36.733Z [/notify] → provider=voicebox voice=Dora bytes=42 from=localhost client=- ua=Bun/1.2.10
2026-04-27T23:44:36.755Z [/notify] ✅ 25ms provider=voicebox voice=Dora format=mp3 delegated=true
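
Those completed-request lines carry the latency as a bare `Nms` token, which makes them easy to mine. The sketch below runs against a copy of the two sample lines; the log line shape is taken from the example above, so adjust the match if your format differs.

```shell
# Write the sample lines to a scratch file (stand-in for narrate.log)
cat <<'EOF' > /tmp/narrate-sample.log
2026-04-27T23:44:36.733Z [/notify] → provider=voicebox voice=Dora bytes=42 from=localhost client=- ua=Bun/1.2.10
2026-04-27T23:44:36.755Z [/notify] ✅ 25ms provider=voicebox voice=Dora format=mp3 delegated=true
EOF

# Pull the Nms token from completed-request (✅) lines
LAT="$(awk '/✅/ { for (i = 1; i <= NF; i++) if ($i ~ /^[0-9]+ms$/) print $i }' /tmp/narrate-sample.log)"
echo "latency: $LAT"
```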

Log rotation

In-process rotation. Defaults: 10 MiB per file, keep the last 5 (narrate.log → narrate.log.1 → ... → narrate.log.5).

# tune via env (read once at server start)
NARRATE_LOG_MAX_BYTES=20971520 NARRATE_LOG_KEEP=10 narrate-server

# disable entirely (use raw stdout/stderr — useful for `bun run` dev mode)
NARRATE_LOG_DISABLED=1 narrate-server

narrate verify (doctor)

narrate verify
narrate verify --test    # also play 1 sample per configured provider

Prints server health, default provider/voice, voices file path, preset list, and per-provider configured/reason status.

Architecture

┌────────────────────────────────────────────────────────────┐
│                      narrate (Bun process)                 │
│                                                            │
│   HTTP server (port 8888)                                  │
│   ├─ POST /notify    POST /pai (legacy)                    │
│   ├─ GET  /health    GET  /voices                          │
│   └─ POST /mcp       (MCP Streamable HTTP)                 │
│                                                            │
│            │                                               │
│            ▼                                               │
│   handleNotify()                                           │
│            │                                               │
│            ▼                                               │
│   Provider registry  (ALL_PROVIDERS)                       │
│   ┌──────────────┬──────────────┬────────────┐             │
│   │ ElevenLabs   │ OpenAI       │ Gemini     │  cloud      │
│   ├──────────────┼──────────────┼────────────┤             │
│   │ xAI          │ Voicebox     │ System     │  cloud/local│
│   └──────────────┴──────────────┴────────────┘             │
│            │                                               │
│            ▼                                               │
│   ArrayBuffer  (or delegated=true)                         │
│            │                                               │
│            ▼                                               │
│   playback.ts → afplay (macOS) / ffplay (Linux)            │
└────────────────────────────────────────────────────────────┘

Each Provider (in src/providers/) implements a small interface:

interface Provider {
  name: string;
  label: string;
  health(): Promise<ProviderHealth>;
  generateSpeech(text: string, voice: string, opts?: ProviderOptions): Promise<AudioResult>;
  listVoices?(): Promise<VoiceInfo[]>;
}

Provider implementations talk to their respective APIs (or local services like voicebox :17493). The result is either an ArrayBuffer (cloud — narrate plays it locally via playback.ts) or delegated: true (voicebox, system — they handled playback themselves).

The MCP server is a thin wrapper: it registers narrate.speak, narrate.list_voices, narrate.list_providers as tools, and the speak tool calls the same handleNotify function as the HTTP handler. One code path, three interfaces.

Project layout

narrate/
├── src/
│   ├── providers/
│   │   ├── base.ts              # Provider interface, types
│   │   ├── elevenlabs.ts
│   │   ├── openai.ts
│   │   ├── gemini.ts
│   │   ├── xai.ts
│   │   ├── voicebox.ts
│   │   ├── system.ts
│   │   └── index.ts             # registry
│   ├── voices.ts                # voices.json loader (v1 → v2 compat)
│   ├── config.ts                # XDG config + env vars + ~/.claude/settings.json shim
│   ├── playback.ts              # afplay / ffplay
│   ├── logger.ts                # rotating file logger
│   ├── mcp.ts                   # MCP server (Streamable HTTP)
│   ├── server.ts                # HTTP server
│   └── cli.ts                   # narrate CLI
├── integrations/                # one folder per harness with real refs
│   ├── claude-code/
│   ├── opencode/
│   ├── pi/
│   ├── codex/
│   ├── cursor/
│   └── shell/
├── service/
│   ├── launchd/                 # macOS install + plist template
│   └── systemd/                 # Linux install + unit template
├── examples/
│   ├── config.example.json
│   ├── voicebox-install-macos.sh
│   └── voicebox-create-profile.sh
├── voices.json.example
├── install.sh                   # curl install entry point
├── package.json
├── tsconfig.json
├── README.md
├── CHANGELOG.md
├── LICENSE
└── .github/workflows/           # CI (TBD)

narrate vs voicebox

Voicebox is a full local-first TTS studio with on-device inference, voice cloning, dictation, MCP server, and 7 local engines. It's a desktop app.

narrate is a thin gateway. They compose — voicebox is one of narrate's providers.

| | narrate | voicebox |
| --- | --- | --- |
| Form factor | CLI + HTTP server + MCP | Desktop app (Tauri) |
| Engines | Cloud + voicebox proxy + system | 7 local engines (MLX/CUDA) |
| Voice cloning | No (uses provider voices) | Yes (zero-shot) |
| Dictation (STT) | No | Yes (Whisper hotkey) |
| MCP server | Yes (/mcp) | Yes (/mcp on :17493) |
| Footprint | < 1 MB + bun | GB of models |
| Best for | Drop into any agent or shell | Privacy-first studio workflows |

Use narrate when you want one command that any harness or shell can call, mixing cloud and local providers. Use voicebox when you want fully local, GPU-accelerated voice. Use both when you want voicebox's quality plus narrate's harness-agnostic gateway.

Roadmap

| Status | Item |
| --- | --- |
| ✅ v0.1.0 | 6 providers, CLI, HTTP server, voices.json v2, launchd + systemd |
| ✅ v0.2.0 | Per-request observability, narrate verify, real OpenCode + Pi integrations, voicebox install helper |
| ✅ v0.3.0 | MCP server (/mcp), curl install script, Homebrew tap, voicebox profile helper, multi-language fix |
| ✅ v0.3.1 | In-process log rotation |
| ✅ v0.3.2 | Voicebox instruct passthrough (Qwen natural-language delivery) |
| ✅ v0.3.3 | CLI --language and --instruct flags |
| ✅ v0.3.4 | SwiftBar / xbar menubar plugin |
| ✅ v0.3.5 | Portability fixes — /health exposes repo_dir/logs_dir, plugin auto-locates, SwiftBar Login Items autostart, plist drops $PATH snapshot |
| ✅ v0.3.6 | First-run UX: default provider is system so fresh installs work without API keys. README rewritten for non-technical users with a 3-command quickstart at the top. |
| Planned v0.4 | Pre-built single-binary releases (bun build --compile per platform) |
| Planned v0.5 | More providers (Cartesia, Hume EVI, Azure TTS) |
| Planned v0.6 | --direct CLI mode (skip server, call providers directly) |
| Planned v0.7 | Streaming TTS over WebSocket |
| Planned v0.8 | Auth tokens for /notify and /mcp (currently localhost-only) |
| Planned v1.0 | Test suite, GitHub Actions CI, npm publish |

Troubleshooting

narrate verify says provider X is ⚪ not configured

  • Cloud provider: API key env var not set. cat ~/.env | grep <PROVIDER>_API_KEY. Restart the server after adding (brew services restart narrate or relaunch LaunchAgent).
  • Voicebox: app not running, or running on a non-default port. Open /Applications/Voicebox.app. If on a different port, set VOICEBOX_URL=http://127.0.0.1:NNNNN.
  • System on Linux: install espeak-ng.

Server logs show [xai] 404 Voice 'Samantha' not found

The default provider is whatever ~/.claude/settings.json says (or default_provider in config.json). When you pass --id Samantha without --provider system, narrate uses the default provider — which doesn't know about Samantha. Either:

  • narrate --provider system --id Samantha "..." (explicit provider)
  • narrate --voice samantha "..." (preset that bundles provider + voice id)

Voicebox profile speaks the wrong language

Solved in v0.3.0 (aede995): voicebox's /speak doesn't auto-pull language from the profile, it defaults to "en". narrate now resolves and passes profile.language automatically. If still wrong, force it via providerConfig.language:

"dora_es": {
  "provider": "voicebox", "voice_id": "Dora",
  "providerConfig": { "language": "es" }
}

Two narrate binaries on PATH

If you both brew install narrate AND ran the curl install, you have /opt/homebrew/bin/narrate and ~/.local/bin/narrate. Both work; PATH order decides which wins. Pick one and remove the other.

Logs are massive

Tune rotation:

# in your shell init or LaunchAgent EnvironmentVariables
NARRATE_LOG_MAX_BYTES=2097152    # 2 MiB
NARRATE_LOG_KEEP=3

Or disable entirely:

NARRATE_LOG_DISABLED=1

"Stateless transport cannot be reused" on /mcp

Already fixed in v0.3.0 (a5aaa14). If you see this, your local install is pre-fix — pull main and reload.

Contributing

git clone https://github.com/felores/narrate.git
cd narrate
bun install
bun run --watch src/server.ts                      # hot-reload dev mode
./node_modules/.bin/tsc --noEmit                   # typecheck

To add a new TTS provider:

  1. Create src/providers/<name>.ts implementing the Provider interface from src/providers/base.ts.
  2. Register it in src/providers/index.ts.
  3. Add an integration test in narrate verify --test (the sampleVoiceFor map).
  4. Document it in this README's Provider setup detail.

PRs welcome. Issues: https://github.com/felores/narrate/issues

License

MIT — see LICENSE.

About

Provider-agnostic TTS gateway and CLI for AI coding harnesses — ElevenLabs, OpenAI, Gemini, xAI, Voicebox, system TTS. Works with Claude Code, OpenCode, ChatGPT Codex, Cursor, Windsurf, Cline.
