Voice for your AI coding assistant.
When Claude Code finishes a task, hits an error, or needs your approval --- you hear it. No need to watch the terminal. Keep working; your assistant will tell you what happened.
Platforms: macOS, Linux
Real samples generated by vox with ElevenLabs v3. The first three are the same recap with different /vibe moods --- expressive tags change how the voice sounds without changing the words.
| Sample | Vibe | Voice | Audio |
|---|---|---|---|
| Task recap | neutral | sarah | listen |
| Same recap | [excited] | sarah | listen |
| Same recap | [weary] [sighs] | sarah | listen |
| Task complete | neutral | matilda | listen |
curl -fsSL https://raw.githubusercontent.com/punt-labs/vox/3837b9c/install.sh | sh

Restart Claude Code, then:
/vox y # hear when tasks complete or need input
/recap # spoken summary of what just happened
Manual install (if you already have uv)
uv tool install punt-vox
vox install
vox doctor

Verify before running
curl -fsSL https://raw.githubusercontent.com/punt-labs/vox/3837b9c/install.sh -o install.sh
shasum -a 256 install.sh
cat install.sh
sh install.sh

- Notification layer --- spoken summaries when tasks finish, chimes when Claude needs input
- Session vibe --- /vibe sets the mood for all speech. Auto-mode reads session signals (test results, lint, git ops) and adapts the voice. Manual mode lets you set it yourself. ElevenLabs expressive tags ([weary], [excited], [sighs]) color every utterance.
- Five providers --- ElevenLabs, OpenAI, AWS Polly, macOS say, and Linux espeak-ng. The full experience (natural voice, expressive tags, /vibe) requires ElevenLabs.
- Opt-in only --- no audio until you enable it, no surprises
- Voice or chime --- /mute switches to audio tones, no TTS API calls
- Graceful absence --- if punt-vox isn't installed, Claude Code works exactly as before
- MCP-native --- runs as a Claude Code plugin with slash commands and hooks
- Daemon mode --- optional single-process daemon (vox serve) fronted by mcp-proxy. Eliminates per-session overhead, deduplicates audio across sessions, and drops hook latency from ~500ms to ~15ms
> /vox y
Vox enabled. You'll hear when tasks finish or need approval.
Pick a voice with /unmute @<name>.
> /recap
Speaking: "I refactored the authentication module into three files, added
comprehensive tests for the token refresh flow, and fixed a race condition
in the session middleware. All 47 tests pass."
> /vibe banging my head against the wall
Vibe: banging my head against the wall → [frustrated] [sighs] [manual]
Auto-mode (default) reads session signals and adapts automatically --- after a string of test failures the voice sounds [weary], after a successful release it sounds [excited].
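To make the auto-mode idea concrete, here is a minimal sketch of a signal accumulator that turns recent session events into an expressive tag. Everything in it is an assumption for illustration: the class name, the signal strings, the window size, and the threshold of three failures are hypothetical and are not taken from vox's implementation.

```python
from collections import deque


class VibeAccumulator:
    """Hypothetical sketch: remember recent signals and derive a mood tag."""

    def __init__(self, window: int = 5) -> None:
        # Only the last `window` signals influence the mood (assumed value).
        self.recent = deque(maxlen=window)

    def record(self, signal: str) -> None:
        """Store one session signal, e.g. "tests-fail" or "release-ok"."""
        self.recent.append(signal)

    def tag(self) -> str:
        """Return an expressive tag, or an empty string for a neutral voice."""
        failures = sum(1 for s in self.recent if s == "tests-fail")
        if failures >= 3:  # assumed threshold for "a string of test failures"
            return "[weary]"
        if self.recent and self.recent[-1] == "release-ok":
            return "[excited]"
        return ""
```

A bounded window keeps the mood tied to what happened recently, so one old failure does not color the rest of the session.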
> /mute
Muted — chimes only.
Chimes are mood-aware: when a vibe is active, chimes pitch-shift to match (bright for happy sessions, dark for frustrated ones). Eight distinct signals (tests pass/fail, lint pass/fail, git push, merge conflict, done, prompt) × three mood variants = 24 chime assets.
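The 8 × 3 arithmetic above can be checked with a short sketch. The signal names follow the list in the paragraph; the mood labels (including a neutral third variant) and the file-naming scheme are assumptions for illustration, not vox's actual asset layout.

```python
from itertools import product

# Signals as listed above; spelling of the identifiers is assumed.
SIGNALS = [
    "tests-pass", "tests-fail", "lint-pass", "lint-fail",
    "git-push", "merge-conflict", "done", "prompt",
]
# "bright" and "dark" come from the text; "neutral" is an assumed third variant.
MOODS = ["neutral", "bright", "dark"]


def chime_assets() -> list:
    """Return one hypothetical asset name per (signal, mood) pair."""
    return [f"{signal}_{mood}.mp3" for signal, mood in product(SIGNALS, MOODS)]


def chime_for(signal: str, mood: str = "neutral") -> str:
    """Pick the asset for a signal, falling back to neutral for unknown moods."""
    if signal not in SIGNALS:
        raise ValueError(f"unknown signal: {signal}")
    if mood not in MOODS:
        mood = "neutral"
    return f"{signal}_{mood}.mp3"
```

Calling `len(chime_assets())` gives 24, matching the count stated above.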
| Command | Purpose |
|---|---|
| /vox y | Enable vox (chime notifications) |
| /vox n | Disable vox |
| /vox c | Continuous mode (spoken summaries on task completion) |
| /unmute | Enable voice mode (spoken notifications) |
| /unmute @matilda | Set session voice + enable voice |
| /unmute @ | Browse voice roster |
| /mute | Chimes only --- no voice |
| /recap | Spoken summary of Claude's last response |
| /vibe <mood> | Set session mood --- voice adapts to match |
| /vibe auto | Auto-detect mood from session signals (default) |
| /vibe off | Disable vibe --- neutral voice |
The full experience --- natural voice with expressive tags that respond to /vibe --- requires ElevenLabs. The other providers are fallbacks for environments where ElevenLabs isn't available.
| Provider | API Key | Default Voice | Best For |
|---|---|---|---|
| ElevenLabs | ELEVENLABS_API_KEY | matilda | Recommended. Natural voice, expressive tags via /vibe |
| OpenAI | OPENAI_API_KEY | nova | Fast notifications, low latency |
| AWS Polly | AWS credentials | joanna | Natural voice, cost-effective |
| macOS say | — | samantha | Zero-config on macOS, offline |
| espeak-ng | — | en | Zero-config on Linux, offline |
Auto-detection order: ElevenLabs > OpenAI > Polly (if AWS credentials valid) > say (macOS) / espeak (Linux).
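The detection order can be read as a simple fallback chain. The sketch below is an illustration only: the function name, the returned provider identifiers, and the `aws_credentials_valid` flag are assumptions, and how vox actually validates AWS credentials is not shown here. The environment variable names are the ones documented in this README.

```python
import os
import platform
from typing import Mapping, Optional


def detect_provider(
    env: Optional[Mapping[str, str]] = None,
    aws_credentials_valid: bool = False,
    system: Optional[str] = None,
) -> str:
    """Return a provider name following the documented priority order."""
    env = os.environ if env is None else env
    system = platform.system() if system is None else system

    forced = env.get("TTS_PROVIDER")  # an explicit override wins over detection
    if forced:
        return forced
    if env.get("ELEVENLABS_API_KEY"):
        return "elevenlabs"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if aws_credentials_valid:  # credential validation itself is out of scope
        return "polly"
    # Offline fallbacks: `say` on macOS, espeak elsewhere (Linux).
    return "say" if system == "Darwin" else "espeak"
```

For example, `detect_provider({"OPENAI_API_KEY": "x"}, aws_credentials_valid=True)` returns `"openai"`, because OpenAI ranks above Polly in the order.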
punt-vox is also a standalone TTS tool, independent of Claude Code.
vox unmute "Hello world" # Synthesize + play
vox record "Hello world" -o hello.mp3 # Synthesize + save
vox record --from segments.json # From JSON segments file
vox vibe excited # Set session mood
vox notify y # Enable notifications
vox notify c # Continuous spoken mode
vox speak n # Chimes only
vox voice matilda # Set session voice
vox status # Current state
vox version # Print version
vox doctor # Check setup
vox install # Install Claude Code plugin
vox mcp # Start MCP server (stdio)
vox serve # Start daemon (HTTP + WebSocket)
vox daemon install # Register as system service
vox daemon status # Check if daemon is running

| Variable | Description | Default |
|---|---|---|
| TTS_PROVIDER | Force a specific provider | auto-detect |
| TTS_MODEL | Model override | provider default |
| VOX_OUTPUT_DIR | Output directory | ~/vox-output |
- Mic API: unified unmute/record/vibe/who MCP tools with segment-based input
- Notification layer: /vox y|n|c, /mute, /unmute, /recap, Stop + Notification hooks
- Multi-provider TTS engine: ElevenLabs, AWS Polly, OpenAI, macOS say, Linux espeak-ng
- Claude Code plugin: marketplace install, MCP server, slash commands
- CLI: unmute, record, vibe, on/off, mute, version, status, doctor
- Ephemeral output mode (.vox/ in cwd)
- Two-channel display: ♪ panel summaries with voice/provider context
- Audio playback serialization via flock --- concurrent utterances queue instead of overlapping
- ElevenLabs streaming API for lower time-to-first-audio
- /vibe with auto, manual, and off modes --- ElevenLabs expressive tags color every utterance
- Auto-vibe signal accumulator: test pass/fail, lint, git ops feed mood detection
- Per-signal chime assets and vibe-driven chimes with mood-aware pitch shifting
- Daemon mode: single vox serve process with mcp-proxy, audio deduplication, launchd/systemd service management
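
The flock-based playback serialization in the list above can be sketched with Python's standard `fcntl` module. This is a minimal illustration under stated assumptions: the lock file path, the function names, and the `player` callback are hypothetical, and the README does not say whether vox uses the `flock` system call directly or the `flock` command-line tool.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock location; the real path used by vox is not documented here.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "vox-playback.lock")


@contextmanager
def playback_lock(path: str = LOCK_PATH):
    """Hold an exclusive advisory lock for the duration of one utterance."""
    with open(path, "w") as lock_file:
        # Blocks until any earlier holder releases, so callers queue up.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def play(utterance: str, player=print) -> None:
    """Play one utterance under the lock; `player` stands in for real audio."""
    with playback_lock():
        player(utterance)
```

Because `LOCK_EX` blocks rather than fails, two processes calling `play` at the same time take turns instead of talking over each other.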
| Feature | What It Does |
|---|---|
| Per-session voices | Each Claude Code session gets its own voice from a pool --- no more five matildas talking at once. /voice to audition and pick. |
Architecture (PDF) | Design Log | Testing | Changelog
uv sync --all-extras # Install dependencies
make check # Run all quality gates

MIT