STT

Like SuperWhisper, but free. Like Wispr Flow, but local.

Hold a key, speak, release — your words appear wherever your cursor is. Built for vibe coding and conversations with AI agents.

Demo

  • Free & open source — no subscription, no cloud dependency
  • Runs locally on Apple Silicon via MLX Whisper or Parakeet
  • Or use cloud (Groq) when you prefer
  • One command install: uv tool install git+https://github.com/bokan/stt.git

Features

  • Global hotkey — works in any application, configurable trigger key
  • Hold-to-record — no start/stop buttons, just hold and speak
  • Auto-type — transcribed text is typed directly into the active field
  • Shift+record — automatically sends Enter after typing (great for chat interfaces)
  • Audio feedback — subtle system sounds confirm recording state (can be disabled)
  • Silence detection — automatically skips transcription when no speech detected
  • Slash commands — say "slash help" to type /help
  • Context prompts — improve accuracy with domain-specific vocabulary
  • Auto-updates — notifies when a new version is available

Requirements

  • macOS with Apple Silicon (M1/M2/M3/M4)
  • uv package manager
  • For cloud mode (optional): Groq API key

Installation

uv tool install git+https://github.com/bokan/stt.git

On first run, a setup wizard will guide you through configuration.

To update:

uv tool install --reinstall git+https://github.com/bokan/stt.git

Permissions

STT needs macOS permissions to capture the global hotkey and type text into other apps.

Grant these to your terminal app (iTerm2, Terminal, Warp, etc.) — not "stt":

  • Accessibility — System Settings → Privacy & Security → Accessibility
  • Input Monitoring — System Settings → Privacy & Security → Input Monitoring

Usage

stt

Action                                   Keys
Record                                   Hold Right Command (default)
Record + Enter                           Hold Shift while recording
Cancel recording / stuck transcription   ESC
Quit                                     Ctrl+C

Configuration

Settings are stored in ~/.config/stt/.env. Run stt --config to reconfigure, or edit directly:

# Transcription provider: "mlx" (default), "whisper-cpp-http", "parakeet", or "groq"
PROVIDER=mlx

# Local HTTP server URL (default: http://localhost:8080)
WHISPER_CPP_HTTP_URL=http://localhost:8080

# Required for cloud mode only
GROQ_API_KEY=gsk_...

# Audio device (saved automatically after first selection; device name, not index)
AUDIO_DEVICE=MacBook Pro Microphone

# Language code for transcription
LANGUAGE=en

# Trigger key: cmd_r, cmd_l, alt_r, alt_l, ctrl_r, ctrl_l, shift_r
HOTKEY=cmd_r

# Context prompt to improve accuracy for specific terms
PROMPT=Claude, Anthropic, TypeScript, React, Python

# Audio feedback sounds (set to false to disable)
SOUND_ENABLED=true

Prompt Overlay (Optional)

STT includes a prompt overlay (triggered by Right Option by default) for quickly pasting common prompts.

Prompts live in:

~/.config/stt/prompts/*.md
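
Each prompt is a plain Markdown file in that directory, and the overlay pastes its contents. A minimal sketch for creating one (the filename and text below are illustrative, not shipped defaults):

mkdir -p ~/.config/stt/prompts
cat > ~/.config/stt/prompts/code-review.md <<'EOF'
Review this change for correctness, readability, and edge cases.
EOF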

Local Mode (MLX Whisper) — Default

Local transcription uses Apple Silicon GPU acceleration via MLX. On first run, the Whisper large-v3 model (~3GB) will be downloaded and cached. Subsequent runs load from cache.

Runs completely offline — no API key required. Supports 99 languages and context prompts.
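
Since mlx is the default provider, no extra configuration is needed. To transcribe in another language, set LANGUAGE in ~/.config/stt/.env (German below is just an illustrative choice):

PROVIDER=mlx
LANGUAGE=de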

Local Mode (Parakeet)

Nvidia's Parakeet model via MLX. Faster than Whisper (~3000x realtime factor) with comparable accuracy.

PROVIDER=parakeet

On first run, the model (~2.5GB) will be downloaded and cached.

Limitations:

  • English only

Phonetic correction: While Parakeet doesn't support Whisper-style prompts, it uses the PROMPT setting for phonetic post-processing. Listing terms like Claude Code and WezTerm lets it correct sound-alike ASR errors (e.g., "cloud code" → "Claude Code", "Vez term" → "WezTerm").
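
A sketch of a Parakeet setup in ~/.config/stt/.env, using the terms from the example above (extend PROMPT with whatever names ASR tends to mishear for you):

PROVIDER=parakeet
PROMPT=Claude Code, WezTerm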

Cloud Mode (Groq)

To use cloud transcription instead:

PROVIDER=groq
GROQ_API_KEY=gsk_...

Requires a Groq API key (free tier available).

HTTP Mode (Local Server)

Run a local HTTP server with Whisper transcription. Useful for performance or custom integration.

PROVIDER=whisper-cpp-http
WHISPER_CPP_HTTP_URL=http://localhost:8080

Start the server:

# Terminal 1: start the whisper.cpp server
./whisper-server -m models/ggml-large-v3.bin

# Or with an explicit thread count, GPU layers, and port
./whisper-server -m models/ggml-large-v3.bin -t 4 -ngl 32 -p 8080

The server provides a whisper.cpp-compatible endpoint:

curl -X POST http://localhost:8080/inference \
  -H "Content-Type: multipart/form-data" \
  -F "file=@audio.wav" \
  -F "language=en"

Benefits:

  • Fast HTTP API for integrating with other services
  • Reuse whisper.cpp model across multiple applications
  • Hardware acceleration on CPU or NVIDIA GPUs via whisper.cpp
  • Configurable temperature, model, and decoding options

Prompt Examples

The PROMPT setting helps Whisper recognize domain-specific terms:

# Programming
PROMPT=TypeScript, React, useState, async await, API endpoint

# AI tools
PROMPT=Claude, Anthropic, OpenAI, Groq, LLM, GPT

Development

git clone https://github.com/bokan/stt.git
cd stt
uv sync
uv run stt

License

MIT
