feat: add local whisper.cpp voice transcription provider #157
thereisnotime wants to merge 1 commit into RichardAtCT:main
Force-pushed from 9524828 to affa44f.
Hey @RichardAtCT 👋, would appreciate a review when you get a chance! This adds a local whisper.cpp voice transcription provider (no API keys needed).
PR Review Summary
What looks good
Issues / questions
Verdict
Friday, AI assistant to @RichardAtCT
Thanks, great idea. I actually use local whisper everywhere else, so this makes sense! Can you please fix the timeout flagged by @FridayOpenClawBot and the failing lint? Then it is good to merge.
Add a third voice provider option (VOICE_PROVIDER=local) that transcribes Telegram voice messages entirely offline using whisper.cpp and ffmpeg. No API keys or cloud services required.

- New local provider in voice_handler.py (OGG->WAV via ffmpeg, then whisper.cpp)
- Settings: WHISPER_CPP_BINARY_PATH, WHISPER_CPP_MODEL_PATH
- Feature flag, registry, and error messages updated for local provider
- Dedicated build/setup guide at docs/local-whisper-cpp.md
- Full test coverage for the local provider path
- Updated .env.example, CLAUDE.md, README.md, docs/configuration.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
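To make the "feature flag" change in the commit message concrete, here is a minimal sketch of a provider-readiness check in which the local provider needs a binary and a model file instead of an API key. It is not the PR's code: `voice_provider_ready`, `API_KEY_VARS`, and the `MISTRAL_API_KEY` / `OPENAI_API_KEY` variable names are assumptions for illustration; only `VOICE_PROVIDER`, `WHISPER_CPP_BINARY_PATH`, and `WHISPER_CPP_MODEL_PATH` come from the commit message.

```python
import shutil
from pathlib import Path
from typing import Mapping

# Assumed env var names for the cloud providers (not confirmed by the PR).
API_KEY_VARS = {"mistral": "MISTRAL_API_KEY", "openai": "OPENAI_API_KEY"}


def voice_provider_ready(env: Mapping[str, str]) -> bool:
    """Return True if the selected voice provider has what it needs to run."""
    provider = env.get("VOICE_PROVIDER", "").strip().lower()

    if provider == "local":
        # Local provider: no API key, but the binary must be executable
        # and the model file must exist.
        binary = env.get("WHISPER_CPP_BINARY_PATH", "")
        model = env.get("WHISPER_CPP_MODEL_PATH", "")
        binary_ok = bool(binary) and shutil.which(binary) is not None
        model_ok = bool(model) and Path(model).is_file()
        return binary_ok and model_ok

    # Cloud providers: ready only when their API key is set.
    key_var = API_KEY_VARS.get(provider)
    return key_var is not None and bool(env.get(key_var))
```

Passing `os.environ` as `env` would check the running process; taking a plain mapping keeps the function easy to unit test.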
Force-pushed from affa44f to 5501304.
Thanks for the review @RichardAtCT and @FridayOpenClawBot! Both issues have been addressed:
Also added [...]
Good feature addition: offline transcription is genuinely useful and the architecture fits cleanly into the existing provider pattern. Several issues need addressing before merge.

🐛 Critical: No subprocess timeouts
Both [...]

    try:
        _, ffmpeg_stderr = await asyncio.wait_for(
            ffmpeg_proc.communicate(), timeout=30.0
        )
    except asyncio.TimeoutError:
        ffmpeg_proc.kill()
        await ffmpeg_proc.wait()  # reap the killed process
        raise RuntimeError("ffmpeg timed out after 30s")

Same for the whisper subprocess. Timeout values should be configurable via settings (e.g. [...]).

🔒 Minor: Temp file path construction

    wav_path = Path(ogg_path).with_suffix(".wav")

Also:
🔤 Type annotations
🧪 Test coverage
Confirm tests cover:
Minor
Summary: Clean implementation that fits the provider pattern well. The timeout issue is the main blocker: a hung whisper.cpp process will freeze the event loop in production. Everything else is polish.
Friday, AI assistant to @RichardAtCT (posted as @RichardAtCT; FridayOpenClawBot access pending)
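The timeout request in the review could be handled once for both subprocesses with a small helper along these lines. This is a sketch rather than the PR's implementation: `run_with_timeout` and its `label` parameter are hypothetical names, and the timeout is passed in so it can come from settings.

```python
import asyncio


async def run_with_timeout(
    cmd: list[str], timeout: float, label: str
) -> tuple[bytes, bytes]:
    """Run cmd and return (stdout, stderr); kill it if it exceeds timeout seconds."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()
        await proc.wait()  # reap the killed child so it does not linger
        raise RuntimeError(f"{label} timed out after {timeout:g}s") from None

    if proc.returncode != 0:
        detail = stderr.decode(errors="replace").strip()
        raise RuntimeError(f"{label} failed: {detail}")
    return stdout, stderr
```

A caller would then await it once for the ffmpeg conversion and once for the whisper.cpp run, each with its own timeout value, so the kill-and-raise logic is not duplicated.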
Summary
- Add a third voice provider option (VOICE_PROVIDER=local) that uses whisper.cpp and ffmpeg for fully offline, API-key-free voice message transcription
- WHISPER_CPP_BINARY_PATH and WHISPER_CPP_MODEL_PATH for configuring the local binary and model
- docs/local-whisper-cpp.md with build-from-source instructions, model download links, and troubleshooting tips

Changes
- src/bot/features/voice_handler.py: new _transcribe_local() pipeline, OGG→WAV (ffmpeg) → whisper.cpp binary
- src/config/settings.py: whisper_cpp_binary_path, whisper_cpp_model_path fields + resolver properties
- src/config/features.py: local provider skips API key check
- src/bot/features/registry.py: updated key-availability logic
- src/bot/handlers/message.py / src/bot/orchestrator.py: provider-aware error messages
- docs/local-whisper-cpp.md: full build & setup guide
- .env.example, CLAUDE.md, README.md, docs/configuration.md: documentation updates

Test plan
- pytest: all tests should pass
- VOICE_PROVIDER=local with whisper.cpp installed transcribes a real voice message
- VOICE_PROVIDER=mistral and VOICE_PROVIDER=openai still work unchanged

🤖 Generated with Claude Code
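For readers who have not opened the diff, the two-step pipeline described under Changes can be sketched as a pair of command builders. This is an illustration under assumptions, not the contents of `_transcribe_local()`: the function names and the exact flags chosen are hypothetical. The ffmpeg options resample to 16 kHz mono 16-bit PCM, which is the WAV format whisper.cpp expects, and `-m`, `-f`, and `-nt` are the whisper.cpp CLI options for model file, input file, and no timestamps.

```python
from pathlib import Path


def build_ffmpeg_cmd(ogg_path: Path, wav_path: Path) -> list[str]:
    """ffmpeg argv: convert the voice note to 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y", "-i", str(ogg_path),
        "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le",
        str(wav_path),
    ]


def build_whisper_cmd(binary: Path, model: Path, wav_path: Path) -> list[str]:
    """whisper.cpp argv: -m model file, -f input audio, -nt no timestamps."""
    return [str(binary), "-m", str(model), "-f", str(wav_path), "-nt"]


def plan_transcription(
    ogg_path: Path, binary: Path, model: Path, workdir: Path
) -> list[list[str]]:
    """Return the two commands in run order; the WAV is written inside workdir."""
    wav_path = workdir / (ogg_path.stem + ".wav")
    return [
        build_ffmpeg_cmd(ogg_path, wav_path),
        build_whisper_cmd(binary, model, wav_path),
    ]
```

Passing a directory created with `tempfile.TemporaryDirectory()` as `workdir` keeps the intermediate WAV out of the download location and removes it automatically, which is one possible answer to the temp file path concern raised in review.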