dublin74/stt-tts-loop


stt-tts-loop

Local speech loop for microphone or file input:

  • record audio
  • transcribe with faster-whisper
  • synthesize with piper
  • play the result locally
  • append per-turn benchmark rows to JSONL

Requirements

  • Python 3.10+
  • a working piper executable
  • a local Piper voice model (.onnx)

faster-whisper does not require a system ffmpeg install; it decodes audio through PyAV, which bundles the FFmpeg libraries.

Setup

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env

Set these in .env:

  • PIPER_BIN=.venv/bin/piper or another valid path to the Piper binary
  • PIPER_MODEL_PATH=model/voice.onnx or /absolute/path/to/voice.onnx

Optional settings:

  • PIPER_CONFIG_PATH
  • SAMPLE_RATE
  • FRAME_DURATION_MS
  • MAX_SECONDS
  • SILENCE_THRESHOLD
  • SILENCE_DURATION_SECONDS
  • AUDIO_INPUT_DEVICE
  • BENCHMARK_JSONL
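The settings above can be read into a typed object along these lines. This is a minimal sketch, not the project's loader: `LoopSettings` and `load_settings` are hypothetical names, the fallback values are placeholders borrowed from the override example in this README rather than the real defaults, and treating `AUDIO_INPUT_DEVICE` as an integer index is an assumption. It also assumes the `.env` values have already been exported into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LoopSettings:
    """Hypothetical container for a subset of the .env settings."""
    piper_bin: str
    piper_model_path: str
    frame_duration_ms: int
    max_seconds: float
    silence_threshold: float
    silence_duration_seconds: float
    audio_input_device: Optional[int]


def load_settings(env: Mapping[str, str] = os.environ) -> LoopSettings:
    """Build settings from an environment-like mapping.

    PIPER_BIN and PIPER_MODEL_PATH are required (KeyError if missing).
    The fallbacks below are illustrative, not the project's defaults.
    """
    device = env.get("AUDIO_INPUT_DEVICE", "").strip()
    return LoopSettings(
        piper_bin=env["PIPER_BIN"],
        piper_model_path=env["PIPER_MODEL_PATH"],
        frame_duration_ms=int(env.get("FRAME_DURATION_MS", "100")),
        max_seconds=float(env.get("MAX_SECONDS", "10")),
        silence_threshold=float(env.get("SILENCE_THRESHOLD", "0.01")),
        silence_duration_seconds=float(env.get("SILENCE_DURATION_SECONDS", "1.0")),
        # Assumption: the device is given as an integer index, as with --input-device.
        audio_input_device=int(device) if device else None,
    )
```

Passing a plain dictionary instead of `os.environ` makes it easy to try different values without touching the shell.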

Run

Start the interactive loop:

python -m stt_tts_loop

Useful commands:

python -m stt_tts_loop --list-devices
python -m stt_tts_loop --input-audio recordings/sample.wav
python -m stt_tts_loop --compare-stt-models base.en small.en
python -m stt_tts_loop --list-voices
python -m stt_tts_loop --download-voice en_US-libritts-high --voice-download-dir model

Example with explicit overrides:

python -m stt_tts_loop \
  --voice-model model/voice.onnx \
  --input-device 0 \
  --frame-duration-ms 100 \
  --max-seconds 10 \
  --silence-threshold 0.01 \
  --silence-duration 1.0 \
  --save-recording recordings

Behavior

  • Press Enter to start a turn.
  • Recording stops on trailing silence or max duration.
  • --compare-stt-models is STT-only and skips TTS playback.
  • MAX_SECONDS is a per-turn capture limit.
  • Benchmark rows are appended to benchmarks/local_runs.jsonl unless overridden.
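The stop rule ("trailing silence or max duration") can be illustrated with a small sketch. It is one plausible reading of the behavior, not the project's code: `capture_until_silence` and `rms` are hypothetical helpers, frames are assumed to be sequences of float samples in the range -1.0 to 1.0, loudness is assumed to be measured as RMS, and silence before the first loud frame is assumed not to count as trailing silence.

```python
import math
from typing import Iterable, List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of float samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def capture_until_silence(
    frames: Iterable[Sequence[float]],
    frame_duration_ms: int = 100,
    max_seconds: float = 10.0,
    silence_threshold: float = 0.01,
    silence_duration: float = 1.0,
) -> List[Sequence[float]]:
    """Collect frames until trailing silence or the per-turn limit is hit."""
    kept: List[Sequence[float]] = []
    silent_frames = 0
    heard_speech = False
    for frame in frames:
        kept.append(frame)
        if rms(frame) >= silence_threshold:
            heard_speech = True
            silent_frames = 0  # any loud frame resets the silence run
        elif heard_speech:
            silent_frames += 1
        # Compare in milliseconds to avoid accumulating float error.
        if heard_speech and silent_frames * frame_duration_ms >= silence_duration * 1000:
            break
        if len(kept) * frame_duration_ms >= max_seconds * 1000:
            break
    return kept
```

With the values from the override example (100 ms frames, a 1.0 s silence window), a turn ends ten quiet frames after the last loud one, or after 100 frames in total when the speaker never pauses.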

Troubleshooting

  • No microphone input:
    • grant microphone permission to the terminal app
    • run python -m stt_tts_loop --list-devices
    • retry with --input-device <index>
  • Piper executable not found:
    • install piper-tts in the active environment or point PIPER_BIN to the binary
  • Piper model not found:
    • set PIPER_MODEL_PATH to a valid local .onnx file
  • Empty or near-silent capture:
    • increase mic input level
    • lower SILENCE_THRESHOLD
    • increase MAX_SECONDS
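When tuning SILENCE_THRESHOLD, it can help to measure how loud a saved recording actually is. The sketch below uses only the standard library; `wav_rms` is a hypothetical helper, it handles 16-bit PCM WAV files only, and it assumes the threshold is compared against RMS amplitude normalized to the range 0.0 to 1.0, which this README does not confirm.

```python
import math
import struct
import wave


def wav_rms(path: str) -> float:
    """Return the RMS amplitude of a 16-bit PCM WAV, normalized to 0.0-1.0."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = wav.readframes(wav.getnframes())
    count = len(raw) // 2
    if count == 0:
        return 0.0
    # Little-endian signed 16-bit samples; channels are pooled together.
    samples = struct.unpack(f"<{count}h", raw[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count) / 32768.0
```

If a recording saved with `--save-recording` reports a level close to or below the configured threshold, the capture was probably treated as silence, which points to raising the mic level or lowering the threshold.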

About

Local CLI for benchmarking STT → TTS pipelines using faster-whisper and Piper
