Text-to-speech using KittenTTS.
Requires uv. No global Python packages needed.
uv sync

This installs Python 3.12 (if needed) and all dependencies into a local .venv/.
uv run python main.py tts input.txt
uv run python main.py tts input.txt -o episode1
uv run python main.py tts input.txt -v Luna -o episode1
uv run python main.py tts input.txt -s 1.2 # speak faster
uv run python main.py tts input.txt -s 0.8 # speak slower

Output is MP3.
uv run python main.py new-podcast "My Podcast" -d "A podcast about things"
uv run python main.py new-podcast "My Podcast" -o podcasts/feed.xml
uv run python main.py new-podcast "My Podcast" --force # overwrite existing

Creates an RSS 2.0 feed template (feed.xml by default). Edit it to fill in your podcast details (link, image, category, etc.).
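For orientation, here is a minimal sketch of what writing such a template could look like. It is not the code in main.py: the function name `write_feed_template`, the placeholder link, and the choice to refuse overwriting unless `force` is set are assumptions made for illustration, using only the standard library.

```python
import xml.etree.ElementTree as ET


def write_feed_template(path, title, description="", force=False):
    """Write a bare RSS 2.0 skeleton to path (hypothetical helper)."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    # Placeholder value; the real link is something you fill in afterwards.
    ET.SubElement(channel, "link").text = "https://example.com/"
    ET.SubElement(channel, "description").text = description
    ET.indent(rss)  # pretty-print (Python 3.9+)

    # "xb" fails with FileExistsError if the file is already there,
    # which mirrors the assumed behaviour of running without --force.
    mode = "wb" if force else "xb"
    with open(path, mode) as fh:
        ET.ElementTree(rss).write(fh, encoding="utf-8", xml_declaration=True)
```

Opening the file in exclusive mode keeps the overwrite check and the write in one step, so an existing feed cannot be clobbered between a separate "does it exist" test and the write.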
Creates an mp3 from the input text file and adds the result to the podcast feed as an episode.
uv run python main.py tts 2026-04-06.txt -f feed.xml -t "April 6 News"

The -f flag appends the generated audio as a new episode to the feed file. -t sets the episode title (defaults to the output filename).
When using -f:
- MP3 and transcript files are placed in the same directory as the feed
- The episode date is extracted from the input filename (YYYY-MM-DD.txt)
- The episode description is the text before the first --- in the input file
- A WebVTT transcript (.vtt) is generated next to the mp3 with timestamps distributed proportionally (by sentence length) across the audio duration
- A <podcast:transcript> tag (Podcasting 2.0) links to the .vtt file with type="text/vtt"
- Running again for the same date replaces the existing episode
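The conventions in the list above can be sketched roughly as follows. This is an illustration, not the project's actual code: the helper names, the regex-based sentence splitter, the requirement that --- sits on its own line, and the empty description when no --- is present are all assumptions.

```python
import re
from datetime import date
from pathlib import Path


def episode_date(input_path: str) -> date:
    """Parse the episode date from a filename such as 2026-04-06.txt."""
    return date.fromisoformat(Path(input_path).stem)


def split_description(text: str) -> tuple[str, str]:
    """Return (description, body), split at the first line that is exactly ---."""
    head, sep, body = text.partition("\n---\n")
    if not sep:  # assumption: no separator means no description
        return "", text.strip()
    return head.strip(), body.strip()


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_vtt(body: str, duration: float) -> str:
    """Give each sentence a share of the duration proportional to its length."""
    # Naive splitter: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", body.strip()) if s]
    total_chars = sum(len(s) for s in sentences)
    lines = ["WEBVTT", ""]
    elapsed = 0
    for sentence in sentences:
        start = duration * elapsed / total_chars
        elapsed += len(sentence)
        end = duration * elapsed / total_chars
        lines += [f"{format_timestamp(start)} --> {format_timestamp(end)}", sentence, ""]
    return "\n".join(lines)
```

Because the timestamps come from character counts rather than from the audio itself, cues will drift wherever the voice pauses or speeds up; the result is an approximation suitable for following along, not for word-accurate captions.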
If you already have an mp3 and transcript and just want to add it to the feed without re-running TTS:
uv run python main.py add-episode feed.xml episode.mp3 2026-04-07.txt
uv run python main.py add-episode feed.xml episode.mp3 2026-04-07.txt -t "April 7 News"

The mp3 and transcript files are copied into the feed's directory (overwriting if they already exist), and the episode is added/replaced in the feed.
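As a rough picture of what "added/replaced" might mean at the XML level, the sketch below inserts an item into a channel and first drops any item for the same date. How main.py actually identifies an existing episode is not documented here; using the ISO date as the guid, the `base_url` parameter, and the function name `upsert_episode` are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

# Podcasting 2.0 namespace, needed for the <podcast:transcript> tag.
PODCAST_NS = "https://podcastindex.org/namespace/1.0"
ET.register_namespace("podcast", PODCAST_NS)


def upsert_episode(channel, iso_date, title, base_url, mp3_name, mp3_bytes, vtt_name):
    """Add an episode to channel, replacing any item with the same date (hypothetical)."""
    # Assumption: the guid is the ISO date, so one episode per date.
    for old in channel.findall("item"):
        if old.findtext("guid") == iso_date:
            channel.remove(old)

    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "guid", {"isPermaLink": "false"}).text = iso_date
    published = datetime.fromisoformat(iso_date).replace(tzinfo=timezone.utc)
    ET.SubElement(item, "pubDate").text = format_datetime(published)  # RFC 2822 date
    ET.SubElement(item, "enclosure", {
        "url": base_url + mp3_name,
        "length": str(mp3_bytes),  # file size in bytes
        "type": "audio/mpeg",
    })
    ET.SubElement(item, f"{{{PODCAST_NS}}}transcript", {
        "url": base_url + vtt_name,
        "type": "text/vtt",
    })
    return item
```

Removing the old item before appending the new one keeps the operation idempotent: running it twice for the same date leaves exactly one episode in the feed.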
Assumes the rclone tool is installed.
List the contents of a bucket called mycast (assumes an rclone remote named r2 is configured):
rclone ls r2:mycast
Upload entire output directory contents:
rclone copy ./output r2:mycast
| Voice | Gender |
|---|---|
| Bella | Female (default) |
| Jasper | Male |
| Luna | Female |
| Bruno | Male |
| Rosie | Female |
| Hugo | Male |
| Kiki | Female |
| Leo | Male |
- The TTS model (~25MB) is downloaded from HuggingFace on first run and cached at ~/.cache/huggingface/hub/. Subsequent runs use the cache offline, with no network requests.
- MP3 output is encoded at 320kbps CBR via ffmpeg.
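For reference, a 320 kbps constant-bitrate MP3 encode with ffmpeg typically looks like the sketch below. The exact arguments main.py passes are not shown in this README, so treat the helper name `mp3_encode_command` and the intermediate WAV file as assumptions; the ffmpeg options themselves (`-codec:a libmp3lame`, `-b:a`) are standard.

```python
import subprocess


def mp3_encode_command(src: str, dst: str, bitrate: str = "320k") -> list[str]:
    """Build an ffmpeg command line for a constant-bitrate MP3 encode."""
    return [
        "ffmpeg",
        "-y",                      # overwrite dst without prompting
        "-i", src,                 # input audio, e.g. a WAV file
        "-codec:a", "libmp3lame",  # LAME MP3 encoder
        "-b:a", bitrate,           # fixed audio bitrate (CBR with libmp3lame)
        dst,
    ]


def encode_mp3(src: str, dst: str) -> None:
    """Run the encode; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(mp3_encode_command(src, dst), check=True)
```

Calling `encode_mp3("episode1.wav", "episode1.mp3")` requires ffmpeg to be on your PATH.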