This project provides tools to transcribe podcasts to text using speech-to-text technology and generate summaries of the content. It builds on existing YouTube summarization functionality but extends it to work with podcast audio.
- Python 3.6+
- CUDA-compatible GPU (optional, but recommended for faster transcription)
- FFmpeg (required for audio processing)
- All Python dependencies are listed in requirements.txt
- Clone this repository
- Create a Python virtual environment (recommended):
python3 -m venv whisper_env
source whisper_env/bin/activate  # On Windows: whisper_env\Scripts\activate
- Install all dependencies:
pip install -r requirements.txt
The installation above includes PyTorch with CUDA support; Whisper will automatically use your GPU(s) for transcription if one is available. This can speed up transcription by 5-10x compared to CPU-only processing.
To verify GPU support is working:
python -c "import torch; print('CUDA available:', torch.cuda.is_available()); print('GPU count:', torch.cuda.device_count())"
This should output CUDA available: True if your GPU is properly detected.
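The device selection that Whisper relies on can be sketched in a few lines; this is an illustration of the fallback logic, not code from the scripts themselves:

```python
# Sketch: pick the device transcription will run on.
# Falls back to CPU when PyTorch is missing or CUDA is unavailable.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:
    device = "cpu"

print(f"Transcription device: {device}")
```

Whisper's `load_model` accepts a `device` argument, so a value computed this way can be passed straight through when loading a model.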
A bash script that downloads YouTube subtitles and generates a summary using the Fabric tool.
# Basic usage
./yt-summerize.sh <youtube-url>
# Example
./yt-summerize.sh https://www.youtube.com/watch?v=wYb3Wimn01s
The script will:
- Download subtitles from the provided YouTube URL
- Clean the subtitles (remove timestamps and indices)
- Use Fabric to generate a summary
- Display the summary and save it to summary.txt
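The cleaning step above (stripping cue indices and timestamps from downloaded subtitles) can be sketched in Python; this assumes standard SRT formatting and is an illustration, not the script's exact implementation:

```python
import re

def clean_srt(subtitles: str) -> str:
    """Strip SRT cue indices and timestamp lines, keeping only spoken text."""
    cleaned = []
    for line in subtitles.splitlines():
        line = line.strip()
        if not line or line.isdigit():  # skip blank lines and cue indices
            continue
        if re.match(r"\d{2}:\d{2}:\d{2},\d{3} --> ", line):  # skip timestamps
            continue
        cleaned.append(line)
    return " ".join(cleaned)

sample = ("1\n00:00:01,000 --> 00:00:04,000\nHello and welcome.\n\n"
          "2\n00:00:04,500 --> 00:00:07,000\nToday we talk about AI.\n")
print(clean_srt(sample))  # Hello and welcome. Today we talk about AI.
```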
A script that uses the Fabric tool to process YouTube videos and generate summaries in markdown format.
# Basic usage
./fabric.sh <youtube-url>
# Example
./fabric.sh https://www.youtube.com/watch?v=wYb3Wimn01s
Similar to fabric.sh but uses the Llama 3 8B model for summarization.
# Basic usage
./fabric2.sh <youtube-url>
# Example
./fabric2.sh https://www.youtube.com/watch?v=wYb3Wimn01s
A bash-only script that downloads YouTube subtitles and generates a summary without requiring external tools.
# Basic usage
./ytr.sh <youtube-url>
# Example
./ytr.sh https://www.youtube.com/watch?v=wYb3Wimn01s
A simple bash script that downloads podcast audio, transcribes it, and generates a summary.
# Basic usage
./podcast-transcribe.sh <podcast-url>
# Example
./podcast-transcribe.sh https://example.com/podcast-episode
The script will:
- Download the audio from the provided URL
- Transcribe the audio using Whisper (base model)
- Generate a summary using one of three methods:
- Fabric (if available)
- Python summarizer
- Simple bash-based summarization
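The three-way fallback above amounts to an availability check; a minimal sketch of that preference order (using `shutil.which` and an import probe as stand-ins for whatever test the script actually performs):

```python
import shutil

def pick_summarizer() -> str:
    """Return the first available summarization backend, in preference order."""
    if shutil.which("fabric"):    # Fabric CLI found on PATH
        return "fabric"
    try:
        import nltk               # stand-in for the Python summarizer's dependency
        return "python"
    except ImportError:
        return "bash"             # simple bash-based fallback always works
```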
All files are saved in the podcast_output directory.
A more flexible Python script with additional options.
# Basic usage
./podcast_transcriber.py <podcast-url>
# With options
./podcast_transcriber.py <podcast-url> --output-dir custom_output --model medium --sentences 15
Options:
- --output-dir: Custom output directory (default: podcast_output)
- --model: Whisper model size (tiny, base, small, medium, large)
- --sentences: Number of sentences in the summary (default: 10)
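These options map to a straightforward argparse setup; a sketch with the defaults listed above (the base default for --model is an assumption, since the README only states defaults for --output-dir and --sentences):

```python
import argparse

parser = argparse.ArgumentParser(description="Transcribe and summarize a podcast.")
parser.add_argument("url", help="Podcast or video URL")
parser.add_argument("--output-dir", default="podcast_output",
                    help="Custom output directory")
parser.add_argument("--model", default="base",  # assumed default
                    choices=["tiny", "base", "small", "medium", "large"],
                    help="Whisper model size")
parser.add_argument("--sentences", type=int, default=10,
                    help="Number of sentences in the summary")

args = parser.parse_args(["https://example.com/podcast-episode",
                          "--model", "medium", "--sentences", "15"])
print(args.output_dir, args.model, args.sentences)  # podcast_output medium 15
```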
The Python script provides:
- Better error handling
- More customization options
- Metadata tracking
- Higher quality transcription with larger models
Both scripts generate the following files:
- podcast_audio.mp3: The downloaded audio file
- transcript.txt: The full transcript of the podcast
- summary.txt: A summary of the podcast content
The Python script also generates:
- metadata.json: Information about the processing
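A hypothetical example of what metadata.json might record (the field names here are illustrative, not the script's actual schema):

```python
import json
import os
import tempfile
import time

# Illustrative metadata fields -- the real script's schema may differ.
metadata = {
    "source_url": "https://example.com/podcast-episode",
    "model": "base",
    "sentences": 10,
    "processed_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
}

path = os.path.join(tempfile.mkdtemp(), "metadata.json")
with open(path, "w") as f:
    json.dump(metadata, f, indent=2)
```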
- Download: Uses yt-dlp to download audio from the provided URL
- Transcription: Uses OpenAI's Whisper model to convert speech to text
- Summarization: Uses extractive summarization to identify the most important sentences
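Extractive summarization scores each sentence and keeps the top-ranked ones in their original order; a minimal word-frequency sketch of the idea (not the script's exact algorithm):

```python
import re
from collections import Counter

def summarize(text: str, n_sentences: int = 2) -> str:
    """Score sentences by total word frequency; return the top n in original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    # Rank sentence indices by the summed frequency of their words.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: -sum(freq[w] for w in
                                       re.findall(r"[a-z']+", sentences[i].lower())))
    keep = sorted(ranked[:n_sentences])  # restore original document order
    return " ".join(sentences[i] for i in keep)
```

A real extractive summarizer would normally also down-weight stopwords and normalize by sentence length; this sketch omits both for brevity.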
- If you encounter issues with downloading, make sure yt-dlp is installed and up to date
- For transcription issues, try using a smaller Whisper model (tiny or base)
- If summarization produces poor results, try adjusting the number of sentences
- If Whisper is running slowly, check GPU usage with nvidia-smi to ensure it's utilizing your GPU
- If your GPU is not being used:
- Verify PyTorch is installed with CUDA support:
python -c "import torch; print(torch.cuda.is_available())"
- Ensure you have enough free GPU memory
- Try reinstalling PyTorch with the correct CUDA version for your system from https://pytorch.org/get-started/locally/
- If you're using the Fabric-based scripts (yt-summerize.sh, fabric.sh, or fabric2.sh), make sure you have the Fabric CLI tool installed
- You can install Fabric using:
pip install fabric-cli
- For more information about Fabric, visit the official documentation
MIT
For testing purposes, we've provided a list of AI-focused podcast episodes in the ai_podcast_samples.md file. These samples include YouTube links to Lex Fridman Podcast episodes and TED Talks on AI that can be used to test the transcription and summarization tools.
Example:
# Using the bash script with a short TED Talk (recommended for initial testing)
./podcast-transcribe.sh https://www.youtube.com/watch?v=wYb3Wimn01s
# Using the Python script with a medium-sized model
./podcast_transcriber.py https://www.youtube.com/watch?v=wYb3Wimn01s --model medium
The sample podcasts include both short talks (5-15 minutes) for quick testing and longer episodes (1+ hours) for more comprehensive testing.