Extract text and transcribe audio from PowerPoint presentations, MP4 videos, and MP3 files using Whisper.
- Text Extraction: Extracts all text content from PowerPoint slides
- Audio Transcription: Uses faster-whisper to transcribe audio from multiple sources:
  - PowerPoint files (.pptx) - embedded audio recordings
  - Video files (.mp4) - audio track extraction
  - Audio files (.mp3) - direct transcription
- Checkpoint/Resume Support: Automatically saves progress during long transcriptions
  - Resume from where you left off if interrupted
  - Checkpoints saved every 10 segments
  - Safe to stop with Ctrl+C anytime
- Live Progress Tracking: Real-time progress bar and live checkpoint file
  - View transcription progress in `output/[filename]_checkpoint.json`
  - See completed segments as they're processed
  - Track timestamp and text in real-time
- GPU and CPU Support: Automatic device detection with intelligent fallback
- Multiple Models: Supports various Whisper model sizes (tiny, base, small, medium, large)
- Configurable: Easy-to-modify settings for performance and quality tuning
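
The checkpoint/resume behaviour listed above can be pictured with a small sketch. This is not the code in main.py: the function names, the JSON layout (`{"segments": [...]}`), and the skip-by-end-timestamp resume rule are all assumptions made for illustration. Only the "save every 10 segments" interval and the idea of saving on Ctrl+C come from the feature list.

```python
import json
import os
import tempfile

CHECKPOINT_EVERY = 10  # interval taken from the feature list above


def load_checkpoint(path):
    """Return previously saved segments, or an empty list if no checkpoint exists."""
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)["segments"]  # assumed layout, not the tool's real schema


def save_checkpoint(path, segments):
    """Write to a temp file, then rename, so an interrupt cannot leave a partial file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"segments": segments}, f, ensure_ascii=False, indent=2)
    os.replace(tmp_path, path)


def transcribe_with_checkpoints(segment_iter, path):
    """Collect (start, end, text) tuples, skipping segments already checkpointed."""
    done = load_checkpoint(path)
    resume_after = done[-1]["end"] if done else None
    try:
        for start, end, text in segment_iter:
            if resume_after is not None and end <= resume_after:
                continue  # already saved in an earlier run
            done.append({"start": start, "end": end, "text": text})
            if len(done) % CHECKPOINT_EVERY == 0:
                save_checkpoint(path, done)
    finally:
        # Runs on normal completion and on Ctrl+C (KeyboardInterrupt still propagates).
        save_checkpoint(path, done)
    return done
```

In this sketch a resumed run replays the segment iterator and discards what was already saved. A real implementation would more likely avoid re-transcribing the finished portion of the audio.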

Requirements:
- Python 3.8 or higher
- ffmpeg (for MP4 video processing)
- CUDA-compatible GPU (optional)
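
Because the GPU is optional, the tool has to choose a device at startup. The sketch below shows one way "automatic detection with fallback" could work. The function name `choose_device` and the compute-type mapping (`int8` on CPU, `float16` or `float32` on GPU) are assumptions, not the logic in main.py. The GPU count is passed in as a plain number so the example runs without any GPU library installed.

```python
def choose_device(force_device=None, cuda_device_count=0, use_half_precision=False):
    """Return a (device, compute_type) pair for a Whisper backend.

    force_device: None (auto), "cpu", or "cuda", mirroring FORCE_DEVICE.
    cuda_device_count: number of visible CUDA GPUs; with faster-whisper installed
        this could come from ctranslate2.get_cuda_device_count().
    """
    if force_device not in (None, "cpu", "cuda"):
        raise ValueError(f"unsupported device: {force_device!r}")

    cuda_ok = cuda_device_count > 0
    if force_device == "cuda" and not cuda_ok:
        device = "cpu"  # requested GPU is missing: fall back instead of failing
    elif force_device is None:
        device = "cuda" if cuda_ok else "cpu"
    else:
        device = force_device

    # Assumed mapping: fp16 only makes sense on GPU; int8 is a common CPU choice.
    if device == "cuda":
        compute_type = "float16" if use_half_precision else "float32"
    else:
        compute_type = "int8"
    return device, compute_type
```

The resulting pair could then be passed to faster-whisper, for example `WhisperModel("small", device=device, compute_type=compute_type)`.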

Install the Python dependencies:

    pip install -r requirements.txt

Install ffmpeg on Windows:

    # Using chocolatey:
    choco install ffmpeg
    # Or download from: https://ffmpeg.org/download.html

On macOS:

    brew install ffmpeg

On Debian/Ubuntu:

    sudo apt install ffmpeg

Place your files in the `presentations` folder:
- PowerPoint presentations (.pptx)
- Video files (.mp4)
- Audio files (.mp3)
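
As a rough illustration of how the input folder might be scanned, the sketch below collects the three supported file types by extension. The names `SUPPORTED` and `find_inputs` are hypothetical, and main.py may discover files differently.

```python
from pathlib import Path

# Extension -> kind of input; the three types are the ones listed above.
SUPPORTED = {".pptx": "presentation", ".mp4": "video", ".mp3": "audio"}


def find_inputs(folder="presentations"):
    """Return (path, kind) pairs for supported files, sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        return []
    found = []
    for path in sorted(root.iterdir()):
        kind = SUPPORTED.get(path.suffix.lower())  # case-insensitive match
        if kind and path.is_file():
            found.append((path, kind))
    return found
```

Files with any other extension are ignored, and subfolders are not searched in this version.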

Run the script:

    python main.py

Edit the configuration settings at the top of main.py:

    TRANSCRIPTION_ENGINE = "faster-whisper"  # Options: "standard", "faster-whisper"

    PPTX_FOLDER = "presentations"  # Input folder
    OUTPUT_FOLDER = "output"  # Output folder

    WHISPER_MODEL = "small"  # Options: "tiny", "base", "small", "medium", "large"
    FORCE_LANGUAGE = "en"  # Force language ("en", "es", "fr", etc.) or None

    FORCE_DEVICE = "cpu"  # Options: None (auto), "cpu", "cuda"
    USE_HALF_PRECISION = False  # Enable fp16 for speed boost (GPU only)

| Model | Speed | Quality | Memory | Best For |
|---|---|---|---|---|
| tiny | Fastest | Good | ~1GB | Quick drafts, testing |
| base | Fast | Better | ~1GB | General use |
| small | Medium | Good | ~2GB | Recommended - best balance |
| medium | Slow | Very Good | ~5GB | High accuracy needs |
| large | Slowest | Best | ~10GB | Maximum quality |
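
If you are unsure which value to use for `WHISPER_MODEL`, the memory column above can be turned into a simple rule of thumb. The helper below is a hypothetical sketch and not part of main.py. It uses the approximate figures from the table and returns the largest model that fits a given memory budget.

```python
# Approximate memory needs in GB, copied from the table above, smallest first.
MODEL_MEMORY_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]


def largest_model_that_fits(available_gb):
    """Return the largest model whose approximate memory need fits, or None."""
    best = None
    for name, needed_gb in MODEL_MEMORY_GB:
        if needed_gb <= available_gb:
            best = name  # list is ordered, so later matches are larger models
    return best
```

Memory is only one factor. Larger models are also slower, which is why the table recommends `small` as the best balance even when a bigger model would fit.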