Automatically transcribe Zoom recordings to Google Docs with calendar integration and speaker identification. Zero API costs using local Whisper AI.

stephenhsklarew/Zoom2GoogleTranscript

Zoom2GoogleTranscript

Automatically transcribe Zoom MP4 recordings to Google Docs using a local Whisper model, with Google Calendar integration and speaker identification. There are no per-minute transcription charges because all speech recognition runs on your machine.

✨ Key Features

  • 🎥 Batch Processing - Transcribe multiple videos automatically
  • 📅 Calendar Integration - Automatically fetches meeting titles and attendees from Google Calendar
  • 👥 Speaker Identification - Uses calendar attendees for accurate speaker names
  • 💰 Zero Cost - Uses local Whisper AI (no API charges)
  • 📝 Google Meet Format - Creates properly formatted transcripts matching Google Meet style
  • 🔄 Progress Tracking - Rich terminal UI with progress bars
  • 🎯 Model Selection - Choose speed vs accuracy trade-off
  • 🤖 AI Speaker Diarization - Optional advanced speaker detection with pyannote.audio

🚀 Quick Start

Prerequisites

  1. Python 3.9+

  2. ffmpeg - Required for audio processing

    # macOS
    brew install ffmpeg
    
    # Ubuntu/Debian
    sudo apt install ffmpeg
  3. Google Cloud Project - For API access

Installation

# Clone the repository
git clone https://github.com/stephenhsklarew/Zoom2GoogleTranscript.git
cd Zoom2GoogleTranscript

# Install dependencies
pip install -r requirements.txt

# Authenticate with Google
python authenticate.py

Basic Usage

# Transcribe all videos in a folder
python video_transcriber.py /path/to/zoom/recordings

# Use a specific model
python video_transcriber.py /path/to/zoom/recordings --model medium

# Specify credentials
python video_transcriber.py /path/to/zoom/recordings --credentials token_video.pickle

📋 Output Format

The tool creates Google Docs transcripts in Google Meet format:

Dec 9, 2024
Steve/Karan/Stephen - Transcript
Attendees: karan.apatel, stephen.sklarew, steve.burden
00:00:00

karan.apatel: Hey everyone, thanks for joining...
stephen.sklarew: Great to be here. Let's discuss...
steve.burden: I'll start with the quarterly results...
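
The layout above can be reproduced with plain string formatting. This is a minimal sketch, not the tool's actual code: `format_transcript` and its parameters are hypothetical names, each turn is assumed to be a (speaker, text) pair, and sorting the attendee list alphabetically is an assumption based on the sample.

```python
from datetime import date


def format_transcript(meeting_date, title, attendees, turns):
    """Render (speaker, text) turns in the Google Meet-style layout shown above."""
    # Build "Dec 9, 2024" without %-d, which is not portable to Windows.
    header_date = f"{meeting_date:%b} {meeting_date.day}, {meeting_date.year}"
    lines = [
        header_date,
        f"{title} - Transcript",
        # Assumption: attendees are listed alphabetically, as in the sample.
        "Attendees: " + ", ".join(sorted(attendees)),
        "00:00:00",
        "",
    ]
    lines.extend(f"{speaker}: {text}" for speaker, text in turns)
    return "\n".join(lines)


doc = format_transcript(
    date(2024, 12, 9),
    "Steve/Karan/Stephen",
    ["stephen.sklarew", "karan.apatel", "steve.burden"],
    [("karan.apatel", "Hey everyone, thanks for joining...")],
)
print(doc)
```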

How It Works

  1. Extracts date/time from Zoom folder names (format: YYYY-MM-DD HH.MM.SS Meeting Name)
  2. Queries Google Calendar for matching events (±30 minute window)
  3. Extracts meeting details - title and attendee list
  4. Transcribes audio using Whisper AI
  5. Maps speakers to calendar attendees
  6. Creates formatted Google Doc with proper attribution
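
Steps 1 and 2 can be sketched with the standard library alone. The helper names below (`parse_zoom_folder`, `calendar_window`) are hypothetical and the regular expression is an assumption about how the folder name is parsed; only the folder format and the 30-minute window come from the description above.

```python
import re
from datetime import datetime, timedelta

# Zoom's default folder name: "YYYY-MM-DD HH.MM.SS Meeting Name"
FOLDER_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}\.\d{2}\.\d{2}) (.+)$")


def parse_zoom_folder(name):
    """Return (start_time, meeting_name), or None if the name does not match."""
    match = FOLDER_PATTERN.match(name)
    if match is None:
        return None
    start = datetime.strptime(match.group(1), "%Y-%m-%d %H.%M.%S")
    return start, match.group(2)


def calendar_window(start, minutes=30):
    """Return the (earliest, latest) times to search for a matching event."""
    delta = timedelta(minutes=minutes)
    return start - delta, start + delta


start, meeting = parse_zoom_folder("2024-12-09 10.31.25 Steve_Karan_Stephen")
earliest, latest = calendar_window(start)
print(meeting, earliest.isoformat(), latest.isoformat())
```

The Calendar API's events.list method takes its `timeMin` and `timeMax` bounds as RFC 3339 timestamps with a time zone offset, so naive datetimes like the ones above would need a time zone attached before being sent.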

🎤 Speaker Identification

Method 1: Calendar-Based (Default)

Uses pause detection (>2 seconds) combined with calendar attendee names:

  • ✅ Zero setup required
  • ✅ Works immediately with calendar integration
  • ✅ Good for 2-3 person conversations
  • ⚠️ Less accurate for complex multi-speaker scenarios
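
To make the pause heuristic concrete, here is a rough sketch. It assumes transcript segments carry `start`, `end` and `text` fields (as Whisper's segments do) and that the speaker simply rotates to the next attendee after each long pause. That rotation rule and the name `assign_speakers` are illustrative assumptions; the tool's real mapping may differ.

```python
def assign_speakers(segments, attendees, pause_threshold=2.0):
    """Label segments, switching to the next attendee after a long pause.

    segments: dicts with "start", "end" (seconds) and "text" keys.
    Assumption: speakers rotate through `attendees` in order; the real
    tool may map names differently.
    """
    labeled = []
    speaker_index = 0
    previous_end = None
    for segment in segments:
        # A gap longer than the threshold is treated as a change of speaker.
        if previous_end is not None and segment["start"] - previous_end > pause_threshold:
            speaker_index = (speaker_index + 1) % len(attendees)
        labeled.append((attendees[speaker_index], segment["text"].strip()))
        previous_end = segment["end"]
    return labeled


segments = [
    {"start": 0.0, "end": 4.0, "text": " Hey everyone."},
    {"start": 4.5, "end": 7.0, "text": " Thanks for joining."},
    {"start": 10.0, "end": 12.0, "text": " Great to be here."},
]
turns = assign_speakers(segments, ["karan.apatel", "stephen.sklarew"])
print(turns)
```

The sketch also shows why this approach weakens with more participants: a pause signals that someone new may have started talking, but not who.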

Method 2: AI-Powered Diarization (Optional)

For advanced speaker detection with pyannote.audio:

  1. Get Hugging Face Token:

  2. Use with token:

    # Via environment variable (recommended)
    export HF_TOKEN=hf_your_token_here
    python video_transcriber.py /path/to/videos
    
    # Or via command line
    python video_transcriber.py /path/to/videos --hf-token hf_your_token_here
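
How the two token sources might be reconciled is sketched below. The function name `resolve_hf_token` is hypothetical, and giving the command-line value precedence over `HF_TOKEN` is an assumption rather than documented behaviour.

```python
import os


def resolve_hf_token(cli_token=None, environ=None):
    """Return the Hugging Face token, or None if no token was supplied."""
    environ = os.environ if environ is None else environ
    # Assumption: an explicit --hf-token value wins over HF_TOKEN.
    return cli_token or environ.get("HF_TOKEN") or None


print(resolve_hf_token("hf_from_cli", {"HF_TOKEN": "hf_from_env"}))
print(resolve_hf_token(None, {"HF_TOKEN": "hf_from_env"}))
print(resolve_hf_token(None, {}))
```

One reason to prefer the environment variable is that command-line arguments end up in shell history and are visible in process listings.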

Benefits of AI Diarization:

  • 🎯 More accurate speaker detection
  • 👥 Better for 3+ person meetings
  • 🔊 Analyzes voice characteristics, not just pauses
  • ✅ Still free (runs locally)

🎛️ Command Line Options

python video_transcriber.py <video_folder> [OPTIONS]

Required:
  video_folder              Path to folder containing Zoom recordings

Optional:
  --model MODEL            Whisper model: tiny, base, small, medium, large
                           (default: base)

  --no-recursive           Don't search subdirectories

  --folder-id ID           Google Drive folder ID to save documents

  --credentials PATH       Path to Google credentials file
                           (default: token_video.pickle)

  --hf-token TOKEN         Hugging Face token for speaker diarization
                           (can also use HF_TOKEN environment variable)

  --since DATE             Only process videos modified after this date
                           Format: YYYY-MM-DD or YYYY-MM-DD HH:MM:SS
                           Examples: 2024-12-01 or "2024-12-01 14:30:00"
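
The two accepted --since formats can be handled by trying each in turn. This is a sketch under assumptions: `parse_since` and `modified_since` are hypothetical helper names, and comparing against the file's modification time follows the option description above.

```python
from datetime import datetime

# Try the more specific format first, then fall back to date only.
SINCE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_since(value):
    """Parse a --since value given as YYYY-MM-DD or YYYY-MM-DD HH:MM:SS."""
    for fmt in SINCE_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"--since must be YYYY-MM-DD or YYYY-MM-DD HH:MM:SS, got {value!r}")


def modified_since(mtime_seconds, since):
    """True if a modification time (e.g. from os.path.getmtime) is after `since`."""
    return datetime.fromtimestamp(mtime_seconds) > since


print(parse_since("2024-12-01"))
print(parse_since("2024-12-01 14:30:00"))
```

A function like this can be passed to argparse as `type=parse_since`; argparse turns a `ValueError` raised by a type callable into a normal usage error.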

📊 Model Comparison

Model    Speed            Accuracy   RAM     Download Size
tiny     ~32x realtime    Lowest     1GB     ~75MB
base     ~16x realtime    Good ✅    1GB     ~140MB
small    ~6x realtime     Better     2GB     ~460MB
medium   ~2x realtime     High       5GB     ~1.5GB
large    ~1x realtime     Best       10GB    ~3GB

Recommendation: Start with the base model for a balance of speed and quality.

Real-World Performance

MacBook Pro M1 (CPU only):

  • 30 min video with base model: ~2 minutes
  • 30 min video with medium model: ~4 minutes

🔧 Setup Details

1. Google Cloud Setup

  1. Go to https://console.cloud.google.com
  2. Create a new project
  3. Enable these APIs:
    • Google Docs API
    • Google Drive API
    • Google Calendar API (v3)
  4. Create OAuth 2.0 credentials:
    • Application type: "Desktop app"
    • Download as credentials.json
    • Place in project directory

2. Authentication

Run the authentication script once:

python authenticate.py

This will:

  • Open your browser for Google OAuth
  • Request permissions for Docs, Drive, and Calendar
  • Save credentials to token_video.pickle

The token is reused for all future transcriptions.

3. Zoom Recording Structure

For calendar integration to work, organize recordings in Zoom's default format:

Zoom/
├── 2024-12-09 10.31.25 Steve_Karan_Stephen/
│   └── video1487928882.mp4
├── 2024-12-02 15.00.18 Diane_Stephen Weekly 1_1/
│   └── video1683623283.mp4
└── ...

The folder name format YYYY-MM-DD HH.MM.SS Meeting Name is used to match calendar events.
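
Walking a layout like this takes only a few lines of pathlib. In this sketch `find_videos` is a hypothetical name, and the `recursive` switch is an assumption about how --no-recursive is applied.

```python
from pathlib import Path


def find_videos(root, recursive=True):
    """Return .mp4 files under `root`, sorted for a stable processing order."""
    root = Path(root)
    # rglob descends into subfolders; glob looks only at the top level,
    # which mirrors what --no-recursive is described as doing.
    matches = root.rglob("*.mp4") if recursive else root.glob("*.mp4")
    return sorted(path for path in matches if path.is_file())


for video in find_videos(Path.home() / "Documents" / "Zoom"):
    # The parent folder name carries the timestamp used for calendar matching.
    print(video.parent.name, "->", video.name)
```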

💡 Usage Examples

Example 1: Weekly Meeting Transcripts

# Transcribe all recordings in the Zoom folder (add --since to limit to last week)
python video_transcriber.py ~/Documents/Zoom --model base

# Review transcripts in Google Docs
# Speaker names automatically pulled from calendar

Example 2: Client Call Archive

# Process all client recordings with better accuracy
python video_transcriber.py ~/Videos/ClientCalls \
  --model medium \
  --folder-id abc123xyz

# All transcripts organized in specific Drive folder

Example 3: Conference Recording

# Use AI speaker diarization for a multi-speaker panel
# (subfolders are searched by default, so no extra flag is needed)
export HF_TOKEN=hf_your_token
python video_transcriber.py ~/Conferences/2024 \
  --model medium

Example 4: Incremental Processing

# Only process videos from this week
python video_transcriber.py ~/Documents/Zoom --since 2024-12-01

# Process videos from a specific date and time
python video_transcriber.py ~/Documents/Zoom --since "2024-12-01 14:30:00"

# Useful for daily/weekly automation - only transcribe new recordings
# (date -v-7d is the macOS/BSD form; on GNU/Linux use: date -d '7 days ago' +%Y-%m-%d)
python video_transcriber.py ~/Documents/Zoom --since "$(date -v-7d +%Y-%m-%d)"

🔒 Security & Privacy

  • ✅ Transcription is local - Audio and video are never sent to an external server
  • ✅ No OpenAI API calls - Only the finished transcript is uploaded, to your own Google Docs
  • ✅ Google OAuth - Secure authentication flow
  • ✅ Minimal permissions - Only Docs/Drive/Calendar access
  • ✅ Token stored locally - credentials.json and token_video.pickle stay on your machine

🐛 Troubleshooting

"ffmpeg not found"

brew install ffmpeg  # macOS
sudo apt install ffmpeg  # Ubuntu/Debian

"Calendar API has not been used"

Enable Calendar API in Google Cloud Console: https://console.cloud.google.com/apis/library/calendar-json.googleapis.com

"No calendar event found"

Check that:

  • Video folder follows Zoom naming: YYYY-MM-DD HH.MM.SS Meeting Name
  • Calendar event exists within ±30 minutes of recording time
  • Calendar API is enabled and authenticated

Slow processing

  • Use a smaller model (--model base or --model tiny)
  • A GPU is used automatically when one is available
  • Process large batches overnight

Incorrect speaker names

  • Verify calendar event has attendees listed
  • Try AI diarization with --hf-token for better accuracy
  • Check that Zoom folder timestamp matches meeting time

📝 Cost Comparison

Solution                            Cost         Processing Speed         Accuracy
This Tool (Zoom2GoogleTranscript)   $0           Local (2-10x realtime)   High
Whisper API                         $0.006/min   Very Fast                High
Google Speech-to-Text               $0.016/min   Very Fast                Medium
Rev.ai                              $1.50/min    Fast                     Very High

100 hours of video:

  • Zoom2GoogleTranscript: $0
  • Whisper API: $36
  • Google Speech-to-Text: $96
  • Rev.ai: $9,000
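
The totals follow directly from the per-minute prices. The short check below uses the figures listed in the table as-is; they are a snapshot and may not match current vendor pricing.

```python
HOURS = 100
MINUTES = HOURS * 60  # 6,000 minutes of audio

# Per-minute prices as listed in the comparison table above.
PRICE_PER_MINUTE = {
    "Zoom2GoogleTranscript": 0.0,
    "Whisper API": 0.006,
    "Google Speech-to-Text": 0.016,
    "Rev.ai": 1.50,
}

# Round to cents to avoid floating-point noise in the products.
totals = {name: round(price * MINUTES, 2) for name, price in PRICE_PER_MINUTE.items()}
for name, total in totals.items():
    print(f"{name}: ${total:,.0f}")
```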

🤝 Contributing

Contributions welcome! Areas for improvement:

  • Support for additional video formats (mov, avi, webm)
  • Parallel processing for faster batch jobs
  • Custom speaker name mapping
  • Integration with other calendar systems
  • Improved speaker diarization algorithms

📄 License

MIT License - Free for personal and commercial use.

👤 Author

Stephen Sklarew (@stephenhsklarew)

🙏 Acknowledgments

📞 Support

For issues or questions:
