
Query My Pod

AI-Powered Podcast Search

A self-hosted web application that makes your favorite podcasts searchable and queryable through AI. Track podcasts via RSS, automatically transcribe episodes, and use semantic search with local LLMs to discover content across your entire podcast library.

⚠️ Project Status

Production Ready! All core features are implemented and working. The application provides full RAG-powered semantic search with interactive transcript playback.

Features

✅ Implemented

Podcast Management

  • RSS Feed Management: Subscribe to podcasts via RSS with automatic episode discovery
  • Episode Import: Extracts metadata, cover art, and audio following PSP-1 spec
  • Daily Refresh: Automatically checks for new episodes
  • Background Jobs: Solid Queue handles imports, transcription, and scheduled tasks

Transcription & Processing

  • Automatic Transcription: Self-hosted Whisper generates timestamped JSON transcripts
  • Transcript Chunking: Automatically segments transcripts into searchable chunks
  • Vector Embeddings: Generates semantic embeddings using sentence-transformers
  • Ad Detection: LLM-powered advertisement detection with confidence scores
  • Flexible Pipeline: Composable processing steps (download → transcribe → chunk → detect ads → embed)

AI-Powered Search

  • Semantic Search: Vector similarity search in SQLite via the neighbor gem
  • RAG (Retrieval Augmented Generation): LLM generates answers with citations
  • Local LLM Integration: Works with Ollama (qwen2.5:7b, llama3.2, etc.)
  • LLM Tool Calling: AI can request additional context when needed (up to 3 iterations)
  • Weighted Search Results: Title matches boosted 3x, descriptions 2x for better relevance
  • Context-Aware Search: Search within episode, podcast, or across all podcasts
  • Inline Search Results: Turbo-powered search with loading indicators
  • CLI Query Tool: Command-line interface for querying transcripts

Interactive Transcript Player

  • Live Transcript Highlighting: Synchronized with audio playback
  • Click-to-Seek: Click any transcript chunk to jump to that timestamp
  • Search Result Navigation: Click search results to jump directly to relevant audio
  • HTTP Range Requests: Efficient audio seeking without buffering the entire file (see the sketch after this list)
  • Dual Highlight Modes: Different styles for navigation vs playback tracking
  • Smart Scrolling: Auto-scroll transcript without interfering with page navigation
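
The range handling behind that bullet fits in a few lines of controller code. Here is a minimal sketch, assuming a plain Rails controller streaming a local file; the controller name and the episode_audio_path helper are illustrative, not the app's actual code:

class AudioController < ApplicationController
  def show
    path = episode_audio_path # hypothetical helper resolving the local audio file
    size = File.size(path)

    # <audio> elements send a single byte range, e.g. "Range: bytes=102400-"
    if request.headers["Range"] =~ /bytes=(\d+)-(\d*)/
      first = Regexp.last_match(1).to_i
      last  = Regexp.last_match(2).empty? ? size - 1 : Regexp.last_match(2).to_i
      response.headers["Content-Range"] = "bytes #{first}-#{last}/#{size}"
      response.headers["Accept-Ranges"] = "bytes"
      # 206 Partial Content: send only the requested byte window
      send_data File.binread(path, last - first + 1, first),
                type: "audio/mpeg", disposition: "inline", status: :partial_content
    else
      send_file path, type: "audio/mpeg", disposition: "inline"
    end
  end
end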

User Interface

  • Web Interface: Modern Bootstrap 5 UI with Hotwire/Turbo
  • Audio Player: HTML5 player with timestamp deep linking
  • Search UI: Contextual search boxes embedded in relevant pages
  • Responsive Design: Works on desktop and mobile

📋 Potential Future Enhancements

  • Advanced Search Features: Filter by date, podcast, keywords
  • Search History: Track and revisit previous searches
  • Saved Searches: Bookmark frequently used queries
  • Episode Cross-Linking: Discover related episodes and topics
  • Speaker Diarization: Identify different speakers in episodes
  • Multi-Language Support: Transcribe and search non-English podcasts
  • Export Features: Export transcripts, search results, or citations
  • API Access: RESTful API for programmatic access
  • Docker Deployment: One-command deployment with Docker Compose

Use Cases

  • Find specific discussions or topics across hundreds of episodes
  • Research what multiple podcasts have said about a particular subject
  • Build a personal, searchable podcast knowledge base
  • Discover connections between episodes and topics
  • Jump directly to relevant moments in podcast episodes
  • Follow along with transcripts synchronized to audio playback

Tech Stack

  • Backend: Ruby on Rails 8
  • Database: SQLite with vector search (neighbor + sqlite-vec)
  • Background Jobs: Solid Queue
  • Transcription: Whisper (self-hosted)
  • Embeddings: sentence-transformers (all-MiniLM-L6-v2)
  • LLM: Ollama (qwen2.5:7b, llama3.2, or similar)
  • Vector Search: neighbor gem with cosine similarity
  • UI: Bootstrap 5 with Hotwire/Turbo + Stimulus
  • Audio: HTML5 with HTTP range request support

Prerequisites

  • Ruby 3.4+
  • Rails 8
  • Python 3.8+ with a virtual environment
  • Whisper: pip install openai-whisper
  • Ollama: For LLM functionality
  • Sufficient storage for podcast audio files, transcripts, and embeddings
  • GPU recommended (but not required) for faster transcription

Installation

# Clone the repository
git clone git@github.com:bradleesand/query-my-pod.git
cd query-my-pod

# Install Ruby dependencies
bundle install

# Setup database
rails db:setup

# Create Python virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install Python dependencies
pip install openai-whisper sentence-transformers torch

# Install and start Ollama
brew install ollama  # On macOS
ollama serve &
ollama pull qwen2.5:7b  # Or llama3.2, gemma3:12b, etc.

# Run the application (with background jobs)
bin/dev

Configuration

Environment Variables

Copy .env.example to .env and configure:

# Episode Processing Configuration
AUTO_TRANSCRIBE=false          # Auto-transcribe new episodes
ENABLE_TRANSCRIPTION=true      # Enable transcription feature
AUTO_DOWNLOAD_AUDIO=false      # Auto-download audio for new episodes
DOWNLOAD_AUDIO=false           # Keep audio files after transcription

# RAG Search Configuration
ENABLE_SEMANTIC_SEARCH=true    # Enable vector search and LLM features
PYTHON_PATH=venv/bin/python3   # Path to Python in virtual environment
OLLAMA_API_URL=http://localhost:11434
OLLAMA_MODEL=qwen2.5:7b        # LLM model (qwen2.5:7b recommended)
SEARCH_CONTEXT_CHUNKS=5        # Number of chunks for context

# Ad Detection Configuration
ENABLE_AD_DETECTION=false      # Enable automatic ad detection
AD_DETECTION_THRESHOLD=0.7     # Confidence threshold (0.0-1.0) for marking as ad

Recommended Configuration for Full Features:

AUTO_TRANSCRIBE=true
ENABLE_SEMANTIC_SEARCH=true
DOWNLOAD_AUDIO=true  # If you want to keep audio files

OpenSSL 3.6.0 Compatibility

If you encounter SSL certificate verification errors with OpenSSL 3.6.0, note that the Gemfile already includes the openssl gem as a workaround.

Recurring Jobs

Daily podcast refresh is configured in config/recurring.yml and runs at 2am by default. Edit this file to change the schedule.
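
For reference, a Solid Queue recurring task entry takes this shape (the task key is illustrative; RefreshAllPodcastsJob is the job listed under Background Jobs below):

production:
  refresh_all_podcasts:
    class: RefreshAllPodcastsJob
    schedule: every day at 2am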

Usage

Getting Started

  1. Import a podcast: Click "Import New Podcast" and paste an RSS feed URL
  2. View podcasts: Browse your podcast library on the home page
  3. View episodes: Click a podcast to see all episodes
  4. Process episodes:
    • Automatic: Set AUTO_TRANSCRIBE=true for new episodes
    • Manual: Use "Download Audio" and "Transcribe" buttons on episode pages

Using Search

  1. Search from anywhere:
    • Podcast index: Search across all podcasts
    • Podcast page: Search within that podcast
    • Episode page: Search within that episode
  2. View results:
    • AI-generated answer with citations
    • Source excerpts with similarity scores
    • Clickable links to episodes with timestamps
  3. Interactive transcript:
    • Click search results to jump to the exact moment
    • Click transcript chunks to seek audio
    • Watch live highlighting as audio plays

Processing Existing Episodes

If you have existing episodes and want to enable search:

# Chunk and generate embeddings for all transcribed episodes
rails transcripts:chunk
rails transcripts:generate_embeddings

# Or process a single episode
rails runner "EpisodeProcessingJob.perform_now(episode_id, [:chunk_transcript, :generate_embeddings])"

See the Rake Tasks section below for more batch processing options.

Ad Detection

Detect advertisements in transcripts using LLM analysis:

# Detect ads in all transcribed episodes
rails ads:detect_all

# Detect ads in a specific episode
rails ads:detect_episode[EPISODE_ID]

# Review detected advertisements
rails ads:review

# Show detection statistics by podcast
rails ads:stats

# Reset ad detection for an episode
rails ads:reset_episode[EPISODE_ID]

How it works:

  • Uses your local LLM (Ollama) to analyze transcript chunks
  • Identifies sponsor mentions, promo codes, and promotional content
  • Stores confidence scores (0.0-1.0) for each chunk
  • Advertisements are excluded from search results by default (see the sketch after this list)
  • Ad chunks are styled differently in the transcript view (gray, italic)
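
Excluding detected ads from search can then be a simple model scope. A minimal sketch, with the ad_confidence column name assumed rather than taken from the actual schema:

class TranscriptChunk < ApplicationRecord
  # Chunks at or above AD_DETECTION_THRESHOLD are treated as ads and skipped
  scope :non_ad, -> {
    threshold = ENV.fetch("AD_DETECTION_THRESHOLD", "0.7").to_f
    where("ad_confidence IS NULL OR ad_confidence < ?", threshold)
  }
end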

CLI Query Tool

Query your podcast transcripts directly from the command line:

# Basic query across all podcasts
rails runner scripts/query_llm.rb "What are some productivity tips?"

# Query with verbose output (shows all sources and similarity scores)
rails runner scripts/query_llm.rb "What productivity apps were mentioned?" --verbose

# Query specific podcast
rails runner scripts/query_llm.rb "What did they say about focus?" --context podcast --podcast 1

# Query specific episode
rails runner scripts/query_llm.rb "What was the main topic?" --context episode --episode 123

# Filter by listened status
rails runner scripts/query_llm.rb "What are the main themes?" --filter unlistened

# Adjust context chunks (more context = better answers, slower response)
rails runner scripts/query_llm.rb "Tell me about the guest" --limit 15

# Combined options
rails runner scripts/query_llm.rb "What tools were recommended?" \
  --context podcast \
  --podcast 1 \
  --filter listened \
  --limit 12 \
  --verbose

Available Options:

  • -c, --context CONTEXT - Search context: all, podcast, episode (default: all)
  • -p, --podcast ID - Podcast ID (required for podcast/episode context)
  • -e, --episode ID - Episode ID (required for episode context)
  • -l, --limit N - Number of initial context chunks (default: 10)
  • -f, --filter FILTER - Listened filter: all, listened, unlistened
  • -v, --verbose - Show detailed sources with similarity scores
  • -h, --help - Show help message

How it works:

  • Performs semantic vector search across transcript chunks
  • LLM can automatically request additional context via tool calling, capped at 3 iterations (sketched after this list)
  • Returns AI-generated answer with numbered citations
  • Verbose mode shows all sources with episode info, timestamps, and similarity scores
  • Respects same filtering and scoping as web search
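
The tool-calling loop maps onto Ollama's /api/chat endpoint. Below is a minimal sketch of the capped iteration described above; the tool schema, prompt_with_context, and the search_chunks helper are illustrative, not the app's actual code:

require "json"
require "net/http"

uri   = URI("#{ENV.fetch("OLLAMA_API_URL", "http://localhost:11434")}/api/chat")
tools = [{ type: "function",
           function: { name: "search_transcript",
                       description: "Fetch more transcript chunks for a query",
                       parameters: { type: "object",
                                     properties: { query: { type: "string" } },
                                     required: ["query"] } } }]
messages = [{ role: "user", content: prompt_with_context }]

3.times do # hard cap prevents infinite tool-call loops
  body = { model: ENV.fetch("OLLAMA_MODEL", "qwen2.5:7b"),
           messages: messages, tools: tools, stream: false }.to_json
  reply   = JSON.parse(Net::HTTP.post(uri, body, "Content-Type" => "application/json").body)
  message = reply["message"]
  messages << message
  break if message["tool_calls"].to_a.empty? # no tool request means final answer

  message["tool_calls"].each do |call|
    query = call.dig("function", "arguments", "query")
    messages << { role: "tool", content: search_chunks(query).to_json } # hypothetical helper
  end
end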

Rake Tasks

Transcript Management

Chunk transcripts into searchable segments:

rails transcripts:chunk

Processes all episodes with completed transcripts and splits them into searchable chunks. Creates one TranscriptChunk per Whisper segment with text and timestamps.
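
In Whisper's JSON output each segment carries start, end, and text fields, so per-segment chunking boils down to a loop like this sketch (association and attribute names are assumptions, not the service's actual code):

transcript["segments"].each do |seg|
  episode.transcript_chunks.create!(
    chunk_type: "transcript",
    text:       seg["text"].strip,
    start_time: seg["start"],
    end_time:   seg["end"]
  )
end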

Generate embeddings for semantic search:

rails transcripts:generate_embeddings

Generates 384-dimensional vector embeddings for all transcript chunks that don't have embeddings yet. Uses sentence-transformers (all-MiniLM-L6-v2 model) via Python. Required for semantic search functionality.
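
On the Rails side, the neighbor gem needs the embedding dimensions declared on the model. A one-line sketch, assuming the vector column is named embedding:

class TranscriptChunk < ApplicationRecord
  has_neighbors :embedding, dimensions: 384 # matches all-MiniLM-L6-v2 output
end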

Advertisement Detection

Detect ads in all episodes:

rails ads:detect_all

Analyzes all transcribed episodes to detect advertisements using your local LLM. Skips episodes that have already been analyzed. Shows progress and summary statistics.

Detect ads in a specific episode:

rails ads:detect_episode[EPISODE_ID]

Analyzes a single episode and displays detected advertisement chunks with confidence scores and timestamps.

Example: rails ads:detect_episode[42]

Review detected advertisements:

rails ads:review

Shows a detailed review of all detected advertisements across all episodes, grouped by episode with timestamps and confidence scores.

Show detection statistics:

rails ads:stats

Displays advertisement detection statistics including:

  • Overall ad detection rate
  • Analysis coverage (analyzed vs unanalyzed chunks)
  • Per-podcast statistics and ad rates

Reset ad detection for an episode:

rails ads:reset_episode[EPISODE_ID]

Clears all advertisement detection data for a specific episode, allowing you to re-run detection with different settings or after adjusting the LLM model.

Example: rails ads:reset_episode[42]

Model Annotations

Annotate models with schema info:

rails annotate_models

Adds schema information as comments to model files and fixtures. Automatically runs after migrations.

Remove annotations:

rails remove_annotation

Removes schema annotation comments from model and fixture files.

Architecture

Processing Pipeline

Episodes are processed through a flexible pipeline system via EpisodeProcessingJob. The job accepts a list of steps to execute in sequence:
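
Conceptually the job is a small dispatcher over step names. A simplified sketch (the real implementation may differ):

class EpisodeProcessingJob < ApplicationJob
  def perform(episode_id, steps)
    episode = Episode.find(episode_id)
    # Each symbol maps to a method, e.g. :transcribe -> transcribe(episode)
    steps.each { |step| send(step, episode) }
  end
end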

Available Steps:

  • :download - Download audio to local storage
  • :trim_ads - Remove ads using audio cue detection (experimental, not yet implemented)
  • :transcribe - Generate transcript with Whisper
  • :chunk_transcript - Split transcript into searchable segments
  • :detect_ads_in_transcript - Detect advertisements using LLM analysis
  • :generate_embeddings - Create vector embeddings for semantic search

Example Pipelines:

# Full RAG pipeline (when ENABLE_SEMANTIC_SEARCH=true)
EpisodeProcessingJob.perform_later(episode_id, [:download, :transcribe, :chunk_transcript, :generate_embeddings])

# Just transcription
EpisodeProcessingJob.perform_later(episode_id, [:transcribe])

# Add search to existing transcript
EpisodeProcessingJob.perform_later(episode_id, [:chunk_transcript, :generate_embeddings])

Automatic Pipeline (when AUTO_TRANSCRIBE=true and ENABLE_SEMANTIC_SEARCH=true):

[:download, :transcribe, :chunk_transcript, :generate_embeddings]

RAG Architecture

Data Flow:

  1. Transcription: Whisper generates timestamped transcripts
  2. Chunking: TranscriptChunkingService splits by Whisper segments and creates title/description chunks
  3. Embedding: Python sentence-transformers creates 384-dim vectors
  4. Storage: SQLite stores chunks with vector embeddings
  5. Search: neighbor gem performs cosine similarity search with weighted ranking
  6. LLM: Ollama generates cited responses using top chunks
  7. Tool Calling (optional): LLM can request additional context via search_transcript tool (up to 3 iterations)

Key Features:

  • Weighted Ranking: Title chunks boosted 3x, description chunks 2x, transcript chunks 1x (see the sketch below)
  • Tool Calling: LLM can autonomously gather more context when initial results are insufficient
  • Iterative Refinement: A cap of 3 tool-call iterations prevents infinite loops
  • Metadata Search: Episode titles and descriptions included as searchable chunks
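
Steps 4 and 5 of the data flow, combined with the weighted ranking above, can be pictured with the neighbor gem's nearest_neighbors API. In this sketch query_embedding is the vector for the user's query; the candidate count and re-ranking code are illustrative, not TranscriptSearchService's actual implementation:

weights = { "title" => 3.0, "description" => 2.0, "transcript" => 1.0 }

candidates = TranscriptChunk
  .nearest_neighbors(:embedding, query_embedding, distance: "cosine")
  .first(50)

# neighbor exposes the cosine distance on each result; convert it to a
# similarity and boost by chunk type before picking the LLM context chunks.
top_chunks = candidates.max_by(ENV.fetch("SEARCH_CONTEXT_CHUNKS", "5").to_i) do |chunk|
  (1.0 - chunk.neighbor_distance) * weights.fetch(chunk.chunk_type, 1.0)
end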

Key Components:

  • TranscriptChunk model: Stores text chunks with embeddings, timestamps, and chunk type (title/description/transcript/advertisement)
  • TranscriptSearchService: Performs vector similarity search with weighted re-ranking
  • EmbeddingService: Wraps Python script for embedding generation
  • LlmQueryService: Queries Ollama with context, supports tool calling for iterative search
  • SearchController: Handles search UI and Turbo Frame rendering

Service Architecture

Processing Pipeline:

  • EpisodeProcessingJob: Orchestrates multi-step processing
  • TranscriptChunkingService: Chunks transcripts into segments
  • EmbeddingService: Generates vector embeddings

Core Services:

  • PodcastImportService: Handles initial RSS import
  • PodcastRssSyncService: Refreshes podcasts with new episodes
  • EpisodeAudioDownloadService: Downloads audio files
  • EpisodeTranscriptionService: Orchestrates transcription
  • TranscriptSearchService: Semantic search across chunks
  • LlmQueryService: LLM response generation

Background Jobs:

  • EpisodeProcessingJob: Pipeline orchestrator
  • PodcastRefreshJob: Refreshes a single podcast
  • RefreshAllPodcastsJob: Daily refresh of all podcasts

Development Roadmap

Completed:

  ✅ Basic Rails application structure
  ✅ RSS feed ingestion and episode import
  ✅ Whisper integration for transcription
  ✅ Database schema for podcasts and episodes
  ✅ Background job processing with Solid Queue
  ✅ Web UI for podcast management
  ✅ Daily automatic refresh of feeds
  ✅ Display transcripts in episode UI
  ✅ Vector embeddings for semantic search
  ✅ LLM chat interface with RAG
  ✅ Search and indexing functionality
  ✅ Interactive transcript with audio synchronization
  ✅ HTTP range requests for efficient seeking
  ✅ LLM-based advertisement detection in transcripts

Planned:

  📋 Enhanced search UI (filters, history, saved searches)
  📋 pyannote-audio integration for speaker diarization
  📋 Episode cross-linking and recommendations
  📋 Audio-based ad trimming (fingerprinting approach)
  📋 Docker deployment configuration
  📋 API access layer

Performance Considerations

Storage Requirements

  • Audio: ~50-100MB per hour (if DOWNLOAD_AUDIO=true)
  • Transcripts: ~5-10KB per hour (JSON format)
  • Embeddings: ~1.5KB per chunk × ~60 chunks/hour = ~90KB per hour
  • Total: ~50-100MB per hour with audio, ~100KB without

Processing Time (CPU-based, no GPU)

  • Transcription: ~10-15 minutes per hour of audio
  • Chunking: Seconds
  • Embeddings: ~1-2 seconds per chunk (first run slower due to model loading)
  • Search: Milliseconds for vector search, 2-5 seconds for LLM response

Optimization Tips

  • Use GPU for faster transcription (3-5x speedup)
  • Batch embedding generation for new podcasts
  • Increase SEARCH_CONTEXT_CHUNKS for better context (slower LLM response)
  • Use faster LLM models like llama3.2 (less accurate but 2x faster)

Contributing

This is a personal project in active development. Contributions, ideas, and feedback are welcome!

License

[License TBD]


Self-hosted. Private. Searchable.
