PodCite

PodCite is a real-time podcast analysis tool that transcribes audio and extracts notable context, claims, and references as you listen. It identifies significant statements in the transcript and verifies them against multiple sources. You can also select any transcript segment manually to trigger an in-depth research workflow.

Demo

Link

Features

RSS Feed Processing - Parse podcast feeds and download episodes
Real-time Transcription - OpenAI Whisper-based speech-to-text with timestamps
Automated Context Extraction - AI automatically identifies notable statements and starts research
Multi-Source Research - Verifies claims using arXiv, Brave Search, and Congress.gov
Manual Research - Select any text to trigger custom research

Tech Stack

Frontend: Next.js, TypeScript, Tailwind CSS
Backend: FastAPI, Python 3.11+
AI: LangGraph, xAI API (grok-4-fast), OpenAI (Whisper)
Search APIs: Brave Search, arXiv, Congress.gov

Prerequisites

Python 3.11+
Node.js 18+
uv - Python package manager (install)
API Keys (see Configuration below)

Installation

Backend

cd backend
uv sync

Frontend

cd frontend
npm install

Configuration

Create a .env file in the /backend directory:

# Required
OPENAI_API_KEY=...          # Whisper transcription
XAI_API_KEY=...             # Grok-4-fast for AI workflows
BRAVE_API_KEY=...           # Web search
CONGRESS_API_KEY=...        # Congress.gov API

# Optional
LANGSMITH_API_KEY=...       # LangSmith workflow debugging
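
A missing key is easier to catch at startup than in the middle of a research run. The sketch below checks the four required variables listed above. It assumes they have already been loaded into the process environment (how the backend reads .env is not described here), and the helper name missing_keys is hypothetical, not part of the PodCite codebase.

```python
import os

# Key names come from the .env listing above; the helper itself is hypothetical.
REQUIRED_KEYS = ("OPENAI_API_KEY", "XAI_API_KEY", "BRAVE_API_KEY", "CONGRESS_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are unset or blank in the given mapping."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

Calling missing_keys() with no arguments inspects os.environ; an empty list means every required key is present. LANGSMITH_API_KEY is left out on purpose because it is optional.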

Running the Application

Start both servers in separate terminals:

Backend (port 8000):

cd backend
uv run main.py

Frontend (port 3000):

cd frontend
npm run dev

Usage

Enter RSS Feed URL - Paste any podcast RSS feed URL
Episode Loads - Most recent episode loads automatically with audio
Start Playback - Click play to begin transcription
View Research - Research results appear in the right panel as the podcast plays
Manual Research - Select any transcript text to trigger custom research
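
To illustrate the first two steps, here is a minimal sketch of picking an episode out of a feed using only the Python standard library. It assumes a standard RSS 2.0 feed whose items carry an enclosure element and are listed newest first, which is common but not guaranteed. The function name latest_episode is hypothetical, and PodCite's own feed parsing may differ.

```python
import xml.etree.ElementTree as ET


def latest_episode(feed_xml):
    """Return (title, audio_url) for the first <item> with an enclosure URL, else None."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("url"):
            title = item.findtext("title", default="").strip()
            return title, enclosure.get("url")
    return None
```

Items without an enclosure (for example, text-only announcements) are skipped, so the result always points at something playable.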

How It Works

Transcription Pipeline

Audio is split into 120-second chunks
First 2 chunks transcribe immediately on play
Additional chunks load progressively as you listen
Seeking ahead triggers transcription of that section
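
The chunk arithmetic behind these steps can be sketched as follows. The 120-second chunk length and the two initial chunks come from the description above. The function names, and the choice to always request the current chunk plus the next one, are assumptions made for illustration; the actual prefetch policy is not documented here.

```python
import math

CHUNK_SECONDS = 120   # chunk length stated above
INITIAL_CHUNKS = 2    # chunks transcribed immediately on play


def chunk_index(position_seconds):
    """Map a playback position to the index of the chunk that contains it."""
    return int(position_seconds // CHUNK_SECONDS)


def chunk_bounds(index, duration_seconds):
    """Return (start, end) in seconds for a chunk, clipped to the episode length."""
    start = index * CHUNK_SECONDS
    return start, min(start + CHUNK_SECONDS, duration_seconds)


def chunks_to_request(position_seconds, duration_seconds, done):
    """Chunks needed at this position: the current one and the next, minus finished ones."""
    total = math.ceil(duration_seconds / CHUNK_SECONDS)
    first = chunk_index(position_seconds)
    wanted = range(first, min(first + INITIAL_CHUNKS, total))
    return [i for i in wanted if i not in done]
```

At position 0 this requests chunks 0 and 1, which matches the behavior on play. A seek to a later position maps to a new chunk index through the same function, so seeking and normal playback share one code path.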

Research Workflow

Every 25 seconds of playback, the system analyzes the previous 20-second window
AI extracts notable statements (claims, statistics, references)
Each statement is researched using LangGraph workflows
Results include source verification, confidence scores, and URLs
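
The timing in the first step can be sketched as below: a check for whether an analysis is due, and a helper that collects the transcript text for the window ending at the current position. The 25-second interval and 20-second window come from the description above. The Segment type and both function names are hypothetical, and the extraction and LangGraph research steps are not shown.

```python
from dataclasses import dataclass

ANALYSIS_INTERVAL = 25   # seconds of playback between analyses, as stated above
WINDOW_SECONDS = 20      # length of the transcript window that gets analyzed


@dataclass
class Segment:
    """Hypothetical transcript segment with timestamps in seconds."""
    start: float
    end: float
    text: str


def analysis_due(position, last_analysis):
    """True once a full interval of playback has passed since the last analysis."""
    return position - last_analysis >= ANALYSIS_INTERVAL


def window_text(segments, position):
    """Join the text of segments overlapping the window that ends at `position`."""
    window_start = max(0.0, position - WINDOW_SECONDS)
    hits = [s.text for s in segments if s.end > window_start and s.start < position]
    return " ".join(hits)
```

Segments that only partly overlap the window are included whole, so a sentence that straddles the window boundary is not cut in half before it reaches the extraction step.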

Project Structure

├── backend/
│   ├── app/
│   │   ├── api/v1/endpoints/     # API routes
│   │   ├── services/             # Business logic
│   │   ├── workflows/            # LangGraph AI workflows
│   │   └── main.py               # FastAPI app
│   └── media/                    # Audio & transcription cache
├── frontend/
│   ├── app/                      # Next.js pages
│   ├── components/               # React components
│   └── lib/                      # API client
