rayidali/pdf2video

Paper to Video

Convert research papers into animated explainer videos suitable for 8th graders.

Overview

This application takes a research paper PDF and converts it into an engaging, animated video explanation using:

  • Mistral OCR for PDF to markdown conversion
  • Claude AI for presentation planning and Manim code generation
  • Manim for 3Blue1Brown-style animations
  • ElevenLabs for text-to-speech voiceovers
  • Shotstack for final video composition

Setup

  1. Create a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  2. Install dependencies:
pip install -r requirements.txt
  3. Copy .env.example to .env and fill in your API keys:
cp .env.example .env
  4. Run the server:
uvicorn app.main:app --reload
  5. Open http://localhost:8000 in your browser
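
To confirm the server started, you can also call the GET /health endpoint from a script. The snippet below is a minimal sketch using only the Python standard library. The helper name `is_healthy` and its `opener` parameter are illustrative and not part of this repo, and it assumes the endpoint answers with HTTP 200 when the app is running.

```python
import urllib.error
import urllib.request


def is_healthy(base_url="http://localhost:8000", timeout=5.0,
               opener=urllib.request.urlopen):
    """Return True if GET {base_url}/health answers with HTTP 200.

    `opener` is injectable so the check can be exercised without a server.
    Assumption: a 200 response means the app is up; the response body is ignored.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or a 4xx/5xx response.
        return False
```

Calling `is_healthy()` with no arguments checks the local development server from step 4.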

Deploy to Render

  1. Go to render.com and sign up/login

  2. Click New +, then select Web Service

  3. Connect your GitHub repo

  4. Configure the service:

    • Build Command: pip install -r requirements.txt
    • Start Command: uvicorn app.main:app --host 0.0.0.0 --port $PORT
  5. Add environment variables in the Render dashboard:

    • MISTRAL_API_KEY - Your Mistral API key
    • ANTHROPIC_API_KEY - Your Anthropic API key
  6. Click Create Web Service

Your app will be live at https://your-app-name.onrender.com
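
Missing API keys are an easy thing to overlook when moving from a local .env file to the Render dashboard. The sketch below checks that the two variables named in step 5 are set. The function `missing_keys` is a hypothetical helper, not something app/config.py is known to provide.

```python
import os

# The two variables listed in the Render configuration step above.
REQUIRED_KEYS = ("MISTRAL_API_KEY", "ANTHROPIC_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or blank.

    `env` defaults to the process environment; pass a dict to check
    other sources, such as values parsed from a .env file.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
```

An empty list from `missing_keys()` means both keys are present in the current environment.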

Render advantages over Vercel:

  • Longer timeouts (better for OCR processing)
  • Persistent filesystem during instance lifetime
  • Better suited for Python backends

API Endpoints

  • GET /health - Health check
  • POST /api/upload - Upload a PDF file
  • POST /api/process/{job_id} - Process PDF through OCR
  • GET /api/status/{job_id} - Get job status
  • GET /api/markdown/{job_id} - Get extracted markdown

Usage

  1. Upload a PDF:
curl -X POST -F "file=@paper.pdf" http://localhost:8000/api/upload
  2. Process through OCR:
curl -X POST http://localhost:8000/api/process/{job_id}
  3. Check status:
curl http://localhost:8000/api/status/{job_id}
  4. Get extracted markdown:
curl http://localhost:8000/api/markdown/{job_id}
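
The process and status steps above can also be scripted. The following is a rough sketch using only the standard library. The response format is not documented here, so the JSON body, the `status` field, and the `"completed"` / `"failed"` values are all assumptions to adjust against the real API. The helper names (`endpoint`, `fetch_json`, `wait_for_job`) are illustrative.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"


def endpoint(name, job_id, base_url=BASE_URL):
    """Build the URL for a job endpoint: 'process', 'status', or 'markdown'."""
    return f"{base_url.rstrip('/')}/api/{name}/{job_id}"


def fetch_json(url, method="GET"):
    """Send a request and decode the body as JSON (assumed response format)."""
    request = urllib.request.Request(url, method=method)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_job(job_id, fetch=fetch_json, interval=2.0, max_polls=150,
                 done=("completed",), failed=("failed",)):
    """Poll the status endpoint until the job finishes, fails, or polls run out.

    The 'status' key and the done/failed values are hypothetical; pass
    different tuples if the API reports other state names.
    """
    for _ in range(max_polls):
        payload = fetch(endpoint("status", job_id))
        state = payload.get("status")
        if state in done:
            return payload
        if state in failed:
            raise RuntimeError(f"job {job_id} failed: {payload}")
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

With a job_id taken from the upload response, a run would look like `fetch_json(endpoint("process", job_id), method="POST")` followed by `wait_for_job(job_id)`. Fetching the markdown is left to curl here, since it is not stated whether that endpoint returns JSON or plain text.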

Project Structure

paper-to-video/
├── app/
│   ├── __init__.py
│   ├── main.py                 # FastAPI application
│   ├── config.py               # Environment variables
│   ├── models/
│   │   └── schemas.py          # Pydantic models
│   ├── services/
│   │   └── ocr_service.py      # Mistral OCR
│   ├── agents/                 # (Phase 2+)
│   ├── prompts/                # (Phase 2+)
│   └── utils/
├── static/
│   ├── index.html              # Frontend UI
│   ├── style.css               # Styles
│   └── app.js                  # Frontend JavaScript
├── tests/
│   └── test_ocr.py
├── render.yaml                 # Render deployment config
├── requirements.txt
├── .env.example
└── .gitignore

License

MIT License

About

AI workflow that takes a PDF research paper and outputs a video presentation explaining it.
