OBookLLM is a powerful, self-hosted, offline-capable alternative to NotebookLM. It lets you upload documents (PDF, DOCX, text, Markdown), audio files, and even YouTube URLs to create interactive "notebooks." You can then chat with your sources using Retrieval-Augmented Generation (RAG) powered by local LLMs (Ollama) or cloud providers (OpenAI, Anthropic, Gemini).
- Multi-Modal Ingestion: Support for PDF, DOCX, TXT, MD, JSON, CSV, Excel, XML, YAML, HTML, and Source Code.
- Audio Transcription: Built-in GPU-accelerated transcription for audio files (MP3, WAV, etc.) and automatic YouTube video processing using faster-whisper (see the sketch after this list).
- OCR Capabilities: Automatically extracts text from scanned PDFs and images using Tesseract.
- Interactive Chat: Context-aware chat with your documents using RAG.
- Source Citations: Responses include citations linking back to the exact segment in the source text or audio transcript.
- Podcast Generation: (Coming Soon) Generate conversational audio summaries of your notebooks.
- Local First: Designed to run 100% locally with Ollama and Docker, ensuring privacy.
- Flexible AI Providers: Switch between Ollama, OpenAI, Anthropic, and Google Gemini for chat and embeddings.
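The transcription feature is built on faster-whisper. As a rough illustration (not OBookLLM's actual code), a minimal sketch looks like the following; the model size, device, and compute type are assumptions you would tune to your hardware.

```python
from faster_whisper import WhisperModel

# Assumption: "base" model on CPU with int8 quantization; on an NVIDIA GPU
# you would typically use device="cuda", compute_type="float16".
model = WhisperModel("base", device="cpu", compute_type="int8")

# transcribe() returns a lazy generator of segments plus audio metadata.
segments, info = model.transcribe("lecture.mp3", beam_size=5)

print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for segment in segments:
    # Each segment carries start/end timestamps, which is what lets a
    # response cite the exact place in an audio transcript.
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```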
- Frontend: SvelteKit, TailwindCSS, TypeScript (Bun runtime).
- Backend: FastAPI, Python 3.10.
- AI/ML (a minimal example of how these pieces fit together follows this list):
  - LLM Orchestration: LangChain.
  - Vector DB: ChromaDB.
  - Transcription: Faster Whisper (CTranslate2).
  - Local Inference: Ollama.
- Infrastructure: Docker & Docker Compose.
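To show how the AI/ML pieces fit together, here is a hedged, minimal RAG sketch using LangChain, ChromaDB, and Ollama. The model names, chunk sizes, and prompt are illustrative assumptions rather than OBookLLM's actual pipeline, and LangChain import paths vary between versions.

```python
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Assumption: a local Ollama server with these models already pulled.
embeddings = OllamaEmbeddings(model="nomic-embed-text")
llm = ChatOllama(model="llama3")

# 1. Split a source document into chunks and index them in Chroma.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(open("notes.md").read())
store = Chroma.from_texts(chunks, embeddings)

# 2. Retrieve the chunks most relevant to the user's question.
question = "What are the key findings?"
docs = store.similarity_search(question, k=4)
context = "\n\n".join(d.page_content for d in docs)

# 3. Ask the LLM to answer grounded only in the retrieved context.
answer = llm.invoke(
    f"Answer using only this context:\n{context}\n\nQuestion: {question}"
)
print(answer.content)
```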
Since OBookLLM can run transcription and LLM inference locally, hardware requirements depend on how you use it (a quick check script follows the lists below):
- Minimum:
  - CPU: AVX2 support (modern Intel/AMD CPUs)
  - RAM: 8GB (enough for small quantized models, e.g., Llama 3 8B q4_0)
  - GPU: None (CPU-only inference is slower but functional)
  - Storage: 10GB free space
- Recommended (for best performance):
  - RAM: 16GB+
  - GPU: NVIDIA GPU with 8GB+ VRAM (for fast Whisper transcription and LLM inference)
  - Storage: SSD
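If you are unsure whether a Linux machine meets the minimum, a quick check like this sketch can help; it relies on standard Linux facilities (/proc/cpuinfo and, if present, nvidia-smi) and is not part of OBookLLM itself.

```python
import shutil
import subprocess

# AVX2 check: on Linux, supported CPU flags are listed in /proc/cpuinfo.
with open("/proc/cpuinfo") as f:
    print("AVX2 support:", "avx2" in f.read())

# GPU check: nvidia-smi is available when an NVIDIA driver is installed.
if shutil.which("nvidia-smi"):
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    print("GPU(s):", out.stdout.strip() or "none detected")
else:
    print("No NVIDIA driver detected (CPU-only inference).")
```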
The easiest way to run OBookLLM is with Docker Compose. This starts the Frontend, Backend, Database, Vector DB, and Ollama services.
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/OBookLLM.git
  cd OBookLLM
  ```

- Run with Docker Compose (a GPU-enabled Compose excerpt follows these steps):

  ```bash
  # Ensure you have the NVIDIA Container Toolkit installed if you want GPU support
  docker-compose up --build -d
  ```

- Access the app:

  - Frontend: http://localhost:3000
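For GPU support inside containers, the Ollama/transcription services need access to NVIDIA devices. As a hedged example (the service names and layout of the project's docker-compose.yml may differ), a Compose service can request GPUs like this:

```yaml
# Excerpt only; requires the NVIDIA Container Toolkit on the host.
services:
  ollama:
    image: ollama/ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```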
- Frontend Documentation
- Backend Documentation
- Hardware Requirements
- Manual Installation
- Configuration
- Roadmap
- Basic RAG (PDF, Text, Markdown)
- Audio Transcription (Faster Whisper)
- YouTube Integration
- OCR Support (Tesseract)
- Docker support
- Podcast Generation (Conversational Audio)
- Web Search Agent (Search Integration)
- Navigate to backend/:

  ```bash
  cd backend
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate
  ```

- Install dependencies:

  ```bash
  # Requires system deps: ffmpeg, tesseract-ocr, poppler-utils
  pip install -r requirements.txt
  ```

- Run the server (a quick smoke test follows these steps):

  ```bash
  python -m src.main
  ```
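As a quick smoke test (assuming the backend listens on port 8008, matching the frontend's default PUBLIC_BACKEND_URL, and that FastAPI's auto-generated /docs page is enabled), you can confirm the API is reachable:

```python
from urllib.request import urlopen

# Hypothetical check: FastAPI serves interactive docs at /docs by default;
# port 8008 matches the frontend's default PUBLIC_BACKEND_URL.
with urlopen("http://localhost:8008/docs") as resp:
    print("Backend reachable:", resp.status == 200)
```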
- Navigate to frontend/:

  ```bash
  cd frontend
  ```

- Install dependencies:

  ```bash
  bun install
  ```

- Run the dev server:

  ```bash
  bun run dev
  ```
Create a .env file in the backend/ directory:
```env
MONGODB_URI=mongodb://localhost:27017
CHROMA_HOST=localhost
CHROMA_PORT=8000
OLLAMA_HOST=http://localhost:11434
```

Create a .env file in the frontend/ directory:

```env
AUTH_SECRET=your-secure-secret-key-here
AUTH_URL=http://localhost:3000
MONGODB_URI=mongodb://localhost:27017
PUBLIC_BACKEND_URL=http://localhost:8008
PORT=3000
```

Note: For production deployments, ensure AUTH_SECRET is a secure random string. You can generate one with:

```bash
openssl rand -base64 32
```
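To illustrate how the backend might consume these variables, here is a minimal sketch using python-dotenv and os.environ; OBookLLM's actual settings mechanism may differ (it could use pydantic-settings, for example).

```python
import os
from dotenv import load_dotenv

# Load backend/.env into the process environment (no-op if the file is absent).
load_dotenv()

# Fall back to the defaults shown above if a variable is not set.
MONGODB_URI = os.environ.get("MONGODB_URI", "mongodb://localhost:27017")
CHROMA_HOST = os.environ.get("CHROMA_HOST", "localhost")
CHROMA_PORT = int(os.environ.get("CHROMA_PORT", "8000"))
OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "http://localhost:11434")

print(f"Chroma at {CHROMA_HOST}:{CHROMA_PORT}, Ollama at {OLLAMA_HOST}")
```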
Contributions are welcome! Please feel free to submit a Pull Request.