PDF Dashboard With MCP

Upload PDFs, extract their text with PyMuPDF (falling back to GLM-OCR through Ollama for scanned pages), and chat with the document using a local LLM. Everything runs on your machine through Ollama — no API keys or internet connection required.

Features

  • PDF extraction — fast text-layer extraction via PyMuPDF; automatic GLM-OCR fallback for scanned/image-based PDFs
  • Per-document RAG — each uploaded PDF gets its own Chroma vector collection
  • Local LLM chat — agentic Q&A with inline citations powered by Ollama; pick any installed Ollama model from the dropdown
  • Markdown viewer — browse extracted text, preview chunks, and download the markdown
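
The PyMuPDF-to-OCR fallback boils down to a per-page decision: pages whose text layer is empty or nearly empty get routed to GLM-OCR. A minimal sketch of that routing, with a hypothetical character threshold (the real extract.py may use different heuristics):

```python
# Sketch of the text-layer-vs-OCR routing decision. The threshold is
# hypothetical; the project's extract.py may decide differently.
MIN_CHARS_PER_PAGE = 20  # below this, treat the page as scanned

def choose_extractor(page_texts):
    """For each page's text-layer output, return 'text' or 'ocr'."""
    routes = []
    for text in page_texts:
        if len(text.strip()) >= MIN_CHARS_PER_PAGE:
            routes.append("text")  # usable text layer: keep PyMuPDF output
        else:
            routes.append("ocr")   # empty/near-empty: send page to GLM-OCR
    return routes
```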

Prerequisites

  • Python 3.11+
  • uv — Python package manager
  • Ollama — local LLM runtime

Setup

1. Clone the repository

git clone https://github.com/dakshp26/PDFDashboardWithMCP.git
cd PDFDashboardWithMCP

2. Install dependencies

uv sync

3. Pull Ollama models

ollama pull qwen2.5:3b       # chat / agent (or any other chat model)
ollama pull nomic-embed-text # embeddings
ollama pull glm-ocr          # OCR fallback (scanned PDFs)

4. Run the app

uv run streamlit run app/main.py

Open http://localhost:8501 in your browser.

Usage

  1. Upload PDF — go to the Upload PDF page, select a PDF, and wait for the extraction pipeline to finish
  2. Chat — switch to the Chat page, pick your PDF and any installed Ollama model from the dropdowns, and ask questions

Project Structure

app/
├── main.py                       # Entry point, page navigation
├── app_pages/
│   ├── landing.py                # Home / welcome page
│   ├── process_pdf_upload.py     # Upload + pipeline UI
│   ├── pdf_library.py            # Browse uploaded PDFs (read-only viewer)
│   └── process_pdf.py            # Viewer + chat UI
└── process_pdf/
    ├── extract.py                 # PDF → Markdown (pymupdf4llm + GLM-OCR)
    ├── pipeline.py                # Extraction pipeline with live progress
    ├── rag.py                     # Chunking, embeddings, Chroma persistence
    └── agent.py                   # LangChain agent with retriever tool
mcp_server/
└── server.py                     # MCP server (list_documents, get_document)
data/                             # Runtime data (gitignored)
├── process_pdf/                  # Saved PDFs and extracted markdown
└── process_chroma/               # Chroma vector collections (one per PDF)
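
The chunking step in rag.py can be approximated as fixed-size character windows with overlap, so that a sentence cut at a boundary still appears whole in the neighbouring chunk. A dependency-free sketch (window sizes are illustrative, not the project's actual parameters):

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split extracted markdown into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk is then embedded with nomic-embed-text and written to the PDF's Chroma collection.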

Note

For a detailed breakdown of every file, execution order, and data flow, see APP_STRUCTURE.md.

Streamlit Dashboard

The app is a four-page Streamlit dashboard:

  • Home — welcome page with a quick-start overview
  • Upload PDF — select a PDF, watch the extraction pipeline run in real time (text layer → OCR fallback → chunking → embedding), then download the extracted markdown
  • PDF Library — browse all previously uploaded PDFs; view extracted markdown and chunk previews without re-running the pipeline
  • Chat — pick an indexed PDF and any installed Ollama model, ask questions, and get answers with inline source citations

The pipeline progress is shown live inside an st.status block. After a PDF is processed, its vector collection persists in data/process_chroma/, so the next session loads instantly without re-running the pipeline.
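
At query time the persisted collection performs standard nearest-neighbour search over the chunk embeddings. A dependency-free sketch of the top-k cosine retrieval that Chroma does internally (in the app, the vectors come from nomic-embed-text rather than these toy values):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, chunk_vecs, k=2):
    """Indices of the k chunks most similar to the query embedding."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

The agent in agent.py exposes this lookup to the LLM as a retriever tool and cites the returned chunks inline.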

MCP Server

The included MCP server exposes the vector store to any MCP-compatible client (Claude Desktop, Cursor, etc.) with two tools:

  • list_documents — returns all indexed document collections
  • get_document(document, query) — searches a collection and returns relevant chunks

Claude Desktop

Add to claude_desktop_config.json (usually %APPDATA%\Claude\claude_desktop_config.json on Windows), or use .mcp.json in the project root to keep it project-scoped:

{
  "mcpServers": {
    "PDFDashboardWithMCP": {
      "command": "uv",
      "args": ["run", "--directory", "/absolute/path/to/PDFDashboardWithMCP", "mcp_server/server.py"]
    }
  }
}

Cursor

Add to .cursor/mcp.json in your project root (or the global ~/.cursor/mcp.json):

{
  "mcpServers": {
    "PDFDashboardWithMCP": {
      "command": "uv",
      "args": ["run", "--directory", "/absolute/path/to/PDFDashboardWithMCP", "mcp_server/server.py"]
    }
  }
}

Claude Code

The recommended approach is a project-scoped .mcp.json in the repository root so the server is only active for this project and doesn't pollute your global config:

{
  "mcpServers": {
    "PDFDashboardWithMCP": {
      "command": "uv",
      "args": ["run", "--directory", "/absolute/path/to/PDFDashboardWithMCP", "mcp_server/server.py"]
    }
  }
}

Claude Code picks up .mcp.json automatically when you open the project. No extra setup needed.

Replace /absolute/path/to/PDFDashboardWithMCP with the absolute path to your cloned repository.

Ollama must be running with nomic-embed-text pulled for the MCP server to load collections.

Tech Stack

  • UI — Streamlit
  • PDF extraction — langchain-pymupdf4llm, PyMuPDF
  • OCR fallback — Ollama glm-ocr
  • Embeddings — Ollama nomic-embed-text
  • Vector store — Chroma (langchain-chroma)
  • LLM / agent — any Ollama chat model (e.g. qwen2.5:3b), via LangChain
  • Package manager — uv
  • MCP server — mcp[cli]
