Grounded, confidence-gated IT helpdesk chatbot built with a production-style Retrieval-Augmented Generation (RAG) pipeline. Upload IT PDFs (network, VPN, SSL, Linux/cloud guides, device manuals, etc.) and ask questions; answers cite their sources, and the system refuses safely when the evidence is insufficient.
- Not deployed yet. Run locally using the steps below.
- 📄 PDF Ingestion (digital text PDFs)
- 🖼️ Scanned / Image PDF Support
  - Auto-detects `text_pdf`, `image_pdf`, and `mixed_pdf`
  - Uses OCR for image pages (local Tesseract; optional Vision fallback)
- 🔎 Hybrid Retrieval
  - Dense embeddings + sparse BM25 vectors for stronger candidate recall
- ⚖️ Evidence & Confidence Gates
  - Retrieval quality checks and confidence-based refusal to reduce hallucinations
- 🧠 Reranking + Fallbacks
  - Cohere reranking when available, with local fallback on failures
- 🔄 Provider Failover
  - Automatic LLM provider rotation across configured keys
- 📡 Streaming APIs (SSE)
  - `/query/stream` and `/chat/stream` emit `delta` events, then `final`
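On the client side, the `delta`/`final` event pattern can be consumed with a small SSE parser. The sketch below is illustrative: the event names come from this README, but the exact JSON payload fields (`text`, `answer`, `confidence`) are assumptions; see `docs/reference/MANUAL.md` for the authoritative shapes.

```python
import json

def parse_sse(raw: str):
    """Parse a raw SSE stream into (event, payload) pairs.

    Assumes the server tags messages with `event:` lines and sends JSON
    in `data:` lines, matching the `delta` / `final` events above.
    """
    events = []
    event, data_lines = None, []
    for line in raw.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and event:
            # Blank line terminates one SSE message.
            events.append((event, json.loads("\n".join(data_lines))))
            event, data_lines = None, []
    return events

# Hypothetical stream: two partial deltas, then a final summary event.
raw = (
    'event: delta\ndata: {"text": "Restart the "}\n\n'
    'event: delta\ndata: {"text": "VPN client."}\n\n'
    'event: final\ndata: {"answer": "Restart the VPN client.", "confidence": 0.82}\n\n'
)
answer = "".join(d["text"] for e, d in parse_sse(raw) if e == "delta")
```

In a real client the same parser would be fed chunks from a streaming HTTP response instead of a string.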
| Layer | Technology |
|---|---|
| Backend | Python 3.11+, FastAPI |
| Ingestion / Parsing | PyMuPDF + OCR (Tesseract; optional Vision) |
| Text Chunking | langchain-text-splitters |
| Embeddings | OpenAI / OpenRouter / Cohere (configurable) |
| Vector DB | Qdrant or Milvus |
| Relational DB | Postgres or MySQL |
| Retrieval | Dense + BM25 sparse hybrid search |
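One common way to combine dense and sparse result lists, shown here only as an illustration (the engine's actual fusion strategy is an internal detail), is reciprocal rank fusion:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked candidate lists (e.g. dense + BM25 results).

    Each document scores 1/(k + rank + 1) per list it appears in;
    k=60 is the conventional default from the RRF literature.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical chunk IDs from the two retrievers.
dense = ["chunk_7", "chunk_2", "chunk_9"]
sparse = ["chunk_7", "chunk_4"]
fused = reciprocal_rank_fusion([dense, sparse])
```

A chunk ranked highly by both retrievers (here `chunk_7`) rises to the top even when neither list alone is decisive.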
This repo includes a full reference manual under `docs/`:

- `docs/reference/MANUAL.md` (implementation-level behavior, accuracy/grounding strategy, API contracts)
- `docs/reference/sampleqna.md` (sample Q&A content used for testing)
- `docs/reference/helpdesk_flow.png` (flow image)
- `docs/diagram-v4.gif` (end-to-end architecture animation)
- `docs/diagram-v4.html` (source for the architecture diagram)
```bash
git clone https://github.com/Jayesh12356/RAG_Engine.git
cd RAG_Engine
```

Copy env:

```bash
cp .env.example .env
```

If you want OCR for scanned/image PDFs:
- Ensure Tesseract is installed on your machine
- Configure these env vars (see `.env.example`):
  - `OCR_ENABLED`
  - `OCR_MODE` (`tesseract` | `vision` | `hybrid`)
  - `OCR_LANGUAGES`
  - `OCR_RENDER_DPI`
  - `OCR_TEXT_CONFIDENCE_THRESHOLD`
  - `OCR_VISION_FALLBACK_ENABLED` (default `false`; set `true` to enable Vision fallback)
  - `TESSERACT_CMD` (Windows full path if needed)
```bash
make up
make install
make init
```

Backend:

```bash
make dev-backend
```

Frontend:

```bash
make dev-frontend
```

Open: http://localhost:3000
Use the UI to upload PDFs. For OCR testing, your scanned assets live in:
- `image_pdfs/` (image/scanned PDFs)
- Keep the frontend deployment connected to the `helpdesk-ui` source repository.
- Set the frontend env var: `NEXT_PUBLIC_API_URL=https://<your-render-backend>.onrender.com`
- Do not leave this env var empty; otherwise the frontend falls back to localhost.
Use this as Render start command:
```bash
python scripts/bootstrap_start.py
```

What this one command does on each deploy:

- Initializes the relational schema idempotently (`create_all`).
- Ensures the vector collection exists in Qdrant.
- Starts FastAPI/Uvicorn.
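The key property of this bootstrap is idempotency: re-running it on every deploy must never fail on existing tables. The real script uses SQLAlchemy's `create_all`; the stdlib sketch below demonstrates the same pattern with sqlite's `CREATE TABLE IF NOT EXISTS` (the `documents` table name is illustrative, not the app's actual schema).

```python
import sqlite3

def init_schema(conn):
    """Idempotent schema bootstrap: safe to run on every deploy,
    mirroring the `create_all` behavior described above."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS documents (
               id INTEGER PRIMARY KEY,
               pdf_name TEXT NOT NULL
           )"""
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
init_schema(conn)
init_schema(conn)  # second run is a no-op, not an error
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
```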
```env
RELATIONAL_DB=postgres
DATABASE_URL=postgres://<user>:<password>@<host>:5432/<db>
DB_SCHEMA=helpdesk_chatbot
VECTOR_DB=qdrant
QDRANT_URL=https://<cluster-id>.<region>.aws.cloud.qdrant.io:6333
QDRANT_API_KEY=<qdrant-cloud-api-key>
QDRANT_COLLECTION=helpdesk_chunks
CORS_ALLOW_ORIGINS=https://<your-vercel-app>.vercel.app,http://localhost:3000
```

Notes:

- `DATABASE_URL` is preferred in production; `postgres://` is auto-normalized to the async SQLAlchemy format.
- `DB_SCHEMA` isolates tables per app when sharing one Postgres instance; use a unique schema per project.
- The Qdrant API key is optional locally, but required for most Qdrant Cloud projects.
- Seed data ingestion is intentionally skipped in production startup (only schema + collection bootstrap).
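The `postgres://` normalization mentioned above can be sketched in a few lines. The target driver prefix (`postgresql+asyncpg://`) is an assumption here; adjust it to whatever async driver the app actually configures.

```python
def normalize_database_url(url: str) -> str:
    """Normalize a postgres:// URL to an async SQLAlchemy driver URL.

    Hosting providers often hand out `postgres://` URLs, which modern
    SQLAlchemy rejects; this rewrites only the scheme, once.
    """
    for prefix in ("postgres://", "postgresql://"):
        if url.startswith(prefix):
            return url.replace(prefix, "postgresql+asyncpg://", 1)
    return url  # non-Postgres URLs pass through unchanged

normalized = normalize_database_url(
    "postgres://user:pw@db.example.com:5432/helpdesk"
)
```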
- `GET /health`
- `POST /ingest` (PDF upload and ingestion)
- `GET /pdfs/{pdf_name}` (serve original PDFs for source links)
- `GET /documents` + `GET /documents/{document_id}/chunks` + `DELETE /documents/{document_id}`
- `POST /query` and `POST /query/stream`
- `POST /chat` and `POST /chat/stream`
For the full operational contract and behavior details, see docs/reference/MANUAL.md.
```text
helpdesk-ui/        # Frontend (Next.js)
app/                # FastAPI backend
  api/              # Web boundaries + routes (incl. SSE streaming)
  db/               # Vector + relational adapters
  ingestion/        # PDF parsing, OCR, chunking, ingestion pipeline
  llm/              # Provider routing for completion + embeddings
  query/            # Hybrid retrieval + reranking + RAG generation
  chat/             # Session + chat orchestration
data/
  sample_pdfs/      # Demo PDFs
  image_pdfs/       # OCR testing PDFs
docs/               # Architecture diagrams + reference manual
tests/              # Unit + integration + e2e tests
scripts/            # Init / seed helpers
```
MIT (add/adjust if you publish under a different license).
RAG Engine is an IT helpdesk assistant built on a production-grade Retrieval-Augmented Generation (RAG) stack.
It ingests your IT PDFs (VPN, network, SSL, Linux, cloud, device guides, etc.), stores them in a hybrid vector + relational backend, and answers user questions with:
- Grounded responses strictly based on your documentation.
- Confidence scores and cited sources (page-level PDF links).
- Safe refusals when evidence is weak instead of hallucinating.
The system is designed for enterprise environments with configurable providers, databases, and routing behavior.
```text
PDF → Ingestion Pipeline → Vector DB (Qdrant/Milvus) + Relational DB (Postgres/MySQL)
User Question → Query/Chat Pipeline → Router → Hybrid Search → Reranker → RAG Generator → Response
```
- Ingestion: parses PDFs, chunks text, builds dense + sparse vectors, and persists metadata.
- Query: routes requests, retrieves relevant chunks, reranks, and generates answers with grounding + confidence gating.
- Chat: maintains short-term session history on top of the same retrieval pipeline.
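The chunking step in the ingestion pipeline uses `langchain-text-splitters`; as a rough stdlib-only sketch, overlapping fixed-size chunking looks like this (the `chunk_size`/`overlap` values are illustrative, not the app's configured defaults):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50):
    """Split text into overlapping character windows.

    A simplified stand-in for the recursive splitter the pipeline uses;
    overlap preserves context that straddles a chunk boundary.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks, start = [], 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

chunks = chunk_text("a" * 1200, chunk_size=500, overlap=50)
```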
For a deeper, implementation-level explanation, see docs/reference/MANUAL.md.
- Hybrid retrieval: dense embeddings + BM25-style sparse vectors for robust matching.
- Provider failover: Groq, OpenAI, and OpenRouter with automatic fallback for both generation and embeddings.
- Cohere reranking: high-quality reranking when available, with local relevance fallback.
- Strict grounding: prompts and pipelines tuned to prefer “I don’t know” over hallucinations.
- Streaming APIs: Server-Sent Events (SSE) endpoints for both Query and Chat.
- Scanned/Image PDF support:
  - Auto-detects `text_pdf`, `image_pdf`, and `mixed_pdf`.
  - Uses PyMuPDF for digital text.
  - Uses OCR (Tesseract + optional Vision fallback) for image-only pages.
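Provider failover, in the abstract, means trying each configured provider in order and rotating on failure. The sketch below is a simplified illustration: the provider names match this README, but the callables stand in for real Groq/OpenAI/OpenRouter clients.

```python
class AllProvidersFailed(Exception):
    """Raised when no configured provider could serve the request."""

def complete_with_failover(prompt, providers):
    """Try each provider in order; record the error and move to the next.

    `providers` maps provider name -> callable. Real API clients would
    go here; the shape is illustrative.
    """
    errors = {}
    for name, call in providers.items():
        try:
            return name, call(prompt)
        except Exception as exc:  # sketch: catch broadly, record, rotate
            errors[name] = exc
    raise AllProvidersFailed(errors)

# Hypothetical setup: the first provider is rate-limited, the second works.
providers = {
    "groq": lambda p: (_ for _ in ()).throw(TimeoutError("rate limited")),
    "openai": lambda p: f"answer to: {p}",
}
used, answer = complete_with_failover("reset VPN?", providers)
```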
- Backend
  - Python 3.11+
  - Docker + Docker Compose (for local DBs and vector stores)
- Frontend
  - Node.js 18+
- API keys
  - At least one LLM provider key (Groq recommended – generous free tier)
  - Cohere API key (for reranking)
- Configure environment
  - Copy the example file: `cp .env.example .env`
  - Fill in at least:
    - `GROQ_API_KEY`
    - `COHERE_API_KEY`
    - DB URLs (or keep defaults for local Docker)
- Start infrastructure: `make up` starts Qdrant/Milvus + Postgres/MySQL (depending on config).
- Install dependencies: `make install` installs Python and frontend dependencies.
- Initialize databases: `make init` creates relational tables + vector collections.
- Run dev environment
  - Terminal 1: `make dev-backend`
  - Terminal 2: `make dev-frontend`
  - Open http://localhost:3000
- Ingest PDFs
  - Go to the Documents/ingestion screen and upload PDFs from `data/sample_pdfs/` or your own docs.
- Query and Chat
  - Use the Query and Chat UIs to ask IT questions and inspect sourced answers.
Some IT manuals are scanned images rather than digital text. The ingestion pipeline handles these transparently:
- Detection
  - Inspects page text density with PyMuPDF.
  - Classifies documents as:
    - `text_pdf` – normal digital text.
    - `image_pdf` – predominantly scanned/image-based.
    - `mixed_pdf` – combination, routed page-by-page.
- OCR pipeline
  - Primary: local Tesseract OCR (multilingual).
  - Optional: Vision-based OCR fallback for low-confidence pages.
  - Returns the same `ParsedPage` model used by the normal parser, so chunking and embeddings remain unchanged.
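The detection step boils down to counting extractable text per page. As a hedged sketch (the real threshold and per-page counting via PyMuPDF are implementation details), classification could look like:

```python
def classify_pdf(page_char_counts, min_chars=50):
    """Classify a PDF from per-page extractable character counts.

    A page with at least `min_chars` of extractable text counts as a
    text page; the 50-character threshold is illustrative only.
    """
    text_pages = sum(1 for n in page_char_counts if n >= min_chars)
    if text_pages == len(page_char_counts):
        return "text_pdf"
    if text_pages == 0:
        return "image_pdf"
    return "mixed_pdf"  # routed page-by-page, as described above
```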
OCR-related env vars (see `.env.example` for defaults):

- `OCR_ENABLED` – master switch for OCR.
- `OCR_LANGUAGES` – e.g. `eng+hin`.
- `OCR_RENDER_DPI` – page render DPI for OCR (e.g. `300`).
- `OCR_TEXT_CONFIDENCE_THRESHOLD` – triggers Vision fallback below this score.
- `OCR_VISION_FALLBACK_ENABLED` – `false` by default; set `true` to enable Vision fallback.
- `OCR_VISION_MODEL` – e.g. `gpt-4o-mini`.
- `TESSERACT_CMD` – full path to `tesseract` on Windows if not on `PATH`.
Behavior:
- With `OCR_VISION_FALLBACK_ENABLED=false` (default), only Tesseract is used.
- With `true`, Vision OCR is used only for pages with low-confidence OCR results.
- If enabled but no valid Vision API key exists, ingestion automatically falls back to Tesseract-only without failing.
Place sample scanned PDFs under `image_pdfs/` and ingest them through `/ingest` or the UI like any other document.
- Health
  - `GET /health`
- Ingestion & documents
  - `POST /ingest`
  - `GET /documents`
  - `GET /documents/{document_id}/chunks`
  - `DELETE /documents/{document_id}`
  - `GET /pdfs/{pdf_name}` – serves original PDFs for source links.
- Query
  - `POST /query` – standard request/response.
  - `POST /query/stream` – SSE streaming (`delta` + `final` events).
- Chat
  - `POST /chat`
  - `POST /chat/stream` – SSE streaming.
  - `GET /chat/sessions`
  - `GET /chat/{session_id}/history`
  - `DELETE /chat/{session_id}`
Details for each endpoint, payload shape, and behavior are documented in docs/reference/MANUAL.md.
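As a minimal client sketch, a `POST /query` request can be built with the stdlib. The backend port (8000) and the `question` field name are assumptions here; the authoritative payload shape is in `docs/reference/MANUAL.md`.

```python
import json
from urllib import request

API_URL = "http://localhost:8000"  # assumption: default local backend port

# Hypothetical payload; field names are illustrative.
payload = {"question": "How do I renew an SSL certificate?"}
req = request.Request(
    f"{API_URL}/query",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# response = request.urlopen(req)  # uncomment with the backend running
```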
Most behavior is driven by `.env` and `app/config.py`:

- LLM
  - `LLM_PROVIDER` ∈ {groq, openrouter, openai}
  - Provider-specific models and timeouts.
- Embeddings
  - `EMBEDDING_PROVIDER` ∈ {openai, openrouter, cohere}
- Vector / relational DB
  - `VECTOR_DB` ∈ {qdrant, milvus}
  - `RELATIONAL_DB` ∈ {postgres, mysql}
- Retrieval & confidence
  - `MAX_CHUNKS_RETURN`, `RERANK_TOP_N`, `CONFIDENCE_THRESHOLD`, and related relevance thresholds.
- Chat/session
  - `CHAT_HISTORY_TURNS`, `MAX_SESSIONS`.
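The confidence gate driven by `CONFIDENCE_THRESHOLD` amounts to refusing rather than guessing. A minimal sketch, assuming a 0.55 threshold and a fixed refusal string (both illustrative, not the app's actual values):

```python
def gate_answer(draft_answer, confidence, threshold=0.55):
    """Confidence-gated response: refuse instead of hallucinating.

    Below the threshold the draft answer is discarded and a safe
    refusal is returned; the app reads the real threshold from
    CONFIDENCE_THRESHOLD in `.env`.
    """
    if confidence < threshold:
        return "I don't have enough evidence in the documentation to answer that."
    return draft_answer
```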
```text
helpdesk-ui/              # Next.js frontend (query + chat UI, streaming clients)
app/
  api/                    # FastAPI routes and SSE endpoints
  chat/                   # Chat pipeline and session handling
  config.py               # Centralized settings (Pydantic-based)
  db/                     # Relational + vector store integrations
  ingestion/              # PDF parsing, OCR, chunking, and ingestion pipeline
  llm/                    # LLM + embeddings client with provider failover
  models/                 # Pydantic models / schemas
  query/                  # Hybrid search, reranking, RAG generation, pipelines
  main.py                 # FastAPI application entrypoint
docs/
  reference/MANUAL.md     # Detailed system manual and API behavior
tests/
  unit/                   # Isolated unit tests
  integration/            # API and boundary tests
  e2e/                    # End-to-end workflow tests
scripts/
  init_db.py              # Relational DB initialization
  init_vector_db.py       # Vector DB initialization
  seed_demo.py            # Demo PDF ingestion helper
data/
  sample_pdfs/            # Example IT PDFs for demo and testing
  image_pdfs/             # Scanned/Image PDFs for OCR testing
```
This project is intended as an internal/reference implementation for IT helpdesk RAG systems.
Add your chosen license here before public or commercial deployment.