edxd1250/ResearchOpsCopilot

Research Ops Copilot (Local‑First RAG)

A production‑style, local‑first RAG MVP with SQL metadata + vector store, page‑level citations, observability, and a small evaluation harness.

Setup (macOS)

  1. Start Postgres
docker compose up -d
  2. Create tables (once)
psql postgresql://rocp:rocp_pw@localhost:5432/rocp_db -f db/schema.sql
  3. Create a Python env + install deps
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Optional: set an OpenAI API key for embeddings and generation

export OPENAI_API_KEY=YOUR_KEY

Ingest documents

python -m ingestion.ingest --paths SampleData/pdfs SampleData/docs

Query (CLI)

python -m rag.query --query "What is this document about?"

Evaluation

python -m evaluation.run_eval --eval_file evaluation/examples/eval_demo.jsonl

Notes on Design

  • SQL for metadata & traceability: documents, ingestion_runs, and chunks provide a durable audit trail and idempotency.
  • Vector DB for retrieval: Chroma persists embeddings locally and stores chunk‑level metadata for filtering and citations.
  • Idempotency: documents are de‑duplicated by uri + file_hash/text_hash; re‑ingest skips unchanged files.
  • Grounded answers: output includes citations with doc title, uri, page number, and chunk id.
  • Confidence threshold: if the top retrieval score is below the threshold, the system refuses with “I don’t know.”
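
The idempotency and confidence-threshold notes above can be illustrated with a minimal, self-contained sketch. It is not the repository's code: the names `should_ingest`, `Hit`, and `answer_or_refuse`, the in-memory `seen` dict (standing in for the SQL `documents` table), the SHA-256 hash, the default threshold of 0.35, and the assumption that a higher score means a better match are all hypothetical choices made for illustration.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


def content_hash(data: bytes) -> str:
    """Hex digest of the raw bytes; a stand-in for file_hash (SHA-256 is assumed)."""
    return hashlib.sha256(data).hexdigest()


def should_ingest(seen: dict[str, str], uri: str, data: bytes) -> bool:
    """Return True when the uri is new or its content changed, and record the hash.

    `seen` maps uri -> last ingested hash. It is a hypothetical in-memory
    substitute for a lookup against the documents table.
    """
    digest = content_hash(data)
    if seen.get(uri) == digest:
        return False  # unchanged file: skip re-ingest
    seen[uri] = digest
    return True


@dataclass
class Hit:
    """One retrieved chunk with the metadata needed for a citation."""
    chunk_id: str
    title: str
    uri: str
    page: int
    score: float  # assumption: higher means more similar
    text: str


def answer_or_refuse(hits: list[Hit], threshold: float = 0.35) -> dict:
    """Refuse when the best score is below the threshold, else return citations."""
    if not hits or max(h.score for h in hits) < threshold:
        return {"answer": "I don't know.", "citations": []}
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    citations = [
        {"title": h.title, "uri": h.uri, "page": h.page, "chunk_id": h.chunk_id}
        for h in ranked
    ]
    # A real pipeline would pass the ranked chunks to a generator; this sketch
    # simply echoes the best chunk's text as the "answer".
    return {"answer": ranked[0].text, "citations": citations}
```

If the vector store reports distances rather than similarities (lower is better), the comparison in `answer_or_refuse` would need to be inverted.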
