
Question Coach

License: MPL 2.0

A Python-based chat application that guides undergraduate students through the Question Formulation Technique (QFT) — a six-stage process for developing and refining research questions.

Built on a retrieval-augmented generation (RAG) pipeline backed by Qdrant and Ollama, with Gemini for chat generation.

How it works

| Concern | Tool |
| --- | --- |
| Embeddings (ingestion + search) | Ollama · nomic-embed-text (local, auto-pulled) |
| Chat generation | Gemini API (gemini-2.5-flash-lite) |
| Vector store | Qdrant |
| Frontend | React + Vite, served via Nginx |
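The components above combine into a standard retrieve-then-generate loop: embed the query, rank indexed chunks by similarity, and hand the top matches to the chat model as context. A minimal sketch of that loop in pure Python, with toy 3-dimensional vectors standing in for real nomic-embed-text embeddings and a stubbed prompt-building step (function names here are illustrative, not the project's actual API):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "indexed" chunks: (embedding, text). In the real pipeline these come
# from Ollama embeddings stored in Qdrant.
index = [
    ([0.9, 0.1, 0.0], "QFT stage 1: produce as many questions as possible."),
    ([0.1, 0.9, 0.0], "QFT stage 4: prioritise your three best questions."),
    ([0.0, 0.1, 0.9], "Unrelated note about citation styles."),
]

def retrieve(query_vec, k=2):
    """Rank chunks by cosine similarity and return the top k texts."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def build_prompt(query_vec, question):
    """Retrieved context goes ahead of the question, ready for the chat model."""
    context = "\n".join(retrieve(query_vec))
    return f"{context}\n\nQUESTION: {question}"

print(build_prompt([0.8, 0.2, 0.0], "How do I start generating questions?"))
```

In the real pipeline the query vector comes from the same Ollama model used at ingestion time, which is why the embedding model must match the indexed collection.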

Prerequisites

  • Docker Desktop
  • A Gemini API key (free tier is sufficient)
  • Documents to index in inputs/docs/ (.md or .html)

Quick start

1. Clone and export your API key

git clone <repo-url>
cd question-coach
export GEMINI_API_KEY=your-key-here

Docker Compose reads the key directly from your shell — no .env file needed. Add the export to your shell profile (~/.zshrc, ~/.bashrc) to make it permanent.
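Because Compose reads the key from the environment, a missing export tends to fail at request time rather than at startup. A tiny preflight check you could run first (a sketch, not part of the repo):

```python
import os
import sys

def check_api_key(name="GEMINI_API_KEY"):
    """Return the key if set and non-empty, otherwise exit with a hint."""
    key = os.environ.get(name, "").strip()
    if not key:
        sys.exit(f"{name} is not set — run: export {name}=your-key-here")
    return key
```

Run this (or an equivalent shell test) before `docker compose up` to catch a missing key early.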

2. Start the app

docker compose up --build -d

This starts the API and frontend. The knowledge-base services (Qdrant, Ollama) live in a separate compose file and are started automatically by the ingestion scripts in the next step.

3. Start the knowledge base and index your documents

Place .md or .html files in inputs/docs/, then:

./bin/ingest reindex-all

This script starts the KB stack (docker-compose.kb.yml) — Qdrant, Ollama, and a one-shot ollama-pull container that downloads nomic-embed-text (~274 MB, cached in a named volume after the first run) — then runs ingestion once those services are healthy.

Watch KB startup progress:

docker compose -f docker-compose.kb.yml logs -f ollama ollama-pull

Without the KB stack running, the API falls back to direct Gemini chat with no retrieval. Run ./bin/ingest at least once to enable RAG.
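This fallback can be pictured as a try-retrieval-else-degrade step in the request path. A simplified sketch (the real API's internals will differ; `search_kb` and `chat` are stand-ins, and `search_kb` here always fails to show the degraded path):

```python
def search_kb(query):
    """Stand-in for a Qdrant similarity search; raises when the KB stack is down."""
    raise ConnectionError("Qdrant not reachable")

def chat(prompt):
    """Stand-in for a Gemini call; just echoes the prompt here."""
    return f"gemini({prompt!r})"

def respond(query):
    try:
        context = search_kb(query)          # RAG path: ground the answer in indexed docs
        return chat(f"{context}\n\n{query}")
    except ConnectionError:
        return chat(query)                  # Fallback: direct chat, no retrieval

print(respond("What is QFT stage 2?"))
```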

4. Open the app

| Service | URL |
| --- | --- |
| Frontend | http://localhost:3000 |
| API docs | http://localhost:8000/docs |
| Qdrant dashboard | http://localhost:6333/dashboard |

Ingestion commands

All commands run inside Docker via ./bin/ingest <command>.

# Index everything in inputs/docs/ and inputs/fetched/
./bin/ingest reindex-all

# Index only new or changed documents
./bin/ingest process-new

# Fetch a web article and ingest it
./bin/fetch-ingest 'https://example.com/article'

# Search the knowledge base
./bin/ingest search "your query"

# Remove all indexed documents
./bin/ingest clear-all

# List indexed documents
./bin/ingest list-documents

# Show collection stats
./bin/ingest stats

# Save a snapshot of the vector index
./bin/ingest snapshot                          # auto-named to inputs/snapshots/
./bin/ingest snapshot path/to/my-backup.snapshot

# Restore the vector index from a snapshot
./bin/ingest restore path/to/my-backup.snapshot

# Add or update a single document
./bin/ingest add-update inputs/docs/my-file.md

# Check collection status
./bin/ingest check-collection

Snapshots are saved inside the container at the path you specify (relative to /app). Because the inputs/ directory is mounted as a volume, saving to inputs/snapshots/ (the default) writes directly to your local filesystem.
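The path logic amounts to resolving your argument against the container's working directory; only paths that land under the mounted inputs/ directory survive on the host. A sketch of that resolution (pure path arithmetic, assuming /app is the container workdir and inputs/ is the mount):

```python
from pathlib import PurePosixPath

WORKDIR = PurePosixPath("/app")       # container working directory (assumed)
MOUNT = WORKDIR / "inputs"            # host-mounted volume

def container_path(arg: str) -> PurePosixPath:
    """Resolve a snapshot argument the way the container sees it."""
    p = PurePosixPath(arg)
    return p if p.is_absolute() else WORKDIR / p

def visible_on_host(arg: str) -> bool:
    """True only if the resolved path lands inside the mounted inputs/ volume."""
    path = container_path(arg)
    return MOUNT == path or MOUNT in path.parents

print(visible_on_host("inputs/snapshots/backup.snapshot"))  # inside the mount
print(visible_on_host("tmp/backup.snapshot"))               # container-only, lost on teardown
```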


Changing the embedding model

The active model is set in ai-config.yaml:

embedding:
  ollama:
    model: "nomic-embed-text"   # default — 768 dims, ~274 MB
    # mxbai-embed-large         — 1024 dims, ~670 MB, stronger MTEB scores
    # gemma3:2b                 — 2048 dims, ~1.7 GB

To add a model to the auto-pull list, update the entrypoint of the ollama-pull service in docker-compose.kb.yml.

Switching models changes vector dimensions. Clear the old collection first:

./bin/ingest clear-all
# update model in ai-config.yaml and collection_name in vector_db section
./bin/ingest reindex-all
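The reason clearing is required: a Qdrant collection is created with a fixed vector size, so points from a 768-dim model cannot coexist with 1024-dim ones. A sketch of the kind of guard an ingestion step might run (illustrative function name; the dimensions come from the ai-config.yaml comments above):

```python
# Dimensions per model, from the ai-config.yaml comments above.
MODEL_DIMS = {
    "nomic-embed-text": 768,
    "mxbai-embed-large": 1024,
    "gemma3:2b": 2048,
}

def check_dims(configured_model: str, collection_dims: int) -> None:
    """Fail fast if the configured model's vectors won't fit the collection."""
    want = MODEL_DIMS[configured_model]
    if want != collection_dims:
        raise ValueError(
            f"{configured_model} emits {want}-dim vectors but the collection "
            f"stores {collection_dims}-dim vectors — run clear-all and reindex."
        )

check_dims("nomic-embed-text", 768)  # dims match, no error
```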

Folder structure

docker-compose.yml       # App stack: API + frontend (+ Caddy for production)
docker-compose.kb.yml    # KB stack: Qdrant + Ollama + ingestion service
inputs/
├── docs/           # Source documents to index (.md, .html)
└── fetched/        # Articles saved by ./bin/fetch-ingest
agents/
├── CONFIG.json     # Agent tone and guardrail settings
└── prompts/
    ├── SYSTEM_PROMPT.md   # Global agent rules (loaded at server startup)
    ├── EXAMPLES.md        # Few-shot examples (appended to every request)
    ├── STATE_SCHEMA.json  # State schema reference for stage handoff
    └── stages/
        ├── manifest.yaml  # Maps stage IDs to prompt files
        └── STAGE_*.md     # Per-stage instructions (one file per stage/sub-stage)
src/             # Ingestion pipeline (embeddings, chunking, vector store)
api/             # FastAPI server
frontend-react/  # React + Vite frontend

Customising the agent

The system prompt is assembled from files in agents/prompts/. See agents/prompts/README.md for the full runtime assembly order and architecture notes.
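The assembly described there can be pictured as concatenation driven by the stage manifest: global rules, then the active stage's instructions, then the few-shot examples. A sketch with in-memory strings standing in for the files (the real loader and its exact ordering live in agents/prompts/README.md; this is illustrative only):

```python
# In-memory stand-ins for files under agents/prompts/.
SYSTEM_PROMPT = "You are a QFT coach. Follow the stage instructions exactly."
EXAMPLES = "User: Help me start. Coach: Let's open with your question focus."
MANIFEST = {"stage_1": "STAGE_1.md", "stage_2": "STAGE_2.md"}  # manifest.yaml
STAGE_FILES = {
    "STAGE_1.md": "Stage 1: elicit as many questions as possible.",
    "STAGE_2.md": "Stage 2: improve the questions.",
}

def assemble_prompt(stage_id: str) -> str:
    """Global rules, then the active stage's instructions, then few-shot examples."""
    stage_text = STAGE_FILES[MANIFEST[stage_id]]
    return "\n\n".join([SYSTEM_PROMPT, stage_text, EXAMPLES])

print(assemble_prompt("stage_1"))
```

Because the stage file is looked up through the manifest, adding a stage means adding a STAGE_*.md file and one manifest entry, without touching the loader.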

| To change … | Edit |
| --- | --- |
| Global agent rules and guardrails | agents/prompts/SYSTEM_PROMPT.md |
| Per-stage instructions | agents/prompts/stages/STAGE_*.md |
| Stage → prompt file mapping | agents/prompts/stages/manifest.yaml |
| Few-shot examples | agents/prompts/EXAMPLES.md |
| Tone and config defaults | agents/CONFIG.json |
| Knowledge base content | ingest documents into Qdrant |
