
Question Coach

License: MPL 2.0

A Python-based chat application that guides undergraduate students through the Question Formulation Technique (QFT) — a six-stage process for developing and refining research questions.

Built on a retrieval-augmented generation (RAG) pipeline backed by Qdrant and Ollama, with Gemini for chat generation.

How it works

| Concern | Tool |
| --- | --- |
| Embeddings (ingestion + search) | Ollama · nomic-embed-text (local, auto-pulled) |
| Chat generation | Gemini API (gemini-2.5-flash-lite) |
| Vector store | Qdrant |
| Frontend | React + Vite, served via Nginx |
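The components above combine into a standard retrieve-then-generate loop: embed the query, rank indexed chunks by similarity, and hand the top matches to the chat model as context. A minimal sketch of that loop in pure Python, with toy 3-dimensional vectors standing in for real nomic-embed-text embeddings and a stubbed prompt-building step (function names here are illustrative, not the project's actual API):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "indexed" chunks: (embedding, text). In the real pipeline these come
# from Ollama embeddings stored in Qdrant.
index = [
    ([0.9, 0.1, 0.0], "QFT stage 1: produce as many questions as possible."),
    ([0.1, 0.9, 0.0], "QFT stage 4: prioritise your three best questions."),
    ([0.0, 0.1, 0.9], "Unrelated note about citation styles."),
]

def retrieve(query_vec, k=2):
    """Rank chunks by cosine similarity and return the top k texts."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def build_prompt(query_vec, question):
    """Retrieved context goes ahead of the question, ready for the chat model."""
    context = "\n".join(retrieve(query_vec))
    return f"{context}\n\nQUESTION: {question}"

print(build_prompt([0.8, 0.2, 0.0], "How do I start generating questions?"))
```

In the real pipeline the query vector comes from the same Ollama model used at ingestion time, which is why the embedding model must match the indexed collection.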

Prerequisites

  • Docker Desktop
  • A Gemini API key (free tier is sufficient)
  • Documents to index in inputs/docs/ (.md or .html)

Quick start

1. Clone and export your API key

git clone <repo-url>
cd question-coach
export GEMINI_API_KEY=your-key-here

Docker Compose reads the key directly from your shell — no .env file needed. Add the export to your shell profile (~/.zshrc, ~/.bashrc) to make it permanent.
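Because Compose reads the key from the environment, a missing export tends to fail at request time rather than at startup. A tiny preflight check you could run first (a sketch, not part of the repo):

```python
import os
import sys

def check_api_key(name="GEMINI_API_KEY"):
    """Return the key if set and non-empty, otherwise exit with a hint."""
    key = os.environ.get(name, "").strip()
    if not key:
        sys.exit(f"{name} is not set — run: export {name}=your-key-here")
    return key
```

Run this (or an equivalent shell test) before `docker compose up` to catch a missing key early.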

2. Start the app

docker compose up --build -d

This starts the API and frontend. The knowledge-base services (Qdrant, Ollama) live in a separate compose file and are started automatically by the ingestion scripts in the next step.

3. Start the knowledge base and index your documents

Place .md or .html files in inputs/docs/, then:

./bin/ingest reindex-all

This script starts the KB stack (docker-compose.kb.yml) — Qdrant, Ollama, and a one-shot ollama-pull container that downloads nomic-embed-text (~274 MB, cached in a named volume after the first run) — then runs ingestion once those services are healthy.

Watch KB startup progress:

docker compose -f docker-compose.kb.yml logs -f ollama ollama-pull

Without the KB stack running, the API falls back to direct Gemini chat with no retrieval. Run ./bin/ingest at least once to enable RAG.
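This fallback can be pictured as a try-retrieval-else-degrade step in the request path. A simplified sketch (the real API's internals will differ; `search_kb` and `chat` are stand-ins, and `search_kb` here always fails to show the degraded path):

```python
def search_kb(query):
    """Stand-in for a Qdrant similarity search; raises when the KB stack is down."""
    raise ConnectionError("Qdrant not reachable")

def chat(prompt):
    """Stand-in for a Gemini call; just echoes the prompt here."""
    return f"gemini({prompt!r})"

def respond(query):
    try:
        context = search_kb(query)          # RAG path: ground the answer in indexed docs
        return chat(f"{context}\n\n{query}")
    except ConnectionError:
        return chat(query)                  # Fallback: direct chat, no retrieval

print(respond("What is QFT stage 2?"))
```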

4. Open the app

| Service | URL |
| --- | --- |
| Frontend | http://localhost:3000 |
| API docs | http://localhost:8000/docs |
| Qdrant dashboard | http://localhost:6333/dashboard |

Ingestion commands

All commands run inside Docker via ./bin/ingest <command>.

# Index everything in inputs/docs/ and inputs/fetched/
./bin/ingest reindex-all

# Index only new or changed documents
./bin/ingest process-new

# Fetch a web article and ingest it
./bin/fetch-ingest 'https://example.com/article'

# Search the knowledge base
./bin/ingest search "your query"

# Remove all indexed documents
./bin/ingest clear-all

# List indexed documents
./bin/ingest list-documents

# Show collection stats
./bin/ingest stats

# Save a snapshot of the vector index
./bin/ingest snapshot                          # auto-named to inputs/snapshots/
./bin/ingest snapshot path/to/my-backup.snapshot

# Restore the vector index from a snapshot
./bin/ingest restore path/to/my-backup.snapshot

# Add or update a single document
./bin/ingest add-update inputs/docs/my-file.md

# Check collection status
./bin/ingest check-collection

Snapshots are saved inside the container at the path you specify (relative to /app). Because the inputs/ directory is mounted as a volume, saving to inputs/snapshots/ (the default) writes directly to your local filesystem.
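The path logic amounts to resolving your argument against the container's working directory; only paths that land under the mounted inputs/ directory survive on the host. A sketch of that resolution (pure path arithmetic, assuming /app is the container workdir and inputs/ is the mount):

```python
from pathlib import PurePosixPath

WORKDIR = PurePosixPath("/app")       # container working directory (assumed)
MOUNT = WORKDIR / "inputs"            # host-mounted volume

def container_path(arg: str) -> PurePosixPath:
    """Resolve a snapshot argument the way the container sees it."""
    p = PurePosixPath(arg)
    return p if p.is_absolute() else WORKDIR / p

def visible_on_host(arg: str) -> bool:
    """True only if the resolved path lands inside the mounted inputs/ volume."""
    path = container_path(arg)
    return MOUNT == path or MOUNT in path.parents

print(visible_on_host("inputs/snapshots/backup.snapshot"))  # inside the mount
print(visible_on_host("tmp/backup.snapshot"))               # container-only, lost on teardown
```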


Changing the embedding model

The active model is set in ai-config.yaml:

embedding:
  ollama:
    model: "nomic-embed-text"   # default — 768 dims, ~274 MB
    # mxbai-embed-large         — 1024 dims, ~670 MB, stronger MTEB scores
    # gemma3:2b                 — 2048 dims, ~1.7 GB

To add a model to the auto-pull list, update the entrypoint of the ollama-pull service in docker-compose.kb.yml.

Switching models changes vector dimensions. Clear the old collection first:

./bin/ingest clear-all
# update model in ai-config.yaml and collection_name in vector_db section
./bin/ingest reindex-all
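The reason clearing is required: a Qdrant collection is created with a fixed vector size, so points from a 768-dim model cannot coexist with 1024-dim ones. A sketch of the kind of guard an ingestion step might run (illustrative function name; the dimensions come from the ai-config.yaml comments above):

```python
# Dimensions per model, from the ai-config.yaml comments above.
MODEL_DIMS = {
    "nomic-embed-text": 768,
    "mxbai-embed-large": 1024,
    "gemma3:2b": 2048,
}

def check_dims(configured_model: str, collection_dims: int) -> None:
    """Fail fast if the configured model's vectors won't fit the collection."""
    want = MODEL_DIMS[configured_model]
    if want != collection_dims:
        raise ValueError(
            f"{configured_model} emits {want}-dim vectors but the collection "
            f"stores {collection_dims}-dim vectors — run clear-all and reindex."
        )

check_dims("nomic-embed-text", 768)  # dims match, no error
```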

Folder structure

docker-compose.yml       # App stack: API + frontend (+ Caddy for production)
docker-compose.kb.yml    # KB stack: Qdrant + Ollama + ingestion service
inputs/
├── docs/           # Source documents to index (.md, .html)
└── fetched/        # Articles saved by ./bin/fetch-ingest
agents/
├── CONFIG.json     # Agent tone and guardrail settings
└── prompts/
    ├── SYSTEM_PROMPT.md   # Global agent rules (loaded at server startup)
    ├── EXAMPLES.md        # Few-shot examples (appended to every request)
    ├── STATE_SCHEMA.json  # State schema reference for stage handoff
    └── stages/
        ├── manifest.yaml  # Maps stage IDs to prompt files
        └── STAGE_*.md     # Per-stage instructions (one file per stage/sub-stage)
src/             # Ingestion pipeline (embeddings, chunking, vector store)
api/             # FastAPI server
frontend-react/  # React + Vite frontend

Customising the agent

The system prompt is assembled from files in agents/prompts/. See agents/prompts/README.md for the full runtime assembly order and architecture notes.
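The assembly described there can be pictured as concatenation driven by the stage manifest: global rules, then the active stage's instructions, then the few-shot examples. A sketch with in-memory strings standing in for the files (the real loader and its exact ordering live in agents/prompts/README.md; this is illustrative only):

```python
# In-memory stand-ins for files under agents/prompts/.
SYSTEM_PROMPT = "You are a QFT coach. Follow the stage instructions exactly."
EXAMPLES = "User: Help me start. Coach: Let's open with your question focus."
MANIFEST = {"stage_1": "STAGE_1.md", "stage_2": "STAGE_2.md"}  # manifest.yaml
STAGE_FILES = {
    "STAGE_1.md": "Stage 1: elicit as many questions as possible.",
    "STAGE_2.md": "Stage 2: improve the questions.",
}

def assemble_prompt(stage_id: str) -> str:
    """Global rules, then the active stage's instructions, then few-shot examples."""
    stage_text = STAGE_FILES[MANIFEST[stage_id]]
    return "\n\n".join([SYSTEM_PROMPT, stage_text, EXAMPLES])

print(assemble_prompt("stage_1"))
```

Because the stage file is looked up through the manifest, adding a stage means adding a STAGE_*.md file and one manifest entry, without touching the loader.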

| To change … | Edit |
| --- | --- |
| Global agent rules and guardrails | agents/prompts/SYSTEM_PROMPT.md |
| Per-stage instructions | agents/prompts/stages/STAGE_*.md |
| Stage → prompt file mapping | agents/prompts/stages/manifest.yaml |
| Few-shot examples | agents/prompts/EXAMPLES.md |
| Tone and config defaults | agents/CONFIG.json |
| Knowledge base content | ingest documents into Qdrant |
