
Customer Support FAQ Assistant - Cameron D #8

Open

CameronDetig wants to merge 22 commits into automationExamples:main from CameronDetig:feature/customer-faq-assistant-cameron-d

CameronDetig commented Feb 8, 2026

Customer Support FAQ Assistant

Overview

This PR implements a complete spec-driven local RAG (Retrieval-Augmented Generation) assistant for a fictional bank customer support system. The project provides both API and UI interfaces for answering customer questions using local FAQ documents, with no external API dependencies or secrets required.

Core Features

  • FastAPI Backend - RESTful API with /health, /ask, /db/status, and /db/build endpoints
  • Streamlit UI - Interactive chat interface with source citations and retrieval transparency
  • Local RAG Pipeline - ChromaDB vector database with semantic search using all-MiniLM-L6-v2 embeddings
  • Dual Generation Options:
    • mock - Deterministic, test-friendly mode (default)
    • flan-t5 - Optional local LLM mode using Google's Flan-T5-small (80M params, instruction-tuned)
  • Cross-Platform CLI - Single run.py entrypoint for setup, running services, and testing
  • Comprehensive Test Suite - 14 test files with 100% spec coverage, multi-Python matrix testing (3.10, 3.11, 3.12)
  • Optional Docker Support - Containerized deployment with docker-compose orchestration
  • CI/CD Pipeline - GitHub Actions workflow with automated testing

Workflow

I started by talking with ChatGPT to flesh out the project requirements and learn how spec-driven development works. I also brainstormed ideas and asked how the project could be implemented and what the stack should look like. I then had ChatGPT generate a document summarizing the discussion and the plan for the project, which could be fed directly to Codex 5.3 to start work on the specifications.

I really enjoyed using SDD, as it let me build a solid framework for the project before moving into implementation. In my experience with codegen tools, it is more work to fix or add features after the model has made a first pass at the code, so this helped get everything structured from the beginning. I primarily used Codex 5.3, but occasionally had Cursor with Opus 4.6 review the code for possible improvements. I typically do this in my projects to get another set of eyes on the code, catching things the LLM may have missed or getting a different perspective.

Once the specs and tests were complete, I moved into implementation. Codex made a great first pass at the code, and from there it was a matter of improving and iterating. As new features were added, I had Codex update the specs to keep everything consistent. After reading more about spec-driven development online, I also decided to add a constitution.md file. It documents the intent and architecture of the project to keep the codegen tool on track and to make it easier to onboard other codegen tools.

I tried to make everything self-contained, with no API keys required. This was a challenge: most LLMs either require an API key or involve multi-gigabyte downloads that wouldn't be suitable for this assessment. I initially tried the ~300MB distilgpt2 model, which was small enough to run locally, but found that it is not instruction-tuned and often generated nonsensical output. Claude Sonnet 4.6 suggested switching to the instruction-tuned Flan-T5 model, which greatly improved the assistant's responses.
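The switch to an instruction-tuned model can be sketched roughly as follows. The prompt format here is an assumption, not the project's actual template; `google/flan-t5-small` is the public Hugging Face id for the model size mentioned above.

```python
def build_prompt(question: str, contexts: list[str]) -> str:
    # Assemble retrieved FAQ chunks into an instruction-style prompt,
    # which instruction-tuned models like Flan-T5 handle far better
    # than a plain causal-LM continuation (the distilgpt2 failure mode).
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer the customer's question using only the FAQ context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}\n"
        "Answer:"
    )

def generate_answer(question: str, contexts: list[str]) -> str:
    # Lazy import so mock mode keeps working without transformers installed.
    from transformers import pipeline
    generator = pipeline("text2text-generation", model="google/flan-t5-small")
    result = generator(build_prompt(question, contexts), max_new_tokens=128)
    return result[0]["generated_text"]
```

Keeping the model download behind an optional setup flag (as `run.py setup --with-llm` does) means the default install stays small and the mock path remains the test target.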

Overall this was an interesting project, and I learned a lot about the benefits of spec-driven development that I will carry into my other work.

Quick Start

Setup

python run.py setup              # Basic setup (mock mode only)
python run.py setup --with-llm   # Include Flan-T5 model (~308MB)

Run Services

python run.py api                # API only (port 8000)
python run.py ui                 # UI only (port 8501)
python run.py fullstack          # Both together
python run.py help

Testing

python run.py test               # Run pytest
python run.py test-matrix        # Multi-Python tox matrix
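A tox configuration driving the py310/py311/py312 matrix named above might look like this minimal sketch (the actual tox.ini in the repo may differ):

```ini
[tox]
envlist = py310, py311, py312

[testenv]
deps = pytest
commands = pytest
```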

Docker (Optional)

python run.py docker-build       # Build images
python run.py docker-fullstack   # Run API + UI
python run.py docker-down        # Stop containers

Test Coverage

All 12 feature specs have corresponding test coverage:

| Spec | Test Files | Status |
| --- | --- | --- |
| health-endpoint.md | test_health.py | ✅ Pass |
| ask-endpoint-validation.md | test_validation.py | ✅ Pass |
| ask-response-contract.md | test_contract.py | ✅ Pass |
| retrieval-pipeline.md | test_retrieval.py, test_db.py | ✅ Pass |
| generation-mock.md | test_determinism.py, test_contract.py | ✅ Pass |
| generation-optional-llm.md | test_generator_config.py | ✅ Pass |
| faq-data.md | test_data_loader.py | ✅ Pass |
| streamlit-ui.md | test_streamlit_smoke.py, test_streamlit_ui_logic.py | ✅ Pass |
| entrypoint-cli.md | test_cli.py | ✅ Pass |
| docker-optional.md | Manual acceptance | ✅ Pass |
| spec-traceability.md | N/A (meta-spec) | ✅ Complete |
| feature-template.md | N/A (template) | ✅ Complete |

Code Quality

Adherence to Constitution

  • ✅ Local-first (no external APIs)
  • ✅ Python 3.10-3.12 support
  • ✅ Network-independent tests
  • ✅ Optional LLM (non-blocking)
  • ✅ Simple, readable code paths
  • ✅ Clear module boundaries
  • ✅ Comprehensive error handling

Testing Standards

  • ✅ 52 tests covering all specs
  • ✅ Pytest with clear assertions
  • ✅ Tox matrix (py310, py311, py312)
  • ✅ CI pipeline with automated runs
  • ✅ Deterministic test suite

Deployment

Local Development

python run.py setup --with-llm
python run.py fullstack
# Visit http://localhost:8501

Docker Production

python run.py docker-build
python run.py docker-fullstack
# API: http://localhost:8000
# UI: http://localhost:8501

Dependencies

Core:

  • fastapi + uvicorn - API framework
  • streamlit - UI framework
  • chromadb - Vector database
  • langchain + langchain-huggingface + langchain-chroma - RAG orchestration
  • sentence-transformers - Embedding models
  • transformers - LLM inference
  • pytest + tox - Testing
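At its core, the retrieval step embeds the question and ranks FAQ chunks by cosine similarity. In the real pipeline the vectors come from the all-MiniLM-L6-v2 sentence-transformer and ChromaDB does the ranking; the idea can be sketched dependency-free with made-up 3-d embeddings (the chunks and vectors below are illustrative only):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product normalized by vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for sentence-transformer output.
faq_chunks = {
    "You can reset your PIN at any branch.": [0.9, 0.1, 0.0],
    "Wire transfers take 1-2 business days.": [0.1, 0.9, 0.2],
}
question_embedding = [0.8, 0.2, 0.1]  # pretend embedding of "How do I reset my PIN?"

# Rank chunks by similarity to the question; the top hit becomes context
# for generation, along with a citation back to the source document.
best = max(faq_chunks, key=lambda chunk: cosine(faq_chunks[chunk], question_embedding))
print(best)
```

Swapping this toy ranking for a real vector store changes the scale, not the shape, of the pipeline: embed, rank, pass the top chunks to the generator.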

Possible Future Enhancements

  • Use larger LLMs (e.g., Llama, ChatGPT)
  • Persistent chat history
  • User authentication
  • Production deployment guides (AWS, GCP, Azure)
  • Streaming responses
  • Multi-language support
  • Advanced retrieval (hybrid search, reranking)

CameronDetig and others added 22 commits February 7, 2026 16:38
…t, FAQ corpus data, deterministic mock generation, optional LLM generation, health endpoint, retrieval pipeline, and spec traceability matrix
… specification files for more logical bullet points
…tomated installation options

feat: Update Streamlit UI to improve UX and add chat functionality

test: Add integration tests for entrypoint CLI setup help and Streamlit UI DB status handling
feat: Integrate Langchain support for generation and retrieval
…d tox matrix-testing functionality for testing python 3.10, 3.11, and 3.12
…d implement CI workflow with github actions

docs: Updating project constitution on guidelines for development
…command

docs: enhance README and entrypoint CLI documentation
…rove distilgpt2 output quality checks in generation
…ME for clarity on features and setup instructions