
Customer Support FAQ Assistant - Cameron D #8

Open

CameronDetig wants to merge 22 commits into automationExamples:main from CameronDetig:feature/customer-faq-assistant-cameron-d

CameronDetig commented Feb 8, 2026

Customer Support FAQ Assistant

Overview

This PR implements a complete spec-driven local RAG (Retrieval-Augmented Generation) assistant for a fictional bank customer support system. The project provides both API and UI interfaces for answering customer questions using local FAQ documents, with no external API dependencies or secrets required.

Core Features

  • FastAPI Backend - RESTful API with /health, /ask, /db/status, and /db/build endpoints
  • Streamlit UI - Interactive chat interface with source citations and retrieval transparency
  • Local RAG Pipeline - ChromaDB vector database with semantic search using all-MiniLM-L6-v2 embeddings
  • Dual Generation Options:
    • mock - Deterministic, test-friendly mode (default)
    • flan-t5 - Optional local LLM mode using Google's Flan-T5-small (80M params, instruction-tuned)
  • Cross-Platform CLI - Single run.py entrypoint for setup, running services, and testing
  • Comprehensive Test Suite - 14 test files with 100% spec coverage, multi-Python matrix testing (3.10, 3.11, 3.12)
  • Optional Docker Support - Containerized deployment with docker-compose orchestration
  • CI/CD Pipeline - GitHub Actions workflow with automated testing

Workflow

I started by talking with ChatGPT to flesh out the project requirements and learn how spec-driven development works. I also brainstormed ideas and asked how the project could be implemented and what the stack should look like. I then had ChatGPT generate a document summarizing the discussion and the plan for the project, which could be fed directly to Codex 5.3 to start work on the specifications.

I really enjoyed using SDD, as it let me build a solid framework for the project before moving into implementation. In my experience with codegen tools, it is more work to fix or add features after the model has made a first pass at the code, so this helped get everything structured from the beginning. I primarily used Codex 5.3, but occasionally had Cursor with Opus 4.6 review the code for possible improvements. I typically do this in my projects to get another set of eyes on the code, catching things the LLM may have missed or getting a different perspective.

Once the specs and tests were complete, I moved into implementation. Codex made a great first pass at the code, and from there it was a matter of improving and iterating. As new features were added, I had Codex update the specs to keep everything consistent. After reading more about spec-driven development online, I also decided to add a constitution.md file. It documents the intent and architecture of the project to keep the codegen tool on track and to make it easier to onboard other codegen tools.

I tried to make everything self-contained, with no API keys required. This was a challenge: most LLMs either require an API key or involve multi-gigabyte downloads that wouldn't be suitable for this assessment. I initially tried the ~300MB distilgpt2 model, which was small enough to run locally, but found that it is not instruction-tuned and often generated nonsensical output. Claude Sonnet 4.6 suggested switching to the instruction-tuned Flan-T5 model, which greatly improved the assistant's responses.
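The switch to an instruction-tuned model can be sketched roughly as follows. The prompt format here is an assumption, not the project's actual template; `google/flan-t5-small` is the public Hugging Face id for the model size mentioned above.

```python
def build_prompt(question: str, contexts: list[str]) -> str:
    # Assemble retrieved FAQ chunks into an instruction-style prompt,
    # which instruction-tuned models like Flan-T5 handle far better
    # than a plain causal-LM continuation (the distilgpt2 failure mode).
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer the customer's question using only the FAQ context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}\n"
        "Answer:"
    )

def generate_answer(question: str, contexts: list[str]) -> str:
    # Lazy import so mock mode keeps working without transformers installed.
    from transformers import pipeline
    generator = pipeline("text2text-generation", model="google/flan-t5-small")
    result = generator(build_prompt(question, contexts), max_new_tokens=128)
    return result[0]["generated_text"]
```

Keeping the model download behind an optional setup flag (as `run.py setup --with-llm` does) means the default install stays small and the mock path remains the test target.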

Overall this was an interesting project, and I learned a lot about the benefits of spec-driven development that I will carry into my other work.

Quick Start

Setup

python run.py setup              # Basic setup (mock mode only)
python run.py setup --with-llm   # Include Flan-T5 model (~308MB)

Run Services

python run.py api                # API only (port 8000)
python run.py ui                 # UI only (port 8501)
python run.py fullstack          # Both together
python run.py help

Testing

python run.py test               # Run pytest
python run.py test-matrix        # Multi-Python tox matrix
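A tox configuration driving the py310/py311/py312 matrix named above might look like this minimal sketch (the actual tox.ini in the repo may differ):

```ini
[tox]
envlist = py310, py311, py312

[testenv]
deps = pytest
commands = pytest
```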

Docker (Optional)

python run.py docker-build       # Build images
python run.py docker-fullstack   # Run API + UI
python run.py docker-down        # Stop containers

Test Coverage

All 12 feature specs have corresponding test coverage:

| Spec | Test Files | Status |
| --- | --- | --- |
| health-endpoint.md | test_health.py | ✅ Pass |
| ask-endpoint-validation.md | test_validation.py | ✅ Pass |
| ask-response-contract.md | test_contract.py | ✅ Pass |
| retrieval-pipeline.md | test_retrieval.py, test_db.py | ✅ Pass |
| generation-mock.md | test_determinism.py, test_contract.py | ✅ Pass |
| generation-optional-llm.md | test_generator_config.py | ✅ Pass |
| faq-data.md | test_data_loader.py | ✅ Pass |
| streamlit-ui.md | test_streamlit_smoke.py, test_streamlit_ui_logic.py | ✅ Pass |
| entrypoint-cli.md | test_cli.py | ✅ Pass |
| docker-optional.md | Manual acceptance | ✅ Pass |
| spec-traceability.md | N/A (meta-spec) | ✅ Complete |
| feature-template.md | N/A (template) | ✅ Complete |

Code Quality

Adherence to Constitution

  • ✅ Local-first (no external APIs)
  • ✅ Python 3.10-3.12 support
  • ✅ Network-independent tests
  • ✅ Optional LLM (non-blocking)
  • ✅ Simple, readable code paths
  • ✅ Clear module boundaries
  • ✅ Comprehensive error handling

Testing Standards

  • ✅ 52 tests covering all specs
  • ✅ Pytest with clear assertions
  • ✅ Tox matrix (py310, py311, py312)
  • ✅ CI pipeline with automated runs
  • ✅ Deterministic test suite

Deployment

Local Development

python run.py setup --with-llm
python run.py fullstack
# Visit http://localhost:8501

Docker Production

python run.py docker-build
python run.py docker-fullstack
# API: http://localhost:8000
# UI: http://localhost:8501

Dependencies

Core:

  • fastapi + uvicorn - API framework
  • streamlit - UI framework
  • chromadb - Vector database
  • langchain + langchain-huggingface + langchain-chroma - RAG orchestration
  • sentence-transformers - Embedding models
  • transformers - LLM inference
  • pytest + tox - Testing
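At its core, the retrieval step embeds the question and ranks FAQ chunks by cosine similarity. In the real pipeline the vectors come from the all-MiniLM-L6-v2 sentence-transformer and ChromaDB does the ranking; the idea can be sketched dependency-free with made-up 3-d embeddings (the chunks and vectors below are illustrative only):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product normalized by vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for sentence-transformer output.
faq_chunks = {
    "You can reset your PIN at any branch.": [0.9, 0.1, 0.0],
    "Wire transfers take 1-2 business days.": [0.1, 0.9, 0.2],
}
question_embedding = [0.8, 0.2, 0.1]  # pretend embedding of "How do I reset my PIN?"

# Rank chunks by similarity to the question; the top hit becomes context
# for generation, along with a citation back to the source document.
best = max(faq_chunks, key=lambda chunk: cosine(faq_chunks[chunk], question_embedding))
print(best)
```

Swapping this toy ranking for a real vector store changes the scale, not the shape, of the pipeline: embed, rank, pass the top chunks to the generator.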

Possible Future Enhancements

  • Use larger LLMs (e.g., Llama, ChatGPT)
  • Persistent chat history
  • User authentication
  • Production deployment guides (AWS, GCP, Azure)
  • Streaming responses
  • Multi-language support
  • Advanced retrieval (hybrid search, reranking)

CameronDetig and others added 22 commits February 7, 2026 16:38
…t, FAQ corpus data, deterministic mock generation, optional LLM generation, health endpoint, retrieval pipeline, and spec traceability matrix
… specification files for more logical bullet points
…tomated installation options

feat: Update Streamlit UI to improve UX and add chat functionality

test: Add integration tests for entrypoint CLI setup help and Streamlit UI DB status handling
feat: Integrate Langchain support for generation and retrieval
…d tox matrix-testing functionality for testing python 3.10, 3.11, and 3.12
…d implement CI workflow with github actions

docs: Updating project constitution on guidelines for development
…command

docs: enhance README and entrypoint CLI documentation
…rove distilgpt2 output quality checks in generation
…ME for clarity on features and setup instructions