AI-powered resume and cover letter generation with evidence-grounded, anti-hallucination guardrails. Every generated claim is checked against the candidate's own input — no invented metrics, tools, or achievements.
Live demo: resumeforge-bg29.onrender.com — free Render tier, ~30 s cold start.
Each choice answered a constraint. What I rejected mattered as much as what I picked.
- FastAPI — needed async streaming (SSE for token-by-token cover-letter generation) and OpenAPI docs for free. Not Flask: its async story is bolted on, and SSE through Flask + `gunicorn` is fragile under load.
- Streamlit — solo dev, the UI is not the product; a working interactive form in one evening. Not React/Next: a 6-week React build for a portfolio piece is a wrong-shaped time investment when the differentiator is the generation pipeline.
- OpenAI GPT-4o — best instruction-following + JSON mode for structured generation; $0.02 per resume keeps the daily ceiling realistic. Not a local 7B model: degrades evidence-grounded prompting hard enough to defeat the anti-hallucination guarantees this project is built on.
- FAISS + SentenceTransformers — 4 MB index, embeddings cost zero per query, swappable behind `VectorStoreProtocol`. Not Pinecone/Weaviate: at this scale a managed vector DB is monthly overhead with no quality lift; the protocol means I can switch if I'm wrong.
- Redis — the cost ledger needs atomic counters and TTL; both are first-class. Not Memcached: no atomic increment for the daily cost ceiling. The cache degrades to a no-op if Redis is down — caching is never allowed to be the failure mode that takes the service down.
- MongoDB — profile schema evolved 4× during build; document store made migration a no-op. Not Postgres: would have spent half my dev time on Alembic migrations for a personal-data store with no relational queries.
- arq — async-native, Redis-backed; reuses cache infra. Not Celery: Celery + asyncio fights the runtime; arq's worker model is built for FastAPI's exact concurrency model.
- Helm — a `values.yaml` recruiters can actually run; HPA/PDB/Ingress need versioned templating. Not raw `kubectl apply`: manifests are write-only.
- OpenTelemetry + Prometheus — vendor-neutral by design; `OTEL_EXPORTER_OTLP_ENDPOINT` swaps backends. Not Datadog APM: vendor lock-in is the wrong trade for a $0/mo portfolio project.
- Python 3.12 — typing improvements plus ~30% async-throughput gains over 3.10 on the parallel section-expander path. The pin is paired with the regression eval — any version bump runs the full eval suite before merging.
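The protocol-based swap that the FAISS bullet leans on can be sketched in a few lines. This is a minimal illustration, not the project's actual API: only the `VectorStoreProtocol` name comes from this README — the method names (`add`, `search`) and the toy token-overlap backend are assumptions standing in for FAISS + SentenceTransformers.

```python
from typing import List, Protocol, Tuple

class VectorStoreProtocol(Protocol):
    """Sketch of a swappable vector-store interface (illustrative names)."""
    def add(self, texts: List[str]) -> None: ...
    def search(self, query: str, k: int = 3) -> List[Tuple[str, float]]: ...

class TokenOverlapStore:
    """Toy in-memory backend standing in for FAISS: scores by naive
    token overlap so the example needs no ML dependencies."""
    def __init__(self) -> None:
        self._texts: List[str] = []

    def add(self, texts: List[str]) -> None:
        self._texts.extend(texts)

    def search(self, query: str, k: int = 3) -> List[Tuple[str, float]]:
        q = set(query.lower().split())
        scored = [(t, len(q & set(t.lower().split())) / (len(q) or 1))
                  for t in self._texts]
        return sorted(scored, key=lambda s: -s[1])[:k]

def retrieve_evidence(store: VectorStoreProtocol, claim: str) -> List[str]:
    """Callers depend only on the protocol, so which backend is wired in
    (FAISS, NumPy, Qdrant, ...) is a config choice, not a code change."""
    return [text for text, score in store.search(claim) if score > 0]
```

Because `retrieve_evidence` is typed against the protocol rather than a concrete class, swapping the backend never touches call sites — which is the escape hatch the "I can switch if I'm wrong" bullet describes.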
What I would change with hindsight: the test/integration stack is heavier than this product needs. v2 drops Mongo for SQLite-with-JSON-columns and skips the worker queue until a real user hits a 30-second generation — which never happened in load testing because the parallel expanders finish in 8s. Two services, not five.
```bash
git clone https://github.com/MarwaBS/ResumeForge.git
cd ResumeForge
cp .env.example .env        # set OPENAI_API_KEY
docker compose up --build   # API :8000 · Streamlit :8501
open http://localhost:8501
```

The live demo is the canonical screenshot. The Streamlit UI lets you paste a profile + a job description and produces a JD-aligned PDF resume + cover letter in roughly 10–15 s on a warm container.
System diagram:
```
Streamlit UI ──┐
               ├──▶ services/ ──▶ domain expanders ──▶ LLM (GPT-4o)
FastAPI REST ──┘        │                                   │
                        │◀───────── FAISS evidence engine ──┘
                        │◀───────── EntityGuard validation
                        ▼
          Redis cache + arq job queue
                        ▼
          PDF render (Jinja2 + pdfkit)
```
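The SSE path in the diagram — token-by-token cover-letter streaming — reduces to the SSE wire format plus a generator. A stdlib-only sketch (the function names are mine, not the codebase's); in FastAPI such a generator would typically be wrapped in `StreamingResponse(..., media_type="text/event-stream")`.

```python
from typing import Iterable, Iterator

def sse_event(data: str) -> str:
    """Encode one chunk as a Server-Sent Events message:
    a `data:` line terminated by a blank line."""
    return f"data: {data}\n\n"

def stream_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each LLM token as an SSE message, then a sentinel
    so the client knows the cover letter is complete."""
    for tok in tokens:
        yield sse_event(tok)
    yield sse_event("[DONE]")
```

Each yielded string is flushed to the browser as it arrives, which is what makes the cover letter appear token by token instead of after a 10-second wait.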
- 1,800 tests at 81.62 % coverage (75 % gate), plus a golden-dataset eval gate and a drift-regression gate on every PR.
- 7 CI jobs (lint, security, test, integration, evals, helm-lint, docker-build) and a nightly locust perf gate.
- Multi-stage validation pipeline (FAISS evidence index → evidence-grounded prompting → EntityGuard audit → metric anchoring) so the LLM cannot invent skills, tools, or numbers.
- Observability: OTel traces, 12 Prometheus counters/histograms (incl. per-model USD cost), structured JSON logs, daily cost-ceiling circuit breaker.
- Vendor-neutral abstractions: `LLMProtocol` (OpenAI / Mock / Anthropic stub) and `VectorStoreProtocol` (FAISS / NumPy / Qdrant), so swapping a backend is a config change.
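The idea behind the EntityGuard audit — rejecting numbers the candidate never supplied — can be illustrated in a few lines. This is a toy sketch, not the project's implementation: the regex, function names, and flag-by-set-difference approach are my assumptions about the general technique.

```python
import re
from typing import List, Set

# Matches numeric claims such as 40%, 1,800, $5,000 (illustrative pattern).
METRIC_RE = re.compile(r"\$?\d[\d,.]*%?")

def extract_metrics(text: str) -> Set[str]:
    """Collect numeric claims (percentages, counts, dollar figures)."""
    return set(METRIC_RE.findall(text))

def audit_claims(generated: str, profile: str) -> List[str]:
    """Return metrics in the generated text that never appear in the
    candidate's own input — anything returned would be flagged/blocked."""
    return sorted(extract_metrics(generated) - extract_metrics(profile))
```

For example, `audit_claims("Cut latency by 40% across 12 services", "Cut latency by 40%")` flags `"12"`: the 40% is grounded in the candidate's input, the 12 services are not.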
- `docs/ARCHITECTURE.md` — full engineering reference: architecture, ML-Ops, ablation study, deployment recipes, troubleshooting.
- `docs/RUNBOOK.md` — on-call playbook (latency spikes, cost-ceiling 429s, Redis outages, drift-gate failures).
- `docs/DATA_HANDLING.md` — PII handling and data-retention policy.
- `docs/slos.md` — service-level objectives and error budgets.
- `docs/decisions/` — 6 architecture decision records.
- `CHANGELOG.md` — release history.
MIT — see LICENSE.