A production-grade, multimodal AI application that learns a family's unique dynamics and coaches parents through the hard moments — built on Claude, Claude Vision, and the Web Speech API.
Hi — I'm Farid, MS Business Analytics & AI @ UT Dallas. This is the project I'm most proud of in my portfolio because it demonstrates end-to-end product thinking: identifying a real user problem, designing a compounding moat, and shipping a working multimodal AI product across 4 build phases.
What to look at if you have 60 seconds:
- The pitch deck — 10 slides, full brand identity, business model
- The architecture diagram below
- `app.py` — a single-file Flask backend with 17 API routes, 12 Claude prompt templates, and a defensive JSON-parsing layer for structured LLM outputs
- The Right-Now Mode feature — the killer feature I'd lead with in a product interview
What this project demonstrates:
- ✅ Shipping muscle — 4 phases built in sequence, each verifiably running end-to-end
- ✅ Multimodal AI — text + voice (Web Speech API) + vision (Claude Vision on base64 frames)
- ✅ Product thinking — a real moat (DNA doc + feedback loop + in-moment scripts), not a ChatGPT wrapper
- ✅ LLM engineering — system prompts as architectural primitives; a self-critique loop that rejects generic replies; structured JSON outputs with defensive parsing
- ✅ Full-stack — Flask + SQLite + vanilla JS frontend with zero build step, runs on any laptop
Parenting advice is generic. Every family is specific.
73% of parents feel overwhelmed weekly. $0 is spent on truly personalized coaching. And in the 15 seconds when the kids are screaming, no book or blog post can help.
An AI coach that learns YOUR family through deep conversational intake, remembers your kids by name, sees what's happening through your camera, speaks to you in a warm voice, and gets smarter every week through a feedback loop.
| Feature | What it does | Why it matters |
|---|---|---|
| 💬 Four-voice coaching | Auto-switches between Coach, Friend, Therapist, Witness | Matches parent's emotional state, not just their query |
| 🚨 Right-Now Mode | 30-second de-escalation script when kids are screaming | The moat — nothing else helps in the 90 seconds when everything is on fire |
| 🕊️ Repair Scripts | Kid-specific, age-specific rupture recovery | Secure attachment is built in repair, not in avoiding conflict |
| 📹 Camera + Vision | Record a moment → Claude Vision observes → recommendations | Parents can't always explain what's happening; the camera can |
| 🎙️ Voice mode | Kindred reads replies aloud; parent can speak to it | Hands-free use during actual parenting moments |
| 📈 Trend tracking | Per-child trajectory (improving / mixed / worsening / steady) + narrative | Parents see what's actually working over weeks |
| ✨ Pattern insights | Auto-surfaces non-obvious patterns across recent activity | Things you'd never catch on your own |
| 📊 Weekly digest | Sunday reflection: patterns, wins, watch-fors | Turns chaotic week into learning |
| 🎒 Professional summary | Observational (non-diagnostic) one-pager for teachers/pediatricians | Bridges home context to professionals who can help |
| 👍 Feedback loop | Thumbs up/down trains the coach over time | Compounding moat — more use = more personalized |
| ↻ Try-again | Miss a reply, say why, regenerate with context | Keeps the parent in control of the conversation |
This is the feature I'd lead with in any product interview. It's what makes Kindred different from every parenting blog and every generic chatbot:
- Parent types two sentences about what's happening RIGHT NOW
- Response comes back in under 3 seconds (self-critique is skipped for speed)
- Output is a literal script: exact words to say, what NOT to say, where to put your body
- The script uses the kids' real names and specific temperaments from the family's DNA doc
Technically, this demonstrates prompt engineering as product design: a different prompt template, a different output schema, a different latency profile — same model, different UX. One model, many products.
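A sketch of what this looks like in code, assuming Flask and the Anthropic Python SDK. The template text, model ID, and `load_dna_doc` helper are illustrative stand-ins for the real `app.py` pieces, not the actual implementation:

```python
import anthropic
from flask import Flask, request, jsonify

app = Flask(__name__)
client = anthropic.Anthropic()   # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-sonnet-4-5"      # placeholder; use whichever Sonnet model the app targets

RIGHT_NOW_TEMPLATE = (  # illustrative, not the real template
    "EMERGENCY. Give a 30-second de-escalation script: exact words to say, "
    "what NOT to say, where to put your body. Use the kids' real names.\n"
    "FAMILY DNA DOC:\n{dna}\n\nHAPPENING RIGHT NOW: {situation}"
)

def load_dna_doc() -> str:
    """Stand-in for the SQLite lookup that loads the family's DNA doc."""
    return "Parent: ... Kids: ... Temperaments: ... Triggers: ..."  # placeholder

@app.post("/api/right_now")
def right_now():
    situation = request.json["message"]
    prompt = RIGHT_NOW_TEMPLATE.format(dna=load_dna_doc(), situation=situation)
    script = client.messages.create(
        model=MODEL,
        max_tokens=300,  # a short, literal script keeps the reply under ~3 seconds
        messages=[{"role": "user", "content": prompt}],
    ).content[0].text
    # Unlike /api/chat, there is no self-critique pass here: in a crisis, speed wins
    return jsonify({"script": script})
```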
Three compounding moats:
- The DNA doc — an invisible family profile loaded into every reply. Gets richer every conversation.
- The feedback loop — every 👍 👎 trains the coach. By week 4 it knows your kids better than your pediatrician.
- Multimodal — text, voice, and camera together. Every input method tightens the advice.
None of these can be replicated by a blog, a tracker app, or a generic ChatGPT wrapper.
┌─────────────────────────────────────────────────────────┐
│ INTAKE AGENT (12 turns) │
│ Parent · kids · culture · values · triggers · goals │
└───────────────────────────┬─────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ DNA DOC (invisible, persistent profile) │
└───────────────────────────┬─────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ COACH (4 voices) + SELF-CRITIQUE + FEEDBACK LOOP │
│ DNA + learned_prefs + context → reply → QA │
└───────────────────────────┬─────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ SCENARIO-SPECIFIC MODULES │
│ 🚨 Right-Now 🕊️ Repair 📹 Media analysis │
│ 📊 Digest ✨ Insights 📈 Trends │
│ 🎒 Pro summary │
└───────────────────────────┬─────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ CLIENT-SIDE VOICE LAYER │
│ Web Speech API: SpeechRecognition + SpeechSynthesis │
│ (no audio ever leaves the browser) │
└─────────────────────────────────────────────────────────┘
| Layer | Tech | Why |
|---|---|---|
| LLM | Claude Sonnet 4.6 (via Anthropic SDK) | Long context for DNA doc + history; strong structured JSON outputs |
| Vision | Claude Vision | Multimodal image analysis in the same unified API — no separate CV pipeline |
| Backend | Flask + SQLite | Zero-ceremony local deployment; single file, single DB |
| Frontend | Vanilla JS, no framework | Instant load, no build step, works on any browser |
| Voice | Web Speech API (browser-native) | No OpenAI/ElevenLabs dependency; no audio ever leaves the device |
| Persistence | SQLite with schema migrations | Safe column additions via ALTER TABLE when features ship |
| Route | Purpose |
|---|---|
| `/api/intake` | Conversational family intake (12 turns) |
| `/api/finalize_intake` | Generate the invisible DNA doc |
| `/api/chat` | Main coaching reply with self-critique |
| `/api/feedback` | Thumbs up/down with reason |
| `/api/try_again` | Regenerate a missed reply with miss-context |
| `/api/repair` | Rupture recovery scripts (structured JSON) |
| `/api/right_now` | Rapid crisis response (no critique, optimized for speed) |
| `/api/weekly_digest` | 7-day reflective summary |
| `/api/insights` | Auto-detected patterns (14-day window) |
| `/api/professional_summary` | Observational one-pager for teachers/doctors |
| `/api/analyze_media` | Claude Vision on video frames + recommendations |
| `/api/trends` | Longitudinal per-child trajectory (8 weeks) |
| `/api/voice_config` | Preferred voice profile for client-side TTS |
| `/api/transcribe` | (Optional) Whisper STT fallback |
| `/api/speak` | (Optional) ElevenLabs TTS fallback |
| `/api/status` | Onboarding / config check |
| `/api/reset` | Wipe family profile |
**Windows**

```bat
:: 1. Install Python 3.11+ (check "Add Python to PATH" during install)
:: 2. Get an Anthropic API key at https://console.anthropic.com/
:: 3. Double-click start.bat — it handles venv, deps, and .env setup
:: 4. Open http://localhost:8000
```

**macOS / Linux**

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# edit .env and paste your ANTHROPIC_API_KEY
python app.py
# open http://localhost:8000
```

Every coaching reply runs through a second Claude call with a single question: "Could this reply work for any family, or is it unmistakably about THIS family?" If it returns FAIL, the reply is rewritten with the DNA doc injected more forcefully. This is surfaced to users as a ✨ "rewritten for specificity" badge.
Why it matters: it's a reusable design pattern — the LLM as its own test harness. Not chain-of-thought, not RAG, not an agent loop; just a focused QA pass.
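A minimal sketch of the loop, assuming the Anthropic Python SDK. The prompt wording, function name, and model ID are illustrative, not the exact `app.py` internals:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-sonnet-4-5"     # placeholder; use whichever Sonnet model the app targets

CRITIQUE_PROMPT = (
    "You are a QA reviewer. Could the reply below work for ANY family, "
    "or is it unmistakably about THIS family?\n\n"
    "FAMILY DNA DOC:\n{dna}\n\nREPLY:\n{reply}\n\n"
    "Answer with exactly one word: PASS or FAIL."
)

def critique_and_rewrite(dna_doc: str, reply: str) -> tuple[str, bool]:
    """Second, focused Claude call: reject generic replies, rewrite on FAIL."""
    verdict = client.messages.create(
        model=MODEL,
        max_tokens=5,
        messages=[{"role": "user",
                   "content": CRITIQUE_PROMPT.format(dna=dna_doc, reply=reply)}],
    ).content[0].text.strip().upper()

    if verdict.startswith("PASS"):
        return reply, False  # no badge

    # FAIL: regenerate with the DNA doc injected more forcefully
    rewritten = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=[{"role": "user",
                   "content": "Rewrite this reply so it is unmistakably about this "
                              f"specific family.\n\nFAMILY DNA DOC:\n{dna_doc}"
                              f"\n\nORIGINAL REPLY:\n{reply}"}],
    ).content[0].text
    return rewritten, True  # True -> show the "rewritten for specificity" badge
```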
When Phase 3 added the `insights_cache` column to the `families` table, I didn't drop and recreate the DB. I wrote a migration block that runs `PRAGMA table_info` on startup and adds any missing columns with `ALTER TABLE`. Users upgrading from Phase 2 keep their family profile and history intact.
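A sketch of that startup migration, assuming `sqlite3` and a `families` table; the DB filename and column list are illustrative:

```python
import sqlite3

def migrate(db_path: str = "kindred.db") -> None:
    """Additive schema migration: create columns that shipped after this DB did."""
    wanted = {
        "insights_cache": "TEXT",  # shipped in Phase 3
        # each future feature adds its column here
    }
    conn = sqlite3.connect(db_path)
    # PRAGMA table_info returns one row per column; row[1] is the column name
    existing = {row[1] for row in conn.execute("PRAGMA table_info(families)")}
    for column, sql_type in wanted.items():
        if column not in existing:
            # ALTER TABLE ... ADD COLUMN is non-destructive, so Phase 2 users
            # keep their family profile and history intact
            conn.execute(f"ALTER TABLE families ADD COLUMN {column} {sql_type}")
    conn.commit()
    conn.close()
```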
The 12 prompt templates are parameterized with `.format()` at call time — the DNA doc is injected into every one. This lets new features ship as new prompts + new routes, not new models. A scalable pattern for feature velocity.
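The shape of that registry, sketched; the template text and keys are illustrative, not the real prompts:

```python
# One model, many products: each feature is a prompt template plus a route.
PROMPTS = {
    "chat": (
        "You are Kindred, a warm parenting coach.\n"
        "FAMILY DNA DOC:\n{dna}\n\nPARENT SAYS: {message}"
    ),
    "right_now": (
        "EMERGENCY MODE. Reply in under 120 words with a literal script.\n"
        "FAMILY DNA DOC:\n{dna}\n\nHAPPENING RIGHT NOW: {message}"
    ),
    "repair": (
        # doubled braces survive .format() as literal JSON braces
        "Return ONLY JSON: {{\"say_this\": \"...\", \"avoid\": \"...\"}}\n"
        "FAMILY DNA DOC:\n{dna}\n\nTHE RUPTURE: {message}"
    ),
}

def build_prompt(feature: str, dna_doc: str, message: str) -> str:
    # The invisible DNA doc is injected into every template at call time
    return PROMPTS[feature].format(dna=dna_doc, message=message)
```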
The Web Speech API runs on the user's device. Voice input is transcribed by the browser, not shipped to an STT service. Voice output uses the OS's built-in voices (Samantha on macOS, Zira on Windows). No extra API bill, no extra credential, no audio data ever leaves the user's computer.
Claude will occasionally wrap JSON in markdown fences or add a preamble, even when prompted not to. Every structured endpoint (`/api/repair`, `/api/insights`, `/api/trends`, etc.) strips markdown fences, retries the parse, and returns a structured error with the raw text if it still fails. Boring-but-essential LLM engineering.
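A minimal sketch of that defensive layer; the helper name is illustrative:

```python
import json
import re

def parse_llm_json(raw: str) -> dict:
    """Tolerate markdown fences and chatty preambles around Claude's JSON."""
    text = raw.strip()
    # 1. Strip ```json ... ``` fences if the model added them anyway
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        text = fenced.group(1).strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # 2. Retry on the outermost {...} span, in case of a preamble
    braced = re.search(r"\{.*\}", text, re.DOTALL)
    if braced:
        try:
            return json.loads(braced.group(0))
        except json.JSONDecodeError:
            pass
    # 3. Give up loudly: structured error plus the raw text for debugging
    return {"error": "unparseable_llm_output", "raw": raw}
```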
- Phase 5 (Q2 2026): Beta users, co-parent shared access, iOS native wrapper
- Phase 6 (Q3 2026): B2B pilots — schools (parent-teacher conference prep), pediatric clinics (observational summaries)
- Phase 7: Relationship memory — Kindred remembers conversations from months ago, surfaces them contextually
Built by Muhammad Farid — MS Business Analytics & AI at UT Dallas, currently a Data Science & AI Automation Analyst at Slick City Action Park. I build AI agents that solve specific operational problems:
- BriefRoom — Business intelligence dashboard for consultants (Claude + Streamlit)
- CreditSense — MCP-based agentic credit simulator (Node.js + React)
- SlidePark-AI — Demand forecasting + venue ops automation
- RecipeRAG — Hybrid RAG pipeline
Portfolio: muhammadfarid1990.github.io
Reach out: open to AI/ML engineering and applied AI product roles.
Built with Claude. 🟣



