Merged
Binary file added Assets/Landing-page-dark_mode.png
Binary file added Assets/Landing_page_light-mode.png
Binary file added Assets/Working.png
68 changes: 55 additions & 13 deletions README.md
@@ -1,6 +1,32 @@
# AI Powered Live Doubt Manager

A production-grade system that helps teachers manage live YouTube teaching sessions at scale. It polls the YouTube live chat in real time, uses Gemini AI to classify and cluster student questions, generates answers, and delivers them back to the teacher's dashboard over WebSocket — all while optionally posting responses directly into the stream.
> Real-time semantic clustering of YouTube live chat via Gemini embeddings, pgvector cosine search, and a 6-stage Redis worker pipeline — with RAG-augmented answer generation and WebSocket delivery.

> **Portfolio / demo project.** Built to demonstrate full-stack architecture with AI pipelines, real-time WebSockets, and worker-based processing. Not deployed to production.

- **Online nearest-centroid clustering** — each incoming question is embedded (768-dim Gemini vectors), compared against existing cluster centroids via pgvector cosine distance, and assigned or seeded into a new cluster in a single atomic transaction — no batch reprocessing.
- **Teacher-scoped RAG retrieval** — answer generation queries only the documents uploaded by the session's owner, using the cluster centroid as the search vector so retrieved context matches the cluster's theme, not just one question's phrasing.
- **Circuit-breaker-protected worker pipeline** — six independent workers connected by Redis ZSET queues with priority scoring, DLQ after 3 retries, and a circuit breaker on every Gemini call that trips open on sustained failures and exports state to Prometheus.

Teachers running live YouTube sessions are bombarded with chat messages — most are noise, but buried in the flood are genuine student questions. This system watches the live chat, uses Gemini AI to find and cluster those questions, and generates grounded answers using teacher-uploaded materials. The result is a real-time dashboard that turns an unreadable chat stream into an organized, actionable Q&A feed — built for educators who teach at scale.
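The join-or-seed decision in the first bullet above reduces to one nearest-neighbor query plus a threshold check. A minimal sketch, assuming pgvector's `<=>` cosine-distance operator on the `clusters.centroid_embedding` column named in the docs, and the 0.65 similarity threshold from `docs/workers/clustering.md` elsewhere in this PR; `NEAREST_CLUSTER_SQL` and `should_join` are illustrative names, not the project's API:

```python
from typing import Optional

# Similarity threshold per docs/workers/clustering.md (cosine similarity,
# where similarity = 1 - cosine distance). Illustrative sketch only.
SIMILARITY_THRESHOLD = 0.65

# `<=>` is pgvector's cosine-distance operator; ordering by it gives the
# nearest centroid first.
NEAREST_CLUSTER_SQL = """
SELECT id, 1 - (centroid_embedding <=> CAST(:emb AS vector)) AS similarity
FROM clusters
WHERE session_id = :sid
ORDER BY centroid_embedding <=> CAST(:emb AS vector)
LIMIT 1
"""

def should_join(similarity: Optional[float],
                threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """Join the nearest cluster only if it is similar enough; else seed a new one."""
    return similarity is not None and similarity >= threshold
```

In the worker, this query and the resulting insert or centroid update would run inside a single transaction, which is what makes the assignment atomic.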

## Screenshots

<table>
<tr>
<td><img src="Assets/Landing_page_light-mode.png" alt="Landing page — light mode" width="100%"></td>
<td><img src="Assets/Landing-page-dark_mode.png" alt="Landing page — dark mode" width="100%"></td>
</tr>
<tr>
<td colspan="2" align="center"><em>Landing page — light &amp; dark mode</em></td>
</tr>
</table>

<p align="center">
<img src="Assets/Working.png" alt="Live dashboard" width="100%">
<br>
<em>Live dashboard — real-time question clustering, AI answers, and YouTube integration</em>
</p>

## Stack

@@ -13,6 +39,14 @@ A production-grade system that helps teachers manage live YouTube teaching sessi
| Browser | Chrome extension (TypeScript + Vite) |
| Infrastructure | Docker Compose (local), Terraform (cloud), Prometheus + Grafana (observability) |

## How It Works

1. **Connect** — Teacher links their YouTube live stream via OAuth and starts a session
2. **Ingest** — A polling worker pulls new chat messages into a Redis queue every second
3. **Classify & Embed** — Gemini labels each message as a question or not, then generates a 768-dim embedding vector
4. **Cluster** — Nearest-centroid grouping via pgvector clusters similar questions together; answer generation triggers at milestone counts (3, 10, 25)
5. **Answer & Deliver** — RAG-augmented answers are generated from teacher-uploaded documents, pushed to the dashboard over WebSocket, and optionally posted back to YouTube chat
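The queue mechanics connecting these stages can be sketched as follows. Redis is replaced with a tiny in-memory sorted set so the sketch runs standalone; only the ZSET-with-priority shape and the 3-retry DLQ rule come from this README, while the scoring formula and payload fields are illustrative assumptions:

```python
import json
import time

class MiniZSet:
    """In-memory stand-in for one Redis sorted set (ZADD / ZPOPMIN)."""
    def __init__(self):
        self._items: dict[str, float] = {}

    def zadd(self, member: str, score: float) -> None:
        self._items[member] = score

    def zpopmin(self):
        if not self._items:
            return None
        member = min(self._items, key=self._items.get)
        return member, self._items.pop(member)

MAX_RETRIES = 3  # per the README: tasks move to the DLQ after 3 retries

def enqueue(queue: MiniZSet, payload: dict, priority: int = 0) -> None:
    # Lower score pops first; a priority bump simply shifts the score down.
    queue.zadd(json.dumps(payload), time.time() - priority * 1000)

def process_one(queue: MiniZSet, dlq: MiniZSet, handler):
    """Pop one task; on failure, retry up to MAX_RETRIES, then dead-letter it."""
    item = queue.zpopmin()
    if item is None:
        return None
    payload = json.loads(item[0])
    try:
        return handler(payload)
    except Exception:
        payload["retries"] = payload.get("retries", 0) + 1
        enqueue(dlq if payload["retries"] >= MAX_RETRIES else queue, payload)
        return None
```

Each of the six workers would pop from its own queue and enqueue the result onto the next stage's queue.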

## Architecture

@@ -46,14 +80,16 @@ Comments flow from YouTube → Redis workers → Gemini AI for classification an

## Features

- **Real-time question clustering** — student comments are embedded and clustered live using nearest-centroid algorithm with milestone triggers
- **RAG-augmented answers** — AI-generated answers grounded in teacher-uploaded documents (PDF, DOCX, TXT)
- **YouTube integration** — polls live chat, posts answers directly back to YouTube
- **Content moderation** — Gemini-powered filtering before classification and before YouTube posting
- **WebSocket dashboard** — real-time updates with exponential backoff reconnection and 100-message cap
- **Teacher isolation** — every data endpoint enforces ownership; RAG retrieval is scoped per teacher
- **Observability** — Prometheus metrics, circuit breaker pattern on all Gemini calls, structured logging
- **Scheduled maintenance** — automatic daily quota reset and hourly expired token cleanup
| Feature | What it does |
|---|---|
| Real-time question clustering | Student comments are embedded and clustered live using a nearest-centroid algorithm with milestone triggers |
| RAG-augmented answers | AI-generated answers grounded in teacher-uploaded documents (PDF, DOCX, TXT) |
| YouTube integration | Polls live chat, posts answers directly back to YouTube |
| Content moderation | Gemini-powered filtering before classification and before YouTube posting |
| WebSocket dashboard | Real-time updates with exponential backoff reconnection and 100-message cap |
| Teacher isolation | Every data endpoint enforces ownership; RAG retrieval is scoped per teacher |
| Observability | Prometheus metrics, circuit breaker pattern on all Gemini calls, structured logging |
| Scheduled maintenance | Automatic daily quota reset and hourly expired token cleanup |
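The circuit breaker named in the Observability row typically has the shape below. A sketch with assumed threshold and cooldown values; the project's actual parameters and its Prometheus state export are not shown in this diff:

```python
import time

class CircuitBreaker:
    """Fail fast when a downstream API (here: Gemini) is persistently down.

    Threshold and cooldown are illustrative defaults, not project settings.
    """
    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: calls suspended")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()  # trip open
            raise
        self.failures = 0  # a success closes the circuit
        return result
```

On sustained failures the breaker trips open and rejects calls immediately instead of hammering the API; one trial call is allowed after the cooldown elapses.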

## Quick Start

@@ -142,11 +178,17 @@ This opens a tmux session with 9 panes: backend API, 6 AI workers, scheduler, an
| Method | Path | Description |
|---|---|---|
| `GET` | `/health` | Health check |
| `POST` | `/api/v1/auth/register` | Register a new teacher |
| `POST` | `/api/v1/auth/login` | Authenticate, returns JWT |
| `GET` | `/api/v1/auth/me` | Get current authenticated teacher |
| `GET` | `/api/v1/sessions` | List teacher's sessions |
| `POST` | `/api/v1/sessions` | Create a new session |
| `GET` | `/api/v1/clusters` | List question clusters for a session |
| `POST` | `/api/v1/dashboard/approve` | Approve an AI-generated answer |
| `POST` | `/api/v1/sessions` | Create a new streaming session |
| `GET` | `/api/v1/sessions/{id}/clusters` | List question clusters for a session |
| `GET` | `/api/v1/sessions/{id}/analytics` | Get aggregate session analytics |
| `POST` | `/api/v1/dashboard/sessions/{id}/manual-question` | Submit a manual question |
| `POST` | `/api/v1/dashboard/answers/{id}/approve` | Approve an AI-generated answer |
| `GET` | `/api/v1/dashboard/sessions/{id}/stats` | Get session stats |
| `POST` | `/api/v1/rag/documents` | Upload a document for RAG retrieval |
| `GET` | `/api/v1/youtube/auth/url` | Start YouTube OAuth flow |
| `WS` | `/ws/{session_id}` | Real-time event stream |
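The dashboard's reconnection policy for `/ws/{session_id}` (exponential backoff, per the Features table) comes down to a capped delay schedule. The base and cap values here are illustrative assumptions:

```python
def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before reconnect attempt `attempt` (0-based), capped.

    Real clients usually add random jitter on top to avoid thundering herds.
    """
    return min(cap, base * (2 ** attempt))
```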

@@ -163,7 +205,7 @@ make test # run tests
## Known Limitations

- **No production deployment config** — docker-compose is development-oriented; nginx and production Dockerfile are not included
- **Chrome extension** — functional but not published to the Chrome Web Store
- **Chrome extension** — functional but currently in testing
- **YouTube quota** — the YouTube Data API v3 has daily quota limits; high-traffic sessions may hit limits
- **Single-region** — no multi-region or horizontal scaling configuration

6 changes: 4 additions & 2 deletions backend/alembic/env.py
@@ -10,8 +10,10 @@
pool,
)

# Add parent directory to path
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
# Add backend/ and project root to path so both app and workers are importable
_backend_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, _backend_dir)
sys.path.insert(0, os.path.dirname(_backend_dir))

from app.core.config import settings
from app.db.base import Base
@@ -0,0 +1,24 @@
"""make youtube_video_id nullable

Revision ID: e5f6a7b8c9d0
Revises: 6fe440076f64, d4e5f6a7b8c9
Create Date: 2026-03-18 12:00:00.000000

"""

from alembic import op

# revision identifiers, used by Alembic.
revision = "e5f6a7b8c9d0"
down_revision = ("6fe440076f64", "d4e5f6a7b8c9")
branch_labels = None
depends_on = None


def upgrade() -> None:
op.alter_column("streaming_sessions", "youtube_video_id", nullable=True)


def downgrade() -> None:
op.execute("UPDATE streaming_sessions SET youtube_video_id = '' WHERE youtube_video_id IS NULL")
op.alter_column("streaming_sessions", "youtube_video_id", nullable=False)
2 changes: 1 addition & 1 deletion backend/app/db/models/streaming_session.py
@@ -27,7 +27,7 @@ class StreamingSession(Base):

id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4, index=True)
teacher_id = Column(UUID(as_uuid=True), ForeignKey("teachers.id", ondelete="CASCADE"), nullable=False, index=True)
youtube_video_id = Column(String(255), nullable=False, index=True)
youtube_video_id = Column(String(255), nullable=True, index=True)
title = Column(String(500), nullable=True)
description = Column(Text, nullable=True)
is_active = Column(Boolean, default=True, nullable=False)
4 changes: 2 additions & 2 deletions backend/app/schemas/session.py
@@ -16,7 +16,7 @@
class SessionCreate(BaseModel):
"""Session creation schema."""

youtube_video_id: str
youtube_video_id: Optional[str] = None
title: Optional[str] = None
description: Optional[str] = None

@@ -35,7 +35,7 @@ class SessionResponse(BaseModel):

id: UUID
teacher_id: UUID
youtube_video_id: str
youtube_video_id: Optional[str]
title: Optional[str]
description: Optional[str]
is_active: bool
8 changes: 4 additions & 4 deletions docs/README.md
@@ -97,10 +97,10 @@ Complete listing of every file in this docs repository:
**Workers**
- [workers/overview.md](workers/overview.md) — QueueManager, retry, DLQ, runner.py
- [workers/classification.md](workers/classification.md) — ClassificationPayload → is_question
- [workers/embeddings.md](workers/embeddings.md) — EmbeddingPayload → 1536-dim pgvector
- [workers/clustering.md](workers/clustering.md) — Cosine similarity at 0.8 → Cluster CRUD
- [workers/embeddings.md](workers/embeddings.md) — EmbeddingPayload → 768-dim pgvector
- [workers/clustering.md](workers/clustering.md) — Cosine similarity at 0.65 → Cluster CRUD
- [workers/answer-generation.md](workers/answer-generation.md) — RAG + LLM → Answer record
- [workers/trigger-monitor.md](workers/trigger-monitor.md) — Count/interval thresholds → clustering dispatch
- [workers/trigger-monitor.md](workers/trigger-monitor.md) — **Stub** (not implemented, not started by start_dev.sh)
- [workers/youtube-polling.md](workers/youtube-polling.md) — ThreadPoolExecutor, chat_id cache, dedup
- [workers/youtube-posting.md](workers/youtube-posting.md) — YouTubePostingPayload → YouTube API

@@ -132,7 +132,7 @@ Complete listing of every file in this docs repository:
- [infra/configuration-reference.md](infra/configuration-reference.md) — Master table of ALL env vars

**Observability**
- [observability/metrics.md](observability/metrics.md) — All 6 Prometheus metrics with labels + PromQL
- [observability/metrics.md](observability/metrics.md) — All Prometheus metrics with labels + PromQL
- [observability/logging.md](observability/logging.md) — Log levels, JSON format, RequestContextMiddleware
- [observability/dashboards.md](observability/dashboards.md) — 4 Grafana dashboards: panels, import
- [observability/alerting.md](observability/alerting.md) — prometheus/rules.yml, thresholds, incident checklist
14 changes: 7 additions & 7 deletions docs/SYSTEM_DESIGN_REPORT.md
@@ -69,9 +69,9 @@ Comments (M) ──▶ (1) Clusters [SET NULL on cluster delete]

| Model | Column | Dimensions | Index | Purpose |
|-------|--------|-----------|-------|---------|
| `comments` | `embedding` | Vector(768) | None (sequential scan) | Semantic similarity for clustering |
| `clusters` | `centroid_embedding` | Vector(768) | None (sequential scan) | Running centroid for nearest-centroid matching |
| `rag_documents` | `embedding` | Vector(768) | None (sequential scan) | Cosine distance retrieval for answer grounding |
| `comments` | `embedding` | Vector(768) | HNSW (cosine_ops, m=16, ef=64) | Semantic similarity for clustering |
| `clusters` | `centroid_embedding` | Vector(768) | HNSW (cosine_ops, m=16, ef=64) | Running centroid for nearest-centroid matching |
| `rag_documents` | `embedding` | Vector(768) | HNSW (cosine_ops, m=16, ef=64) | Cosine distance retrieval for answer grounding |

**Why 768 dimensions (not Gemini's native 3072)**: 768 is the balance point — reduces storage and index cost by ~75% while retaining sufficient semantic fidelity for question clustering (not fine-grained retrieval). The code explicitly normalizes embeddings post-generation because Google requires normalization for non-3072 dimensions.
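That post-generation normalization step is just an L2 rescale of the vector. A minimal sketch (the function name is illustrative):

```python
import math

def normalize(vec: list) -> list:
    """Scale an embedding to unit L2 norm (no-op for the zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return vec if norm == 0.0 else [x / norm for x in vec]
```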

@@ -333,11 +333,11 @@ Pub/sub delivers only to currently connected subscribers. API restart = lost eve

**Mitigation**: Replace pub/sub with Redis Streams (`XADD`/`XREAD` with consumer groups). Streams persist messages and support "read from last acknowledged" semantics.

### 7.3 No pgvector Index on Vector Columns — **SEVERITY: MEDIUM (time-bomb)**
### 7.3 pgvector HNSW Index Tuning — **SEVERITY: LOW (resolved)**

Clustering worker's `ORDER BY centroid_embedding <=> :emb` performs sequential scan. Fine at O(100) clusters. Becomes a latency cliff at O(10,000+).
HNSW indexes have been added to all three vector columns (`comments.embedding`, `clusters.centroid_embedding`, `rag_documents.embedding`) using `vector_cosine_ops` with tuning parameters `m=16, ef_construction=64`. This resolves the original sequential scan concern.

**Mitigation**: Add HNSW index on `clusters.centroid_embedding` and `rag_documents.embedding`. Migration files exist for HNSW indexes but may not cover all vector columns.
**Remaining risk**: At very high scale (O(100K+) vectors per table), HNSW `ef_search` may need tuning for query latency. Monitor clustering worker query times.
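The remaining tuning knob is pgvector's query-time parameter `hnsw.ef_search` (higher values trade query latency for recall). A sketch of setting it per connection before the clustering query; the value of 100 is an illustrative starting point and `with_ef_search` is not the project's API:

```python
def with_ef_search(db, ef_search: int) -> None:
    """Apply pgvector's HNSW recall knob on this connection.

    `db` is any handle with an `execute(sql)` method; the statement text
    follows pgvector's documented `SET hnsw.ef_search` syntax.
    """
    db.execute(f"SET hnsw.ef_search = {int(ef_search)}")

# Illustrative starting point before the worker's
# ORDER BY centroid_embedding <=> :emb query:
DEFAULT_EF_SEARCH = 100
```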

### 7.4 YouTube Quota Exhaustion — **SEVERITY: MEDIUM**

@@ -445,7 +445,7 @@ Workers (100 connections) → PgBouncer (20 connections) → PostgreSQL
| Embeddings | Local model fallback (e.g., `all-MiniLM-L6-v2`) | Gemini cost > $100/day |
| YouTube quota | Quota tiering (premium teachers get 50K/day) | Teachers hitting limits daily |
| Worker autoscaling | Scale on queue depth (>100 tasks → spawn pod) | Sustained backlog |
| pgvector | Add HNSW index on vector columns | >10K clusters per session |
| pgvector | Tune HNSW `ef_search` parameter | >100K vectors per table |
| Redis | Redis Cluster (sharding) | >1GB memory or >50K ops/sec |
| API | Multiple uvicorn instances behind load balancer | >500 concurrent WebSocket connections |
