diff --git a/Assets/Landing-page-dark_mode.png b/Assets/Landing-page-dark_mode.png new file mode 100644 index 0000000..b89a72a Binary files /dev/null and b/Assets/Landing-page-dark_mode.png differ diff --git a/Assets/Landing_page_light-mode.png b/Assets/Landing_page_light-mode.png new file mode 100644 index 0000000..10bb32b Binary files /dev/null and b/Assets/Landing_page_light-mode.png differ diff --git a/Assets/Working.png b/Assets/Working.png new file mode 100644 index 0000000..bc3fde1 Binary files /dev/null and b/Assets/Working.png differ diff --git a/README.md b/README.md index 13c1cfb..250bce2 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,32 @@ # AI Powered Live Doubt Manager -A production-grade system that helps teachers manage live YouTube teaching sessions at scale. It polls the YouTube live chat in real time, uses Gemini AI to classify and cluster student questions, generates answers, and delivers them back to the teacher's dashboard over WebSocket — all while optionally posting responses directly into the stream. +> Real-time semantic clustering of YouTube live chat via Gemini embeddings, pgvector cosine search, and a 6-stage Redis worker pipeline — with RAG-augmented answer generation and WebSocket delivery. + +> **Portfolio / demo project.** Built to demonstrate full-stack architecture with AI pipelines, real-time WebSockets, and worker-based processing. Not deployed to production. + +- **Online nearest-centroid clustering** — each incoming question is embedded (768-dim Gemini vectors), compared against existing cluster centroids via pgvector cosine distance, and assigned or seeded into a new cluster in a single atomic transaction — no batch reprocessing. +- **Teacher-scoped RAG retrieval** — answer generation queries only the documents uploaded by the session's owner, using the cluster centroid as the search vector so retrieved context matches the cluster's theme, not just one question's phrasing. 
+- **Circuit-breaker-protected worker pipeline** — six independent workers connected by Redis ZSET queues with priority scoring, DLQ after 3 retries, and a circuit breaker on every Gemini call that trips open on sustained failures and exports state to Prometheus. + +Teachers running live YouTube sessions are bombarded with chat messages — most are noise, but buried in the flood are genuine student questions. This system watches the live chat, uses Gemini AI to find and cluster those questions, and generates grounded answers using teacher-uploaded materials. The result is a real-time dashboard that turns an unreadable chat stream into an organized, actionable Q&A feed — built for educators who teach at scale. + +## Screenshots + + + + + + + + + +
Landing page — light mode · Landing page — dark mode
Landing page — light & dark mode
+ +

+ Live dashboard +
+ Live dashboard — real-time question clustering, AI answers, and YouTube integration +

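The online nearest-centroid clustering described above can be sketched in a few lines. This is a minimal, dependency-free illustration, not the production worker: the 0.65 cosine cutoff comes from the clustering docs, while the running-mean centroid update and the `clusters` data shape are assumptions for the sketch.

```python
import math

SIMILARITY_THRESHOLD = 0.65  # documented cosine cutoff for joining a cluster


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def assign_or_seed(embedding, clusters):
    """Assign an embedding to the nearest centroid, or seed a new cluster.

    `clusters` is a list of dicts {"centroid": [...], "count": int}
    (a stand-in for the pgvector-backed Cluster rows).
    Returns the index of the cluster the question joined.
    """
    best_idx, best_sim = None, -1.0
    for i, cluster in enumerate(clusters):
        sim = cosine_similarity(embedding, cluster["centroid"])
        if sim > best_sim:
            best_idx, best_sim = i, sim

    if best_idx is not None and best_sim >= SIMILARITY_THRESHOLD:
        cluster = clusters[best_idx]
        n = cluster["count"]
        # Running-mean centroid update: no batch reprocessing needed.
        cluster["centroid"] = [
            (c * n + e) / (n + 1) for c, e in zip(cluster["centroid"], embedding)
        ]
        cluster["count"] = n + 1
        return best_idx

    # No centroid is close enough: seed a new cluster from this question.
    clusters.append({"centroid": list(embedding), "count": 1})
    return len(clusters) - 1
```

In the real pipeline the comparison runs as an `ORDER BY centroid_embedding <=> :emb` query and the assign-or-seed step happens inside a single transaction.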
## Stack @@ -13,6 +39,14 @@ A production-grade system that helps teachers manage live YouTube teaching sessi | Browser | Chrome extension (TypeScript + Vite) | | Infrastructure | Docker Compose (local), Terraform (cloud), Prometheus + Grafana (observability) | +## How It Works + +1. **Connect** — Teacher links their YouTube live stream via OAuth and starts a session +2. **Ingest** — A polling worker pulls new chat messages into a Redis queue every second +3. **Classify & Embed** — Gemini labels each message as a question or not, then generates a 768-dim embedding vector +4. **Cluster** — Nearest-centroid grouping via pgvector clusters similar questions together; answer generation triggers at milestone counts (3, 10, 25) +5. **Answer & Deliver** — RAG-augmented answers are generated from teacher-uploaded documents, pushed to the dashboard over WebSocket, and optionally posted back to YouTube chat + ## Architecture ``` @@ -46,14 +80,16 @@ Comments flow from YouTube → Redis workers → Gemini AI for classification an ## Features -- **Real-time question clustering** — student comments are embedded and clustered live using nearest-centroid algorithm with milestone triggers -- **RAG-augmented answers** — AI-generated answers grounded in teacher-uploaded documents (PDF, DOCX, TXT) -- **YouTube integration** — polls live chat, posts answers directly back to YouTube -- **Content moderation** — Gemini-powered filtering before classification and before YouTube posting -- **WebSocket dashboard** — real-time updates with exponential backoff reconnection and 100-message cap -- **Teacher isolation** — every data endpoint enforces ownership; RAG retrieval is scoped per teacher -- **Observability** — Prometheus metrics, circuit breaker pattern on all Gemini calls, structured logging -- **Scheduled maintenance** — automatic daily quota reset and hourly expired token cleanup +| Feature | What it does | +|---|---| +| Real-time question clustering | Student comments are embedded and 
clustered live using nearest-centroid algorithm with milestone triggers | +| RAG-augmented answers | AI-generated answers grounded in teacher-uploaded documents (PDF, DOCX, TXT) | +| YouTube integration | Polls live chat, posts answers directly back to YouTube | +| Content moderation | Gemini-powered filtering before classification and before YouTube posting | +| WebSocket dashboard | Real-time updates with exponential backoff reconnection and 100-message cap | +| Teacher isolation | Every data endpoint enforces ownership; RAG retrieval is scoped per teacher | +| Observability | Prometheus metrics, circuit breaker pattern on all Gemini calls, structured logging | +| Scheduled maintenance | Automatic daily quota reset and hourly expired token cleanup | ## Quick Start @@ -142,11 +178,17 @@ This opens a tmux session with 9 panes: backend API, 6 AI workers, scheduler, an | Method | Path | Description | |---|---|---| | `GET` | `/health` | Health check | +| `POST` | `/api/v1/auth/register` | Register a new teacher | | `POST` | `/api/v1/auth/login` | Authenticate, returns JWT | +| `GET` | `/api/v1/auth/me` | Get current authenticated teacher | | `GET` | `/api/v1/sessions` | List teacher's sessions | -| `POST` | `/api/v1/sessions` | Create a new session | -| `GET` | `/api/v1/clusters` | List question clusters for a session | -| `POST` | `/api/v1/dashboard/approve` | Approve an AI-generated answer | +| `POST` | `/api/v1/sessions` | Create a new streaming session | +| `GET` | `/api/v1/sessions/{id}/clusters` | List question clusters for a session | +| `GET` | `/api/v1/sessions/{id}/analytics` | Get aggregate session analytics | +| `POST` | `/api/v1/dashboard/sessions/{id}/manual-question` | Submit a manual question | +| `POST` | `/api/v1/dashboard/answers/{id}/approve` | Approve an AI-generated answer | +| `GET` | `/api/v1/dashboard/sessions/{id}/stats` | Get session stats | +| `POST` | `/api/v1/rag/documents` | Upload a document for RAG retrieval | | `GET` | 
`/api/v1/youtube/auth/url` | Start YouTube OAuth flow | | `WS` | `/ws/{session_id}` | Real-time event stream | @@ -163,7 +205,7 @@ make test # run tests ## Known Limitations - **No production deployment config** — docker-compose is development-oriented; nginx and production Dockerfile are not included -- **Chrome extension** — functional but not published to the Chrome Web Store +- **Chrome extension** — functional but currently in testing - **YouTube quota** — the YouTube Data API v3 has daily quota limits; high-traffic sessions may hit limits - **Single-region** — no multi-region or horizontal scaling configuration diff --git a/backend/alembic/env.py b/backend/alembic/env.py index 122bd55..0dc0968 100644 --- a/backend/alembic/env.py +++ b/backend/alembic/env.py @@ -10,8 +10,10 @@ pool, ) -# Add parent directory to path -sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) +# Add backend/ and project root to path so both app and workers are importable +_backend_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) +sys.path.insert(0, _backend_dir) +sys.path.insert(0, os.path.dirname(_backend_dir)) from app.core.config import settings from app.db.base import Base diff --git a/backend/alembic/versions/e5f6a7b8c9d0_make_youtube_video_id_nullable.py b/backend/alembic/versions/e5f6a7b8c9d0_make_youtube_video_id_nullable.py new file mode 100644 index 0000000..ae34c0c --- /dev/null +++ b/backend/alembic/versions/e5f6a7b8c9d0_make_youtube_video_id_nullable.py @@ -0,0 +1,24 @@ +"""make youtube_video_id nullable + +Revision ID: e5f6a7b8c9d0 +Revises: 6fe440076f64, d4e5f6a7b8c9 +Create Date: 2026-03-18 12:00:00.000000 + +""" + +from alembic import op + +# revision identifiers, used by Alembic. 
+revision = "e5f6a7b8c9d0" +down_revision = ("6fe440076f64", "d4e5f6a7b8c9") +branch_labels = None +depends_on = None + + +def upgrade() -> None: + op.alter_column("streaming_sessions", "youtube_video_id", nullable=True) + + +def downgrade() -> None: + op.execute("UPDATE streaming_sessions SET youtube_video_id = '' WHERE youtube_video_id IS NULL") + op.alter_column("streaming_sessions", "youtube_video_id", nullable=False) diff --git a/backend/app/db/models/streaming_session.py b/backend/app/db/models/streaming_session.py index d4a1174..6188ec6 100644 --- a/backend/app/db/models/streaming_session.py +++ b/backend/app/db/models/streaming_session.py @@ -27,7 +27,7 @@ class StreamingSession(Base): id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4, index=True) teacher_id = Column(UUID(as_uuid=True), ForeignKey("teachers.id", ondelete="CASCADE"), nullable=False, index=True) - youtube_video_id = Column(String(255), nullable=False, index=True) + youtube_video_id = Column(String(255), nullable=True, index=True) title = Column(String(500), nullable=True) description = Column(Text, nullable=True) is_active = Column(Boolean, default=True, nullable=False) diff --git a/backend/app/schemas/session.py b/backend/app/schemas/session.py index c4c7ccc..f57fabb 100644 --- a/backend/app/schemas/session.py +++ b/backend/app/schemas/session.py @@ -16,7 +16,7 @@ class SessionCreate(BaseModel): """Session creation schema.""" - youtube_video_id: str + youtube_video_id: Optional[str] = None title: Optional[str] = None description: Optional[str] = None @@ -35,7 +35,7 @@ class SessionResponse(BaseModel): id: UUID teacher_id: UUID - youtube_video_id: str + youtube_video_id: Optional[str] title: Optional[str] description: Optional[str] is_active: bool diff --git a/docs/README.md b/docs/README.md index 885c856..0fbc45b 100644 --- a/docs/README.md +++ b/docs/README.md @@ -97,10 +97,10 @@ Complete listing of every file in this docs repository: **Workers** - 
[workers/overview.md](workers/overview.md) — QueueManager, retry, DLQ, runner.py - [workers/classification.md](workers/classification.md) — ClassificationPayload → is_question -- [workers/embeddings.md](workers/embeddings.md) — EmbeddingPayload → 1536-dim pgvector -- [workers/clustering.md](workers/clustering.md) — Cosine similarity at 0.8 → Cluster CRUD +- [workers/embeddings.md](workers/embeddings.md) — EmbeddingPayload → 768-dim pgvector +- [workers/clustering.md](workers/clustering.md) — Cosine similarity at 0.65 → Cluster CRUD - [workers/answer-generation.md](workers/answer-generation.md) — RAG + LLM → Answer record -- [workers/trigger-monitor.md](workers/trigger-monitor.md) — Count/interval thresholds → clustering dispatch +- [workers/trigger-monitor.md](workers/trigger-monitor.md) — **Stub** (not implemented, not started by start_dev.sh) - [workers/youtube-polling.md](workers/youtube-polling.md) — ThreadPoolExecutor, chat_id cache, dedup - [workers/youtube-posting.md](workers/youtube-posting.md) — YouTubePostingPayload → YouTube API @@ -132,7 +132,7 @@ Complete listing of every file in this docs repository: - [infra/configuration-reference.md](infra/configuration-reference.md) — Master table of ALL env vars **Observability** -- [observability/metrics.md](observability/metrics.md) — All 6 Prometheus metrics with labels + PromQL +- [observability/metrics.md](observability/metrics.md) — All Prometheus metrics with labels + PromQL - [observability/logging.md](observability/logging.md) — Log levels, JSON format, RequestContextMiddleware - [observability/dashboards.md](observability/dashboards.md) — 4 Grafana dashboards: panels, import - [observability/alerting.md](observability/alerting.md) — prometheus/rules.yml, thresholds, incident checklist diff --git a/docs/SYSTEM_DESIGN_REPORT.md b/docs/SYSTEM_DESIGN_REPORT.md index ae446f0..4d6ad8e 100644 --- a/docs/SYSTEM_DESIGN_REPORT.md +++ b/docs/SYSTEM_DESIGN_REPORT.md @@ -69,9 +69,9 @@ Comments (M) ──▶ (1) Clusters 
[SET NULL on cluster delete] | Model | Column | Dimensions | Index | Purpose | |-------|--------|-----------|-------|---------| -| `comments` | `embedding` | Vector(768) | None (sequential scan) | Semantic similarity for clustering | -| `clusters` | `centroid_embedding` | Vector(768) | None (sequential scan) | Running centroid for nearest-centroid matching | -| `rag_documents` | `embedding` | Vector(768) | None (sequential scan) | Cosine distance retrieval for answer grounding | +| `comments` | `embedding` | Vector(768) | HNSW (cosine_ops, m=16, ef=64) | Semantic similarity for clustering | +| `clusters` | `centroid_embedding` | Vector(768) | HNSW (cosine_ops, m=16, ef=64) | Running centroid for nearest-centroid matching | +| `rag_documents` | `embedding` | Vector(768) | HNSW (cosine_ops, m=16, ef=64) | Cosine distance retrieval for answer grounding | **Why 768 dimensions (not Gemini's native 3072)**: 768 is the balance point — reduces storage and index cost by ~75% while retaining sufficient semantic fidelity for question clustering (not fine-grained retrieval). The code explicitly normalizes embeddings post-generation because Google requires normalization for non-3072 dimensions. @@ -333,11 +333,11 @@ Pub/sub delivers only to currently connected subscribers. API restart = lost eve **Mitigation**: Replace pub/sub with Redis Streams (`XADD`/`XREAD` with consumer groups). Streams persist messages and support "read from last acknowledged" semantics. -### 7.3 No pgvector Index on Vector Columns — **SEVERITY: MEDIUM (time-bomb)** +### 7.3 pgvector HNSW Index Tuning — **SEVERITY: LOW (resolved)** -Clustering worker's `ORDER BY centroid_embedding <=> :emb` performs sequential scan. Fine at O(100) clusters. Becomes a latency cliff at O(10,000+). +HNSW indexes have been added to all three vector columns (`comments.embedding`, `clusters.centroid_embedding`, `rag_documents.embedding`) using `vector_cosine_ops` with tuning parameters `m=16, ef_construction=64`. 
This resolves the original sequential scan concern. -**Mitigation**: Add HNSW index on `clusters.centroid_embedding` and `rag_documents.embedding`. Migration files exist for HNSW indexes but may not cover all vector columns. +**Remaining risk**: At very high scale (O(100K+) vectors per table), HNSW `ef_search` may need tuning for query latency. Monitor clustering worker query times. ### 7.4 YouTube Quota Exhaustion — **SEVERITY: MEDIUM** @@ -445,7 +445,7 @@ Workers (100 connections) → PgBouncer (20 connections) → PostgreSQL | Embeddings | Local model fallback (e.g., `all-MiniLM-L6-v2`) | Gemini cost > $100/day | | YouTube quota | Quota tiering (premium teachers get 50K/day) | Teachers hitting limits daily | | Worker autoscaling | Scale on queue depth (>100 tasks → spawn pod) | Sustained backlog | -| pgvector | Add HNSW index on vector columns | >10K clusters per session | +| pgvector | Tune HNSW `ef_search` parameter | >100K vectors per table | | Redis | Redis Cluster (sharding) | >1GB memory or >50K ops/sec | | API | Multiple uvicorn instances behind load balancer | >500 concurrent WebSocket connections | diff --git a/docs/api/rest.md b/docs/api/rest.md index 6b08815..0a65648 100644 --- a/docs/api/rest.md +++ b/docs/api/rest.md @@ -43,14 +43,21 @@ ### POST /api/v1/auth/login **Auth:** Public -**Request:** `application/x-www-form-urlencoded` — `username`, `password` +**Request:** +```json +{ + "email": "string", + "password": "string" +} +``` **Response 200:** ```json { "access_token": "string", "refresh_token": "string", - "token_type": "bearer" + "token_type": "bearer", + "expires_in": 1800 } ``` @@ -63,6 +70,42 @@ --- +### GET /api/v1/auth/me +**Auth:** Required + +**Response 200:** `TeacherResponse` + +--- + +### PATCH /api/v1/auth/profile +**Auth:** Required + +**Request:** +```json +{ + "name": "string" +} +``` + +**Response 200:** `TeacherResponse` + +--- + +### POST /api/v1/auth/change-password +**Auth:** Required + +**Request:** +```json +{ + 
"current_password": "string", + "new_password": "string" +} +``` + +**Response 200:** `{"message": "Password changed successfully"}` + +--- + ### POST /api/v1/auth/logout **Auth:** Bearer access_token @@ -112,9 +155,56 @@ --- +### PATCH /api/v1/sessions/{session_id} +**Auth:** Required (must own session) + +**Request:** +```json +{ + "title": "string | null", + "description": "string | null" +} +``` + +**Response 200:** `SessionResponse` + +--- + +### POST /api/v1/sessions/{session_id}/end +**Auth:** Required (must own session) + +**Response 200:** `SessionResponse` + +--- + +### GET /api/v1/sessions/{session_id}/clusters +**Auth:** Required (must own session) + +**Response 200:** `[ClusterResponse]` (includes nested answers) + +--- + +### GET /api/v1/sessions/{session_id}/analytics +**Auth:** Required (must own session) + +**Response 200:** +```json +{ + "total_questions": 0, + "total_clusters": 0, + "response_rate": 0.0, + "avg_cluster_size": 0.0, + "peak_hour": "string | null", + "questions_over_time": [{"hour": "ISO 8601", "count": 0}], + "top_clusters": [{"title": "string", "comment_count": 0}] +} +``` + +--- + ## Dashboard Endpoints (`/api/v1/dashboard`) -### POST /api/v1/dashboard/sessions/{session_id}/question +### POST /api/v1/dashboard/sessions/{session_id}/manual-question **Auth:** Required (must own session) Submit a manual question bypassing YouTube polling. @@ -139,7 +229,7 @@ Submit a manual question bypassing YouTube polling. --- -### PUT /api/v1/dashboard/answers/{answer_id}/edit +### PATCH /api/v1/dashboard/answers/{answer_id} **Auth:** Required **Request:** @@ -160,10 +250,25 @@ Submit a manual question bypassing YouTube polling. 
```json { "total_comments": 0, - "total_questions": 0, - "total_clusters": 0, - "total_answers": 0, - "posted_answers": 0 + "questions": 0, + "answered": 0, + "clusters": 0, + "answers_generated": 0, + "answers_posted": 0 +} +``` + +--- + +### GET /api/v1/dashboard/clusters/{cluster_id}/representative +**Auth:** Required (must own cluster's session) + +**Response 200:** +```json +{ + "comment_id": "uuid", + "text": "string", + "similarity": 0.95 } ``` @@ -194,17 +299,191 @@ Returns HTML that postMessages to opener window. --- ### POST /api/v1/youtube/auth/refresh +**Auth:** Required + +Refreshes the YouTube access token using stored refresh token. + +**Response 200:** `{"status": "refreshed"}` + +--- + ### GET /api/v1/youtube/auth/status +**Auth:** Required + +**Response 200:** +```json +{ + "connected": true, + "expires_at": "ISO 8601 | null" +} +``` + +--- + ### DELETE /api/v1/youtube/auth/disconnect +**Auth:** Required + +**Response 204:** No content + +--- + ### GET /api/v1/youtube/videos/{video_id}/validate +**Auth:** Required - +**Response 200:** +```json +{ + "valid": true, + "is_live": true, + "title": "string" +} +``` + +--- + +## RAG Document Endpoints (`/api/v1/rag`) + +### POST /api/v1/rag/documents +**Auth:** Required + +Upload a PDF, DOCX, or TXT file for RAG retrieval. Multipart form data. 
+ +**Response 200:** +```json +{ + "chunks_created": 5, + "document_ids": ["uuid", "..."] +} +``` + +--- + +### GET /api/v1/rag/documents +**Auth:** Required + +**Response 200:** `[{"id": "uuid", "title": "string", "source_type": "string", "created_at": "ISO 8601"}]` + +--- + +### DELETE /api/v1/rag/documents/{doc_id} +**Auth:** Required (must own document) + +**Response 204:** No content + +--- + +## Comment Endpoints (`/api/v1/comments`) + +### GET /api/v1/comments/{comment_id} +**Auth:** Required (must own comment's session) + +**Response 200:** `CommentResponse` + +--- + +### PATCH /api/v1/comments/{comment_id} +**Auth:** Required (must own comment's session) + +Marks the comment as answered (`is_answered=True`). + +**Response 200:** `CommentResponse` + +--- + +## Cluster Endpoints (`/api/v1/clusters`) + +### GET /api/v1/clusters/{cluster_id} +**Auth:** Required (must own cluster's session) + +**Response 200:** `ClusterResponse` + +--- + +### GET /api/v1/clusters/{cluster_id}/comments +**Auth:** Required (must own cluster's session) + +**Query params:** `limit` (default: 50) + +**Response 200:** `[CommentResponse]` + +--- + +### PATCH /api/v1/clusters/{cluster_id} +**Auth:** Required (must own cluster's session) + +**Request:** +```json +{ + "title": "string | null", + "description": "string | null" +} +``` + +**Response 200:** `ClusterResponse` + +--- + +## Answer Endpoints (`/api/v1/answers`) + +### POST /api/v1/answers +**Auth:** Required + +**Request:** +```json +{ + "cluster_id": "uuid", + "comment_id": "uuid | null", + "text": "string" +} +``` + +**Response 201:** `AnswerResponse` --- -## Document Endpoints (`/api/v1/documents`) +### GET /api/v1/answers/{answer_id} +**Auth:** Required (must own answer's session) - +**Response 200:** `AnswerResponse` + +--- + +### PATCH /api/v1/answers/{answer_id} +**Auth:** Required + +**Request:** +```json +{ + "text": "string" +} +``` + +**Response 200:** `AnswerResponse` + +--- + +### POST 
/api/v1/answers/{answer_id}/post +**Auth:** Required + +Marks answer as posted (`is_posted=True`, `posted_at=now`). + +**Response 200:** `AnswerResponse` + +--- + +## Metrics Endpoint (`/api/v1/metrics`) + +### GET /api/v1/metrics +**Auth:** Required + +**Response 200:** +```json +{ + "active_sessions": 0, + "questions_processed": 0, + "answers_generated": 0 +} +``` --- diff --git a/docs/api/websocket-events.md b/docs/api/websocket-events.md index 2ebe929..6a5927c 100644 --- a/docs/api/websocket-events.md +++ b/docs/api/websocket-events.md @@ -1,6 +1,6 @@ # WebSocket Events -> Purpose: All 14 event types with exact JSON payloads and the base envelope. +> Purpose: All event types with exact JSON payloads and the base envelope. @@ -8,10 +8,11 @@ ## Connection ``` -ws://{host}/ws/sessions/{session_id}?token={access_token} +ws://{host}/ws/{session_id} ``` -**Auth:** Optional `?token=` query param. +**Auth:** First message must be `{"type": "auth", "token": ""}`. +Optional `?token=` query param also supported. - Missing/invalid token: close code `4001` - Session not owned by user: close code `4003` @@ -32,16 +33,17 @@ All events follow this envelope: ## Event Types -### 1. `comment_received` +### 1. `comment_created` Emitted when a new comment (YouTube or manual) is ingested. ```json { - "type": "comment_received", + "type": "comment_created", "session_id": "uuid", "timestamp": "...", "data": { "comment_id": "uuid", + "session_id": "uuid", "text": "string", "author": "string", "youtube_comment_id": "string" @@ -70,26 +72,63 @@ Emitted after classification worker processes a comment. --- ### 3. `comment_embedded` - +Emitted after embeddings worker generates the comment's vector. + +```json +{ + "type": "comment_embedded", + "session_id": "uuid", + "timestamp": "...", + "data": { + "comment_id": "uuid" + } +} +``` --- ### 4. `cluster_created` - +Emitted when the clustering worker creates a new cluster. 
+ +```json +{ + "type": "cluster_created", + "session_id": "uuid", + "timestamp": "...", + "data": { + "id": "uuid", + "title": "string", + "comment_count": 1 + } +} +``` --- ### 5. `cluster_updated` - +Emitted when a comment joins an existing cluster. + +```json +{ + "type": "cluster_updated", + "session_id": "uuid", + "timestamp": "...", + "data": { + "id": "uuid", + "title": "string", + "comment_count": 5 + } +} +``` --- -### 6. `answer_generated` +### 6. `answer_ready` Emitted when answer_generation worker creates an Answer record. ```json { - "type": "answer_generated", + "type": "answer_ready", "session_id": "uuid", "timestamp": "...", "data": { @@ -103,17 +142,23 @@ Emitted when answer_generation worker creates an Answer record. --- -### 7. `answer_approved` - - ---- +### 7. `cluster_summary_failed` +Emitted when Gemini cluster title summarization fails. -### 8. `answer_edited` - +```json +{ + "type": "cluster_summary_failed", + "session_id": "uuid", + "timestamp": "...", + "data": { + "cluster_id": "uuid" + } +} +``` --- -### 9. `answer_posted` +### 8. `answer_posted` Emitted by youtube_posting worker after successfully posting to YouTube. ```json @@ -131,9 +176,12 @@ Emitted by youtube_posting worker after successfully posting to YouTube. --- -### 10–14. `...` - +### 9. `quota_alert` +Emitted when YouTube API quota is running low. + +### 10. `quota_exceeded` +Emitted when YouTube API quota is exhausted. ## Heartbeat - +The WebSocket connection supports ping/pong for keep-alive. Heartbeat interval: 30 seconds (configurable via `WEBSOCKET_HEARTBEAT_INTERVAL`). Connection timeout: 300 seconds (configurable via `WEBSOCKET_TIMEOUT`). 
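Client code consuming this stream can guard on the shared envelope before dispatching on event type. A minimal sketch — the type set below mirrors the events listed above; it is written here for illustration, not exported by the backend:

```python
import json

# Event types documented above (not an exhaustive server-side constant).
KNOWN_EVENT_TYPES = {
    "comment_created", "comment_embedded",
    "cluster_created", "cluster_updated",
    "answer_ready", "cluster_summary_failed", "answer_posted",
    "quota_alert", "quota_exceeded",
}


def parse_event(raw: str) -> dict:
    """Validate the base envelope {type, session_id, timestamp, data}."""
    event = json.loads(raw)
    for field in ("type", "session_id", "timestamp", "data"):
        if field not in event:
            raise ValueError(f"envelope missing required field: {field}")
    if event["type"] not in KNOWN_EVENT_TYPES:
        raise ValueError(f"unknown event type: {event['type']}")
    return event
```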
diff --git a/docs/architecture/data-flow.md b/docs/architecture/data-flow.md index 8531c00..b4b2272 100644 --- a/docs/architecture/data-flow.md +++ b/docs/architecture/data-flow.md @@ -23,13 +23,13 @@ QUEUE_EMBEDDING │ ▼ (5) Embed [embeddings worker] - │ 1536-dim vector → pgvector + │ 768-dim vector → pgvector ▼ (6) Enqueue QUEUE_CLUSTERING │ ▼ (7) Cluster [clustering worker] - │ cosine similarity ≥ 0.8 → assign to cluster + │ cosine similarity ≥ 0.65 → assign to cluster ▼ (8) Enqueue QUEUE_ANSWER_GENERATION │ @@ -61,10 +61,10 @@ QUEUE_ANSWER_GENERATION ### Step 5 — Embed - + ### Step 6–7 — Cluster - + ### Step 8–9a — Generate Answer diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md index be2a0d9..f2fb091 100644 --- a/docs/architecture/overview.md +++ b/docs/architecture/overview.md @@ -6,15 +6,31 @@ ## System Purpose - +AI Live Doubt Manager helps teachers manage live YouTube teaching sessions at scale. It polls the YouTube live chat in real time, uses Gemini AI to classify and cluster student questions, generates RAG-augmented answers, and delivers them back to the teacher's dashboard over WebSocket — with optional posting directly to the YouTube stream. + +**Users:** Teachers running live YouTube educational sessions. +**Problem:** Live chat during popular sessions becomes an unreadable flood. Genuine student questions get buried. Teachers cannot respond to questions at scale. 
## Component Inventory - +| Component | Technology | Location | +|-----------|-----------|----------| +| Backend API | FastAPI (Python 3.13), uvicorn | `backend/app/` | +| PostgreSQL | PostgreSQL 15 + pgvector extension | Docker or native | +| Redis | Redis 7 | Docker or native | +| Worker Pipeline | 6 independent Python processes + scheduler | `workers/` | +| Frontend SPA | React 19, Vite 7, Recharts | `frontend/src/` | +| Chrome Extension | TypeScript, Manifest V3 | `chrome-extension/` | +| AI Services | Google Gemini (gemini-2.5-flash, gemini-embedding-001) | Via API | +| Observability | Prometheus metrics, structured logging | `backend/app/core/metrics.py` | ## Design Principles - +1. **Decouple ingress from processing** — Redis queues separate API/polling from AI work, allowing each stage to fail independently +2. **Ownership-first data access** — Every query JOINs to StreamingSession and filters `teacher_id == current_user.id` +3. **Real-time without coupling** — Workers publish to Redis pub/sub; the API relays to WebSocket clients. Workers never hold WS connections +4. **Online over batch** — Nearest-centroid clustering processes each question as it arrives, no batch re-processing +5. **Fail to queue, not to void** — Tasks persist in Redis ZSET until consumed; DLQ captures exhausted retries ## Anti-Duplication Rules diff --git a/docs/architecture/scalability.md b/docs/architecture/scalability.md index 5a21f01..8e2bbf1 100644 --- a/docs/architecture/scalability.md +++ b/docs/architecture/scalability.md @@ -32,7 +32,11 @@ See [infra/deployment.md](../infra/deployment.md) for load balancer configuratio ## pgvector at Scale - + + +Current indexing: HNSW indexes on all three vector columns (`comments.embedding`, `clusters.centroid_embedding`, `rag_documents.embedding`) using `vector_cosine_ops` with tuning parameters `m=16, ef_construction=64`. + +At current scale (hundreds to low thousands of comments per session), HNSW handles queries efficiently. 
At O(10K+) clusters per session, monitor query latency and consider increasing `ef_search` or partitioning by session. ## Queue Scaling diff --git a/docs/backend/auth.md b/docs/backend/auth.md index 4f4f6c2..0e0b307 100644 --- a/docs/backend/auth.md +++ b/docs/backend/auth.md @@ -24,7 +24,7 @@ bcrypt with cost factor 12 (`bcrypt_rounds = 12`). All protected endpoints use `Depends(get_current_active_user)`, which: 1. Extracts Bearer token from `Authorization` header 2. Decodes and validates JWT signature + expiry -3. Checks token blacklist (Redis key `blacklist:{jti}`) +3. Checks token blacklist (Redis key `blacklist:token:{hash}`) 4. Loads user from DB; verifies `is_active=True` 5. Returns `User` model or raises `401 Unauthorized` @@ -33,7 +33,7 @@ All protected endpoints use `Depends(get_current_active_user)`, which: On logout, the token's `jti` claim is written to Redis with TTL matching the remaining token lifetime. This ensures invalidated tokens cannot be reused even before expiry. -Redis key pattern: `blacklist:{jti}` +Redis key pattern: `blacklist:token:{hash}` ## Refresh Flow diff --git a/docs/backend/overview.md b/docs/backend/overview.md index 85786cc..6bd62db 100644 --- a/docs/backend/overview.md +++ b/docs/backend/overview.md @@ -9,9 +9,9 @@ `backend/app/main.py` — FastAPI app with lifespan context manager. ### Lifespan Sequence -1. -2. `_relay_redis_events()` background task subscribes to `ws:session:{id}` channels -3. +1. Clear stale Prometheus multiprocess metric files from `PROMETHEUS_MULTIPROC_DIR` +2. `manager.start_subscriber()` — background task subscribes to Redis `ws:session:*` channels for WebSocket relay +3. 
On shutdown: cancel relay task, log shutdown ## Router List @@ -21,14 +21,22 @@ | `/api/v1/sessions` | `backend/app/api/v1/sessions.py` | Session CRUD | | `/api/v1/dashboard` | `backend/app/api/v1/dashboard.py` | Manual question, approve, edit, stats | | `/api/v1/youtube` | `backend/app/api/v1/youtube.py` | YouTube OAuth + video validation | -| `/api/v1/documents` | `backend/app/api/v1/documents.py` | RAG document upload/list/delete | -| | | | +| `/api/v1/rag` | `backend/app/api/v1/rag.py` | RAG document upload/list/delete | +| `/api/v1/comments` | `backend/app/api/v1/comments.py` | Comment get/mark answered | +| `/api/v1/clusters` | `backend/app/api/v1/clusters.py` | Cluster get/update/comments | +| `/api/v1/answers` | `backend/app/api/v1/answers.py` | Answer CRUD + post | +| `/api/v1/metrics` | `backend/app/api/v1/metrics.py` | Teacher-scoped metrics | +| `/ws` | `backend/app/api/v1/websocket.py` | WebSocket real-time events | Full endpoint specs: [api/rest.md](../api/rest.md) ## Middleware Stack - +Execution order (outermost middleware runs first on request): + +1. **RateLimitMiddleware** — Redis-backed IP throttling (60 req/min default). Skips `/health`, `/metrics`, `/docs`. +2. **RequestContextMiddleware** — Injects `X-Request-ID`, tracks request duration, records Prometheus metrics. +3. **CORSMiddleware** — Configured origins from `CORS_ORIGINS` setting (includes `:5173` for Vite dev). 
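The fixed-window throttling in RateLimitMiddleware can be sketched with a plain dict standing in for Redis. This is an assumption-laden illustration — the real middleware's key layout and window semantics may differ; only the 60 req/min default comes from the docs:

```python
import time


class FixedWindowLimiter:
    """60 requests per 60-second window per client IP (documented defaults)."""

    def __init__(self, limit: int = 60, window_seconds: int = 60, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        # In-memory stand-in for Redis INCR + EXPIRE per window key.
        self._counters: dict[str, int] = {}

    def allow(self, ip: str) -> bool:
        window_id = int(self.clock()) // self.window
        key = f"ratelimit:{ip}:{window_id}"  # hypothetical key scheme
        count = self._counters.get(key, 0) + 1
        self._counters[key] = count
        return count <= self.limit
```

With Redis, the same logic is an `INCR` on the window key plus an `EXPIRE` set on first increment, so stale windows clean themselves up.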
## Key Dependencies diff --git a/docs/backend/services.md b/docs/backend/services.md index 500e751..fab125a 100644 --- a/docs/backend/services.md +++ b/docs/backend/services.md @@ -10,9 +10,10 @@ | Method | Purpose | Used by | |--------|---------|---------| -| `classify_comment()` | Returns `is_question`, `confidence_score` | classification worker | -| `generate_embedding()` | Returns 1536-dim vector | embeddings worker | +| `classify_question()` | Returns `is_question`, `confidence_score` | classification worker | +| `generate_embedding()` | Returns 768-dim vector | embeddings worker | | `generate_answer()` | Returns answer text given context | answer_generation worker | +| `summarize_cluster()` | Returns cluster title summary | clustering worker (at count=3) | ## RAG Document Service diff --git a/docs/backend/websocket.md b/docs/backend/websocket.md index 5a67c17..e280266 100644 --- a/docs/backend/websocket.md +++ b/docs/backend/websocket.md @@ -15,10 +15,13 @@ Responsibilities: ## Authentication -WebSocket connections accept an optional `?token=` query parameter. -- If provided: validated as JWT; session ownership verified -- Close code `4001`: invalid/expired token -- Close code `4003`: session not owned by this user +WebSocket auth uses a first-message pattern. After connecting to `/ws/{session_id}`: +1. Client sends `{"type": "auth", "token": ""}` as the first message +2. Backend validates JWT signature, checks blacklist, and verifies session ownership +3. 
An optional `?token=` query parameter is also supported + +- Close code `4001`: invalid/expired token or missing auth message +- Close code `4003`: valid token but session not owned by this user ## Heartbeat diff --git a/docs/chrome-extension/build-and-load.md b/docs/chrome-extension/build-and-load.md index 9d051f9..33d5ff3 100644 --- a/docs/chrome-extension/build-and-load.md +++ b/docs/chrome-extension/build-and-load.md @@ -2,6 +2,8 @@ > Purpose: Build steps, load unpacked, dev vs prod backend URL configuration. +> **Status:** The extension is fully working and can be loaded unpacked for development and testing. It is not yet published to the Chrome Web Store. + ## Build diff --git a/docs/chrome-extension/overview.md b/docs/chrome-extension/overview.md index bb314b8..aa06d4f 100644 --- a/docs/chrome-extension/overview.md +++ b/docs/chrome-extension/overview.md @@ -2,6 +2,8 @@ > Purpose: Manifest V3 permissions, background Service Worker lifecycle, alarms keep-alive, and content script. +> **Status:** Fully functional. Not yet published to the Chrome Web Store — currently in testing via "Load unpacked" in developer mode. 
+ ## Stack diff --git a/docs/data/schema.md b/docs/data/schema.md index bca8f28..3f09a21 100644 --- a/docs/data/schema.md +++ b/docs/data/schema.md @@ -8,27 +8,29 @@ ## Entity Relationship Overview ``` -User ──< StreamingSession ──< Comment ──> Cluster - │ - └──< Answer -Document (RAG store, standalone) +Teacher (1) ──▶ (M) StreamingSession (1) ──▶ (M) Comment + (1) ──▶ (M) Cluster (1) ──▶ (M) Answer +Comment (M) ──▶ (1) Cluster [SET NULL on cluster delete] +Answer (M) ──▶ (1) Comment [SET NULL on comment delete] +Teacher (1) ──▶ (1) YouTubeToken +Teacher (1) ──▶ (M) Quota +Teacher (1) ──▶ (M) RAGDocument ``` - - --- -## User +## Teacher | Field | Type | Constraints | Notes | |-------|------|-------------|-------| -| `id` | UUID | PK | | -| `email` | str | UNIQUE, NOT NULL | | -| `hashed_password` | str | NOT NULL | bcrypt 12 rounds | -| `name` | str | NOT NULL | | -| `is_active` | bool | DEFAULT True | | -| `created_at` | datetime | NOT NULL | | -| `updated_at` | datetime | NOT NULL | | +| `id` | UUID | PK | `default=uuid.uuid4` | +| `email` | String(255) | UNIQUE, NOT NULL, indexed | | +| `name` | String(255) | NOT NULL | | +| `hashed_password` | String(255) | NOT NULL | bcrypt 12 rounds | +| `is_active` | Boolean | NOT NULL, DEFAULT True | | +| `is_verified` | Boolean | NOT NULL, DEFAULT False | | +| `created_at` | DateTime(tz) | NOT NULL | `datetime.now(timezone.utc)` | +| `updated_at` | DateTime(tz) | NOT NULL | auto-updated | --- @@ -36,16 +38,18 @@ Document (RAG store, standalone) | Field | Type | Constraints | Notes | |-------|------|-------------|-------| -| `id` | UUID | PK | | -| `teacher_id` | UUID | FK → User.id, NOT NULL | ownership check | -| `title` | str | NOT NULL | | -| `youtube_video_id` | str | nullable | | -| `youtube_access_token` | str | nullable | encrypted via `encrypt_data` | -| `youtube_refresh_token` | str | nullable | encrypted | -| `youtube_chat_id` | str | nullable | | -| `is_active` | bool | DEFAULT False | | -| `created_at` | 
datetime | NOT NULL | | -| `last_clustered_at` | datetime | nullable | used by trigger-monitor | +| `id` | UUID | PK | `default=uuid.uuid4` | +| `teacher_id` | UUID | FK → Teacher.id (CASCADE), NOT NULL, indexed | ownership check | +| `youtube_video_id` | String(255) | nullable, indexed | optional — sessions can exist without YouTube | +| `title` | String(500) | nullable | | +| `description` | Text | nullable | | +| `is_active` | Boolean | NOT NULL, DEFAULT True | | +| `started_at` | DateTime(tz) | NOT NULL | `datetime.now(timezone.utc)` | +| `ended_at` | DateTime(tz) | nullable | set when session ends | +| `created_at` | DateTime(tz) | NOT NULL | | +| `updated_at` | DateTime(tz) | NOT NULL | auto-updated | + +**Indexes:** `idx_session_teacher_active` (teacher_id, is_active), `idx_session_youtube_video` (youtube_video_id) --- @@ -53,18 +57,22 @@ Document (RAG store, standalone) | Field | Type | Constraints | Notes | |-------|------|-------------|-------| -| `id` | UUID | PK | | -| `session_id` | UUID | FK → StreamingSession.id, NOT NULL | | -| `youtube_comment_id` | str | NOT NULL, UNIQUE | `"manual:{uuid4()}"` for manual | -| `text` | str | NOT NULL | | -| `author` | str | nullable | YouTube display name | -| `is_question` | bool | nullable | set by classification worker | -| `confidence_score` | float | nullable | 0.0–1.0 | -| `embedding` | Vector(1536) | nullable | pgvector; set by embeddings worker | -| `cluster_id` | UUID | FK → Cluster.id, nullable | set by clustering worker | -| `created_at` | datetime | NOT NULL | | - -**Indexes:** `session_id`, `cluster_id`, `youtube_comment_id` (unique) +| `id` | UUID | PK | `default=uuid.uuid4` | +| `session_id` | UUID | FK → StreamingSession.id (CASCADE), NOT NULL, indexed | | +| `cluster_id` | UUID | FK → Cluster.id (SET NULL), nullable, indexed | set by clustering worker | +| `youtube_comment_id` | String(255) | NOT NULL, UNIQUE, indexed | `"manual:{uuid4()}"` for manual comments | +| `author_name` | String(255) | NOT 
NULL | YouTube display name | +| `author_channel_id` | String(255) | nullable | YouTube channel ID | +| `text` | Text | NOT NULL | | +| `is_question` | Boolean | NOT NULL, DEFAULT False | set by classification worker | +| `is_answered` | Boolean | NOT NULL, DEFAULT False | | +| `confidence_score` | Float | nullable | 0.0–1.0 | +| `embedding` | Vector(768) | nullable | pgvector; HNSW indexed (cosine_ops, m=16, ef=64) | +| `published_at` | DateTime(tz) | nullable | YouTube publish timestamp | +| `created_at` | DateTime(tz) | NOT NULL | | +| `updated_at` | DateTime(tz) | NOT NULL | auto-updated | + +**Indexes:** `idx_comment_session_question` (session_id, is_question), `idx_comment_session_answered` (session_id, is_answered), `idx_comment_cluster` (cluster_id), `idx_comments_embedding_hnsw` (embedding, vector_cosine_ops) --- @@ -72,12 +80,17 @@ Document (RAG store, standalone) | Field | Type | Constraints | Notes | |-------|------|-------------|-------| -| `id` | UUID | PK | | -| `session_id` | UUID | FK → StreamingSession.id, NOT NULL | | -| `centroid` | Vector(1536) | NOT NULL | mean of member embeddings | -| `representative_text` | str | nullable | text of seed/most-representative comment | -| `created_at` | datetime | NOT NULL | | -| `updated_at` | datetime | NOT NULL | | +| `id` | UUID | PK | `default=uuid.uuid4` | +| `session_id` | UUID | FK → StreamingSession.id (CASCADE), NOT NULL, indexed | | +| `title` | String(500) | NOT NULL | summarized by Gemini at count=3 | +| `description` | Text | nullable | | +| `similarity_threshold` | Float | NOT NULL, DEFAULT 0.8 | per-cluster threshold | +| `centroid_embedding` | Vector(768) | nullable | HNSW indexed (cosine_ops, m=16, ef=64) | +| `comment_count` | Integer | NOT NULL, DEFAULT 0 | denormalized for centroid update formula | +| `created_at` | DateTime(tz) | NOT NULL | | +| `updated_at` | DateTime(tz) | NOT NULL | auto-updated | + +**Indexes:** `idx_cluster_session` (session_id), 
`idx_clusters_centroid_embedding_hnsw` (centroid_embedding, vector_cosine_ops) --- @@ -85,28 +98,71 @@ Document (RAG store, standalone) | Field | Type | Constraints | Notes | |-------|------|-------------|-------| -| `id` | UUID | PK | | -| `cluster_id` | UUID | FK → Cluster.id, NOT NULL | | -| `session_id` | UUID | FK → StreamingSession.id, NOT NULL | | -| `text` | str | NOT NULL | generated by LLM | -| `is_approved` | bool | DEFAULT False | teacher must approve | -| `is_posted` | bool | DEFAULT False | set by youtube_posting worker | -| `posted_at` | datetime | nullable | | -| `created_at` | datetime | NOT NULL | | -| `updated_at` | datetime | NOT NULL | | +| `id` | UUID | PK | `default=uuid.uuid4` | +| `cluster_id` | UUID | FK → Cluster.id (CASCADE), NOT NULL, indexed | | +| `comment_id` | UUID | FK → Comment.id (SET NULL), nullable, indexed | representative comment | +| `text` | Text | NOT NULL | generated by LLM | +| `youtube_comment_id` | String(255) | UNIQUE, nullable | YouTube reply ID after posting | +| `is_posted` | Boolean | NOT NULL, DEFAULT False | set by youtube_posting worker | +| `posted_at` | DateTime(tz) | nullable | | +| `created_at` | DateTime(tz) | NOT NULL | | +| `updated_at` | DateTime(tz) | NOT NULL | auto-updated | + +**Indexes:** `idx_answer_cluster_posted` (cluster_id, is_posted) + +--- + +## YouTubeToken + +| Field | Type | Constraints | Notes | +|-------|------|-------------|-------| +| `id` | UUID | PK | `default=uuid.uuid4` | +| `teacher_id` | UUID | FK → Teacher.id (CASCADE), UNIQUE, NOT NULL | one token per teacher | +| `access_token` | Text | NOT NULL | encrypted at rest via Fernet | +| `refresh_token` | Text | nullable | encrypted at rest | +| `token_type` | String(50) | NOT NULL, DEFAULT "Bearer" | | +| `scope` | Text | nullable | | +| `expires_at` | DateTime(tz) | NOT NULL | | +| `created_at` | DateTime(tz) | NOT NULL | | +| `updated_at` | DateTime(tz) | NOT NULL | auto-updated | + +--- + +## Quota + +| Field | Type | 
Constraints | Notes | +|-------|------|-------------|-------| +| `id` | UUID | PK | `default=uuid.uuid4` | +| `teacher_id` | UUID | FK → Teacher.id (CASCADE), NOT NULL, indexed | | +| `quota_type` | String(50) | NOT NULL | e.g., "youtube_poll", "youtube_post", "gemini" | +| `used` | Integer | NOT NULL, DEFAULT 0 | | +| `limit` | Integer | NOT NULL | | +| `period` | String(20) | NOT NULL | "daily" or "monthly" | +| `reset_at` | DateTime(tz) | NOT NULL | | +| `created_at` | DateTime(tz) | NOT NULL | | +| `updated_at` | DateTime(tz) | NOT NULL | auto-updated | + +**Constraints:** `uq_teacher_quota_period` (teacher_id, quota_type, period) +**Indexes:** `idx_quota_teacher_type` (teacher_id, quota_type) --- -## Document (RAG) +## RAGDocument | Field | Type | Constraints | Notes | |-------|------|-------------|-------| -| `id` | UUID | PK | | -| `teacher_id` | UUID | FK → User.id, NOT NULL | | -| `filename` | str | NOT NULL | | -| `content_chunk` | str | NOT NULL | extracted text chunk | -| `embedding` | Vector(1536) | NOT NULL | | -| `created_at` | datetime | NOT NULL | | +| `id` | UUID | PK | `default=uuid.uuid4` | +| `teacher_id` | UUID | FK → Teacher.id (CASCADE), nullable | nullable for potential global docs | +| `title` | String(500) | NOT NULL | | +| `content` | Text | NOT NULL | extracted text content | +| `source_type` | String(50) | nullable | "pdf", "url", "text" | +| `source_url` | String(1000) | nullable | | +| `doc_metadata` | JSONB | nullable | custom metadata dict (column name: "metadata") | +| `embedding` | Vector(768) | nullable | HNSW indexed (cosine_ops, m=16, ef=64) | +| `created_at` | DateTime(tz) | NOT NULL | | +| `updated_at` | DateTime(tz) | NOT NULL | auto-updated | + +**Indexes:** `idx_rag_source_type` (source_type), `idx_rag_documents_embedding_hnsw` (embedding, vector_cosine_ops) --- diff --git a/docs/data/shared-contracts.md b/docs/data/shared-contracts.md index 533facb..c8999a3 100644 --- a/docs/data/shared-contracts.md +++ 
b/docs/data/shared-contracts.md @@ -30,7 +30,7 @@ any TypeScript consumers. |--------|---------| | `ClassificationPayload` | classification worker (producer: polling), | | `EmbeddingPayload` | embeddings worker | -| `ClusteringPayload` | clustering worker, trigger-monitor | +| `ClusteringPayload` | clustering worker | | `AnswerGenerationPayload` | answer_generation worker | | `YouTubePostingPayload` | youtube_posting worker | diff --git a/docs/frontend/api-client.md b/docs/frontend/api-client.md index 902e87c..f230bda 100644 --- a/docs/frontend/api-client.md +++ b/docs/frontend/api-client.md @@ -26,24 +26,31 @@ | `createSession(data)` | `POST /api/v1/sessions` | | | `getSession(id)` | `GET /api/v1/sessions/{id}` | | | `getSessionStats(id)` | `GET /api/v1/dashboard/sessions/{id}/stats` | Returns stats object | -| `submitManualQuestion(sessionId, text)` | `POST /api/v1/dashboard/sessions/{id}/question` | Uses dashboard route | +| `submitManualQuestion(sessionId, text)` | `POST /api/v1/dashboard/sessions/{id}/manual-question` | Uses dashboard route; max 10 questions per submission, one per line | | `approveAnswer(answerId)` | `POST /api/v1/dashboard/answers/{id}/approve` | **Takes answerId, NOT clusterId** | -| `editAnswer(answerId, text)` | `PUT /api/v1/dashboard/answers/{id}/edit` | | +| `editAnswer(answerId, text)` | `PATCH /api/v1/dashboard/answers/{id}` | | | `getYouTubeAuthUrl()` | `GET /api/v1/youtube/auth/url` | | | `getYouTubeAuthStatus()` | `GET /api/v1/youtube/auth/status` | | | `disconnectYouTube()` | `DELETE /api/v1/youtube/auth/disconnect` | | | `validateYouTubeVideo(videoId)` | `GET /api/v1/youtube/videos/{id}/validate` | | -| `uploadDocument(formData)` | `POST /api/v1/documents` | multipart/form-data | -| `getDocuments()` | `GET /api/v1/documents` | | -| `deleteDocument(id)` | `DELETE /api/v1/documents/{id}` | | -| `updateProfile(data)` | `PUT /api/v1/auth/profile` (verify path) | | -| `changePassword(data)` | `PUT /api/v1/auth/password` (verify path) | | +|
`uploadDocument(sessionId, file, token, onProgress)` | `POST /api/v1/rag/documents` | XHR-based with progress callback; multipart/form-data | +| `getDocuments({token, sessionId?})` | `GET /api/v1/rag/documents` | | +| `deleteDocument(documentId, token)` | `DELETE /api/v1/rag/documents/{id}` | | +| `updateProfile(data, token)` | `PATCH /api/v1/auth/profile` | | +| `changePassword(currentPassword, newPassword, token)` | `POST /api/v1/auth/change-password` | | +| `getSessionComments(sessionId, token, limit, offset)` | `GET /api/v1/sessions/{id}/comments?limit=&offset=` | Paginated; default limit=100 | +| `getSessionClusters(sessionId, token)` | `GET /api/v1/sessions/{id}/clusters` | | +| `getSessionAnalytics(sessionId, token)` | `GET /api/v1/sessions/{id}/analytics` | | +| `endSession(id, token)` | `POST /api/v1/sessions/{id}/end` | | +| `getClusterComments(clusterId, token)` | `GET /api/v1/clusters/{id}/comments` | | +| `getRepresentativeQuestion(clusterId, token)` | `GET /api/v1/dashboard/clusters/{id}/representative` | | +| `refreshAccessToken()` | `POST /api/v1/auth/refresh` | Deduped via refreshPromise | ## Critical Quirks - `approveAnswer(answerId)` — takes `answer_id`, **not** `cluster_id`. This is a common - mistake. The answer ID is found in the `answer_generated` WebSocket event's `data.answer_id`. -- `submitManualQuestion` — uses the dashboard route (`/api/v1/dashboard/sessions/{id}/question`), + mistake. The answer ID is found in the `answer_ready` WebSocket event's `data.answer_id`. +- `submitManualQuestion` — uses the dashboard route (`/api/v1/dashboard/sessions/{id}/manual-question`), not a generic comments endpoint. - Token injection — `token` (obtained from `useAuth`) must be in the dependency array of any `useEffect` that calls API functions, or the stale closure will use the old token.
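The `refreshAccessToken()` dedupe noted in the table can be sketched as follows (a hypothetical standalone version of the pattern; the real client wires this into its 401 handling):

```javascript
// Share one in-flight refresh among concurrent callers, so a burst of 401s
// triggers a single POST /api/v1/auth/refresh instead of one call per request.
let refreshPromise = null;

function refreshAccessToken(doRefresh) {
  if (!refreshPromise) {
    refreshPromise = doRefresh().finally(() => {
      refreshPromise = null; // allow a fresh refresh once this one settles
    });
  }
  return refreshPromise;
}
```

Concurrent callers receive the same promise, so they all resolve with the same new token.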
diff --git a/docs/frontend/components.md b/docs/frontend/components.md index b0536b8..85722f8 100644 --- a/docs/frontend/components.md +++ b/docs/frontend/components.md @@ -8,8 +8,11 @@ ### LandingPage `frontend/src/pages/LandingPage.jsx` -- **Renders:** Marketing/landing content, links to login/register -- **Data:** None +- **Renders:** Dark-themed landing page with orange accent (#FF6B35). Header with logo + nav, hero section with tagline and CTA buttons, "How It Works" section with 3 numbered feature cards (Connect, AI Processes, Review & Post), footer with tech attribution +- **Fonts:** Azeret Mono (display), Outfit (body) +- **Animations:** CSS entrance animations (anim-nav, anim-1 through anim-4) +- **Behavior:** Redirects authenticated users to `/dashboard` +- **Data:** None (static content) ### LoginPage `frontend/src/pages/LoginPage.jsx` @@ -51,7 +54,8 @@ ### ManualInput `frontend/src/components/Dashboard/ManualInput.jsx` - **Renders:** Text input for manually submitting a question -- **Data:** `submitManualQuestion()` → `POST /api/v1/dashboard/sessions/{id}/question` +- **Data:** `submitManualQuestion()` → `POST /api/v1/dashboard/sessions/{id}/manual-question` +- **Limits:** Max 10 questions per submission (one per line), text 1-5000 chars - **Critical quirks:** Uses dashboard route, not a generic comments route ### MetricsCards @@ -63,30 +67,70 @@ ### QuestionsFeed `frontend/src/components/Dashboard/QuestionsFeed.jsx` - **Renders:** Real-time feed of classified questions -- **Data:** Initial load + `comment_received`, `comment_classified` WS events -- **Events:** `comment_received`, `comment_classified` +- **Data:** Initial load + `comment_created`, `comment_classified` WS events +- **Events:** `comment_created`, `comment_classified` ### ClustersPanel `frontend/src/components/Dashboard/ClustersPanel.jsx` - **Renders:** Clusters with associated questions and generated answers -- **Data:** `cluster_created`, `cluster_updated`, `answer_generated` WS events 
-- **Events:** `cluster_created`, `cluster_updated`, `answer_generated`, `answer_posted` +- **Data:** `cluster_created`, `cluster_updated`, `answer_ready` WS events +- **Events:** `cluster_created`, `cluster_updated`, `answer_ready`, `answer_posted` ### ActivityLog `frontend/src/components/Dashboard/ActivityLog.jsx` -- **Renders:** -- **Data:** +- **Renders:** Chronological feed of the last 20 events (newest first), each with emoji icon, label, and relative timestamp +- **Event types:** comment_created, comment_classified, cluster_created, cluster_updated, answer_ready, answer_posted, session_started, session_ended, quota_alert, quota_exceeded +- **Data:** WebSocket messages from `useWebSocket` hook ### AnalyticsPanel `frontend/src/components/Dashboard/AnalyticsPanel.jsx` -- **Renders:** -- **Data:** +- **Renders:** Stats grid (total questions, clusters answered %, avg cluster size, peak hour), cumulative line chart (Questions Over Time via Recharts), hourly bar chart, top topics ranked list, export buttons (CSV and JSON) +- **Data:** `GET /api/v1/sessions/{id}/analytics` — debounced refetch on WebSocket events (2000ms) +- **Export:** CSV (Questions, Answers, Cluster, Timestamp, Is Posted) and JSON (detailed cluster objects) + +### DocumentUpload +`frontend/src/components/Dashboard/DocumentUpload.jsx` +- **Renders:** Collapsible section with file input (.pdf, .docx, .txt, max 10MB), upload progress bar, list of uploaded documents with delete buttons +- **Data:** `POST /api/v1/rag/documents` (XHR-based with progress), `GET /api/v1/rag/documents`, `DELETE /api/v1/rag/documents/{id}` + +### QuotaBanner +`frontend/src/components/Dashboard/QuotaBanner.jsx` +- **Renders:** Alert bar at top of dashboard when YouTube quota is running low (warning) or exhausted (critical). Dismissible. 
+- **Data:** `quota_alert` and `quota_exceeded` WebSocket events + +### ClusterDetailsModal +`frontend/src/components/Dashboard/ClusterDetailsModal.jsx` +- **Renders:** Modal overlay showing cluster title, all comments in the cluster, and generated answer. Click outside closes. +- **Data:** `GET /api/v1/clusters/{id}/comments` + +### KeyboardShortcutsModal +`frontend/src/components/Dashboard/KeyboardShortcutsModal.jsx` +- **Renders:** Modal with table of keyboard shortcuts (? = help, N = new session, A = approve first pending, Ctrl+K = focus input, Esc = close) +- **Data:** None (static content) --- ## Auth / Layout Components - +### Header +`frontend/src/components/Layout/Header.jsx` +- **Renders:** Logo, active session name with "LIVE" badge, connection status dot (connected/connecting/reconnecting), user name, theme toggle, Settings link, Logout button + +### ProtectedRoute +`frontend/src/components/Layout/ProtectedRoute.jsx` +- **Renders:** Wraps children; redirects to `/login` if not authenticated + +### LoginForm / RegisterForm +`frontend/src/components/Auth/LoginForm.jsx`, `RegisterForm.jsx` +- **Renders:** Email + password form (RegisterForm adds name field), error display, loading state, link to alternate auth page + +### ErrorBoundary +`frontend/src/components/ErrorBoundary.jsx` +- **Renders:** Catches React render errors, shows error details + Retry button (reloads page) + +### GlobalShortcutsHandler +`frontend/src/components/GlobalShortcutsHandler.jsx` +- **Renders:** Invisible — listens for `?` (show shortcuts modal) and `Esc` (close modal). Only active when authenticated. 
--- @@ -94,11 +138,13 @@ | Event | Consumed by | |-------|-------------| -| `comment_received` | QuestionsFeed | -| `comment_classified` | QuestionsFeed | -| `cluster_created` | ClustersPanel | -| `cluster_updated` | ClustersPanel | -| `answer_generated` | ClustersPanel | -| `answer_posted` | ClustersPanel, MetricsCards | +| `comment_created` | QuestionsFeed, MetricsCards, ActivityLog | +| `comment_classified` | QuestionsFeed, MetricsCards, ActivityLog | +| `cluster_created` | ClustersPanel, MetricsCards, ActivityLog | +| `cluster_updated` | ClustersPanel, ActivityLog | +| `answer_ready` | ClustersPanel, MetricsCards, ActivityLog | +| `answer_posted` | ClustersPanel, MetricsCards, ActivityLog | +| `quota_alert` | QuotaBanner, ActivityLog | +| `quota_exceeded` | QuotaBanner, ActivityLog | For full event payload shapes, see [api/websocket-events.md](../api/websocket-events.md). diff --git a/docs/frontend/overview.md b/docs/frontend/overview.md index 0d35f2f..a174ce0 100644 --- a/docs/frontend/overview.md +++ b/docs/frontend/overview.md @@ -43,7 +43,7 @@ Provides: - `logout()` — clears tokens, calls logout API - `refreshToken()` — auto-refreshes on 401 -Token storage: +Token storage: `localStorage` — keys `token` and `refreshToken`. Required for WebSocket auth (first-message JWT pattern). 
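The client side of that first-message auth pattern can be sketched like this (the socket factory is injectable here so the flow can be shown without a live backend; `connectWithAuth` is illustrative, not the project's actual helper):

```javascript
// Open the session socket and send the auth frame as the very first message.
function connectWithAuth(sessionId, token, makeSocket) {
  const ws = makeSocket(`/ws/${sessionId}`);
  ws.onopen = () => {
    // Auth must be the FIRST message, or the server closes with code 4001.
    ws.send(JSON.stringify({ type: "auth", token }));
  };
  return ws;
}
```

In the browser, `makeSocket` would simply be `(url) => new WebSocket(backendBase + url)`.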
Hook: `useAuth()` from `frontend/src/hooks/useAuth.js` diff --git a/docs/frontend/state-and-hooks.md b/docs/frontend/state-and-hooks.md index 637e54f..d18e588 100644 --- a/docs/frontend/state-and-hooks.md +++ b/docs/frontend/state-and-hooks.md @@ -12,12 +12,15 @@ ```js const { - user, // { id, email, name } | null token, // string | null — JWT access token - isLoading, // bool — true during initial auth check + displayName, // string | null — user's display name + userEmail, // string | null + userName, // string | null + isAuthenticated,// bool login, // async (email, password) → void (throws on error) logout, // async () → void - refreshToken, // async () → string — new access token + register, // async (email, password, name) → void + updateProfile, // async (name) → void } = useAuth() ``` @@ -35,21 +38,23 @@ Must be used inside ``. Throws if called outside context. ```js const { - messages, // array — last 100 events (capped) - isConnected, // bool - lastEvent, // most recent event object | null + messages, // array — last 100 events (capped, oldest dropped) + connected, // bool + reconnecting, // bool — true during reconnection attempts } = useWebSocket(sessionId, token) ``` ### Behavior - **Reconnection:** Exponential backoff on disconnect - - Initial delay: - - Max delay: - - Gives up after: + - Initial delay: 1000ms + - Max delay: 30000ms (capped) + - Formula: `Math.min(1000 * 2^retry_count, 30000)` + - Gives up after: 10 retries (MAX_RETRIES=10) + - Auth failures (4001, 4003) do NOT retry - **Message cap:** Caps at 100 messages (oldest dropped on overflow) - Prevents memory growth during long sessions -- **Auth:** Passes `?token={token}` in WebSocket URL +- **Auth:** Sends `{"type": "auth", "token": ""}` as first message after connection opens - **Cleanup:** Closes connection on component unmount or sessionId change ### Event Handling @@ -61,17 +66,36 @@ to `lastEvent` or filter `messages` by `type`. 
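The reconnection schedule above works out to the following (a sketch of the documented formula and retry rules, not the hook's actual code):

```javascript
const MAX_RETRIES = 10;

// Exponential backoff: 1s, 2s, 4s, 8s, 16s, then capped at 30s.
function backoffDelay(retryCount) {
  return Math.min(1000 * 2 ** retryCount, 30000);
}

function shouldRetry(retryCount, closeCode) {
  // Auth failures are terminal; everything else retries up to the cap.
  if (closeCode === 4001 || closeCode === 4003) return false;
  return retryCount < MAX_RETRIES;
}
```
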
## useToast -`frontend/src/hooks/useToast.js` (verify path) +`frontend/src/hooks/useToast.js` - +### Contract + +```js +import { showToast } from '../hooks/useToast' + +showToast('Operation succeeded', 'success') // types: 'info' | 'error' | 'success' +``` + +Global function — can be called from anywhere (no Provider needed). `ToastContainer` subscribes on mount. Auto-dismisses after 4000ms. --- ## useKeyboardShortcuts -`frontend/src/hooks/useKeyboardShortcuts.js` (verify path) +`frontend/src/hooks/useKeyboardShortcuts.js` + +### Contract + +```js +useKeyboardShortcuts({ + onNewSession, // called on 'N' key + onApproveFirst, // called on 'A' key + onFocusSearch, // called on Ctrl+K / Cmd+K + enabled: true, // disable when modal open, etc. +}) +``` - +Ignores shortcuts when active element is INPUT, TEXTAREA, SELECT, or contentEditable. `Ctrl+K` / `Cmd+K` works even while typing (preventDefault). --- @@ -79,7 +103,8 @@ to `lastEvent` or filter `messages` by `type`. `frontend/src/context/AuthContext.jsx` -Manages token storage and auto-refresh. Token is stored in: - +Manages token storage and auto-refresh. -Auto-refresh: +- **Storage:** `localStorage` keys: `token`, `refreshToken` +- **Auto-refresh:** On mount, checks stored token expiry (JWT decode). If expired, attempts refresh using stored refresh token. If refresh fails, logs out and redirects to `/login`. +- **Cross-tab:** ThemeContext (separate) syncs dark/light mode across tabs via `storage` events. 
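The on-mount expiry check in AuthContext can be sketched like this (assumes the standard JWT layout; this is purely client-side gatekeeping, since the server still verifies the signature and blacklist):

```javascript
// Decode the JWT payload (base64url) without verifying the signature and
// compare its `exp` claim (seconds since epoch) against the current time.
function isTokenExpired(token, nowMs = Date.now()) {
  const payloadB64 = token.split(".")[1].replace(/-/g, "+").replace(/_/g, "/");
  const payload = JSON.parse(atob(payloadB64));
  return payload.exp * 1000 <= nowMs;
}
```

If this returns `true` for the stored `token`, the context attempts a refresh with `refreshToken`; if that also fails, it logs out.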
diff --git a/docs/infra/configuration-reference.md b/docs/infra/configuration-reference.md index 24c5fda..48db693 100644 --- a/docs/infra/configuration-reference.md +++ b/docs/infra/configuration-reference.md @@ -23,9 +23,27 @@ | `CORS_ORIGINS` | str (comma-separated) | `http://localhost:5173,http://localhost:8080` | No | backend | | | `LOG_LEVEL` | str | `INFO` | No | backend, workers | | | `LOG_JSON` | bool | `False` | No | backend | Structured JSON logging | -| `ENCRYPTION_KEY` | str | — | Yes | backend | For `encrypt_data`/`decrypt_data` | -| `YOUTUBE_POLLING_INTERVAL` | int | | No | workers | Seconds between polls | -| | | | | | | +| `ENCRYPTION_KEY` | str | — | Yes | backend | ≥32 chars; Fernet encryption for OAuth tokens | +| `ENVIRONMENT` | str | `development` | No | backend | "development", "staging", "production" | +| `DEBUG` | bool | `False` | No | backend | | +| `RATE_LIMIT_REQUESTS_PER_MINUTE` | int | `60` | No | backend | Per-IP rate limiting | +| `RATE_LIMIT_ENABLED` | bool | `True` | No | backend | | +| `PASSWORD_BCRYPT_ROUNDS` | int | `12` | No | backend | | +| `GEMINI_MODEL` | str | `gemini-2.5-flash` | No | workers | Classification + answer gen model | +| `GEMINI_EMBEDDING_MODEL` | str | `gemini-embedding-001` | No | workers | Embedding model | +| `CLASSIFICATION_CONFIDENCE_THRESHOLD` | float | `0.4` | No | workers | Min confidence for embedding queue | +| `CLUSTERING_SIMILARITY_THRESHOLD` | float | `0.65` | No | workers | pgvector cosine similarity cutoff | +| `DEFAULT_DAILY_ANSWER_LIMIT` | int | `100` | No | backend | Per-teacher daily answer limit | +| `DEFAULT_MONTHLY_SESSION_LIMIT` | int | `30` | No | backend | Per-teacher monthly session limit | +| `ENABLE_METRICS` | bool | `True` | No | backend, workers | Prometheus metrics | +| `METRICS_PORT` | int | `9090` | No | backend | | +| `WEBSOCKET_HEARTBEAT_INTERVAL` | int | `30` | No | backend | Seconds between heartbeats | +| `WEBSOCKET_TIMEOUT` | int | `300` | No | backend | Connection timeout 
in seconds | +| `MOCK_YOUTUBE` | bool | `False` | No | workers | Use mock YouTube polling | +| `DATABASE_POOL_SIZE` | int | `5` | No | backend | SQLAlchemy pool size | +| `DATABASE_MAX_OVERFLOW` | int | `10` | No | backend | SQLAlchemy max overflow | +| `REDIS_MAX_CONNECTIONS` | int | `10` | No | backend, workers | | +| `PROMETHEUS_MULTIPROC_DIR` | str | `/tmp/prometheus_multiproc` | No | backend, workers | Multiprocess metrics dir | ## Notes diff --git a/docs/infra/deployment.md b/docs/infra/deployment.md index 7236f42..390883a 100644 --- a/docs/infra/deployment.md +++ b/docs/infra/deployment.md @@ -31,8 +31,8 @@ FastAPI will: | Service | Dockerfile | Image | |---------|-----------|-------| -| Backend | `backend/Dockerfile` | `ai-doubt-manager-backend` | -| Workers | `workers/Dockerfile` | `ai-doubt-manager-workers` | +| Backend | `infra/docker/api.Dockerfile` | `ai-doubt-manager-backend` | +| Workers | `infra/docker/worker.Dockerfile` | `ai-doubt-manager-workers` | ## Terraform diff --git a/docs/infra/local-dev.md b/docs/infra/local-dev.md index b4af91a..8668dc2 100644 --- a/docs/infra/local-dev.md +++ b/docs/infra/local-dev.md @@ -7,7 +7,7 @@ ## Prerequisites - Docker + Docker Compose -- Python 3.11+ +- Python 3.13+ - Node.js 20+ - Redis (or via Docker) - PostgreSQL (or via Docker) @@ -17,7 +17,7 @@ ```yaml services: postgres: - image: pgvector/pgvector:pg16 + image: pgvector/pgvector:pg15 ports: ["5432:5432"] volumes: [postgres_data:/var/lib/postgresql/data] @@ -42,14 +42,8 @@ DATABASE_URL=postgresql://user:password@localhost:5432/ai_doubt_manager_dev alem cd backend uvicorn app.main:app --reload --port 8000 -# Start workers (each in a separate terminal) -cd workers && python runner.py classification -cd workers && python runner.py embeddings -cd workers && python runner.py clustering -cd workers && python runner.py answer_generation -cd workers && python runner.py youtube_polling -cd workers && python runner.py youtube_posting -cd workers && python runner.py 
trigger_monitor +# Start ALL services (backend + 6 workers + scheduler + frontend) via tmux +./start_dev.sh # Start frontend cd frontend @@ -69,15 +63,17 @@ npm run dev # → http://localhost:5173 ## Makefile Commands - - ```bash -make up # Start all Docker services -make down # Stop all Docker services -make migrate # Run Alembic migrations -make test # Run test suite -make lint # Run linters -# Add more as defined +make run-backend # uvicorn app.main:app --reload on :8000 +make format # Black + isort (line-length=119) +make lint # Ruff + flake8 + pylint +make test # pytest backend/tests workers -v +make test-coverage # Coverage report to HTML + terminal +make migrate # alembic upgrade head +make migration MSG="..." # alembic revision --autogenerate +make downgrade # alembic downgrade -1 +make install # pip install requirements for backend + workers +make clean # Remove __pycache__, .pyc, .egg-info, .pytest_cache ``` ## Environment Files diff --git a/docs/observability/logging.md b/docs/observability/logging.md index e1a315d..10f2606 100644 --- a/docs/observability/logging.md +++ b/docs/observability/logging.md @@ -37,11 +37,12 @@ Enable via `LOG_JSON=true` env var (recommended in production). ## RequestContextMiddleware -`backend/app/middleware/` (verify path) +`backend/app/core/middleware.py` - Assigns a unique `request_id` (UUID) to each incoming HTTP request - Injects `request_id` into log context for the duration of the request - Returns `X-Request-ID` response header +- Returns `X-Process-Time` response header (request duration) - Workers inherit a `task_id` for equivalent correlation ## Worker Logging diff --git a/docs/observability/metrics.md b/docs/observability/metrics.md index 4b75821..f6d78af 100644 --- a/docs/observability/metrics.md +++ b/docs/observability/metrics.md @@ -27,30 +27,88 @@ Metrics are exposed at: `GET /metrics` (Prometheus text format) --- -### 3. -- **Type:** -- **Labels:** -- **Description:** -- **PromQL:** +### 3. 
`websocket_connections_active` +- **Type:** Gauge (multiprocess liveall mode) +- **Labels:** `session_id` +- **Description:** Currently active WebSocket connections per session +- **PromQL:** `sum(websocket_connections_active)` --- -### 4. - +### 4. `websocket_messages_total` +- **Type:** Counter +- **Labels:** `type`, `direction` +- **Description:** Total WebSocket messages sent/received +- **PromQL:** `rate(websocket_messages_total[5m])` + +--- + +### 5. `queue_size` +- **Type:** Gauge (multiprocess liveall mode) +- **Labels:** `queue_name` +- **Description:** Current number of items in each Redis queue +- **PromQL:** `queue_size{queue_name="classification"}` + +--- + +### 6. `queue_processed_total` +- **Type:** Counter +- **Labels:** `queue_name`, `status` +- **Description:** Total items processed per queue (success/failure) +- **PromQL:** `rate(queue_processed_total{status="success"}[5m])` + +--- + +### 7. `database_queries_total` +- **Type:** Counter +- **Labels:** `operation`, `table` +- **Description:** Total database queries by operation type and table +- **PromQL:** `rate(database_queries_total[5m])` + +--- + +### 8. `database_query_duration_seconds` +- **Type:** Histogram +- **Labels:** `operation`, `table` +- **Description:** Database query latency in seconds +- **PromQL:** `histogram_quantile(0.95, rate(database_query_duration_seconds_bucket[5m]))` --- -### 5. +### 9. `redis_operations_total` +- **Type:** Counter +- **Labels:** `operation` +- **Description:** Total Redis operations +- **PromQL:** `rate(redis_operations_total[5m])` --- -### 6. +### 10. `worker_heartbeat` +- **Type:** Gauge (multiprocess liveall mode) +- **Labels:** `worker_name` +- **Description:** Last heartbeat timestamp per worker +- **PromQL:** `time() - worker_heartbeat` --- +### 11. 
`quota_usage` / `quota_limit` +- **Type:** Gauge (multiprocess liveall mode) +- **Labels:** `teacher_id`, `quota_type` +- **Description:** Current quota usage and limits per teacher +- **PromQL:** `quota_usage / quota_limit` + ## Worker Queue Metrics - +Workers export additional metrics via `workers/common/metrics.py`: + +| Metric | Type | Labels | Description | +|--------|------|--------|-------------| +| `worker_items_processed_total` | Counter | `worker_name` | Items processed per worker | +| `worker_processing_duration_seconds` | Histogram | `worker_name` | Processing time per item | +| `worker_errors_total` | Counter | `worker_name` | Errors per worker | +| `gemini_circuit_state` | Gauge | `worker_name` | Circuit breaker state (0=closed, 1=half_open, 2=open) | + +Queue depths are updated by `update_queue_depths()` which polls all 6 queues and updates the `queue_depth` gauge. ## Alerting Rules diff --git a/docs/security/secrets-management.md b/docs/security/secrets-management.md index abe7484..f205ff8 100644 --- a/docs/security/secrets-management.md +++ b/docs/security/secrets-management.md @@ -19,8 +19,8 @@ `backend/app/core/encryption.py` Used to encrypt sensitive values before storing in PostgreSQL: -- `StreamingSession.youtube_access_token` -- `StreamingSession.youtube_refresh_token` +- `YouTubeToken.access_token` +- `YouTubeToken.refresh_token` ```python from backend.app.core.encryption import encrypt_data, decrypt_data @@ -29,14 +29,14 @@ encrypted = encrypt_data(plaintext_token) # stores in DB decrypted = decrypt_data(encrypted_token) # retrieved from DB for API calls ``` -Algorithm: -Key source: `ENCRYPTION_KEY` environment variable +Algorithm: Fernet (symmetric encryption via `cryptography.fernet.Fernet`) +Key derivation: First 32 bytes of `ENCRYPTION_KEY` env var, base64-encoded for Fernet. Config validator ensures key is ≥32 characters. ## Rotating ENCRYPTION_KEY If the encryption key must be rotated: 1. Generate new key -2. 
For each `StreamingSession` with stored tokens: +2. For each `YouTubeToken` with stored tokens: - Decrypt with old key - Re-encrypt with new key - Save diff --git a/docs/security/youtube-oauth.md b/docs/security/youtube-oauth.md index de829d4..5edb771 100644 --- a/docs/security/youtube-oauth.md +++ b/docs/security/youtube-oauth.md @@ -64,10 +64,12 @@ The dashboard listens for this message to update YouTube connection status. ## Token Storage YouTube access and refresh tokens are: -1. Encrypted via `encrypt_data` (`backend/app/core/encryption.py`) -2. Stored in `StreamingSession.youtube_access_token` / `youtube_refresh_token` +1. Encrypted via `encrypt_data` (`backend/app/core/encryption.py`) using Fernet symmetric encryption +2. Stored in the `YouTubeToken` model (one per teacher, `teacher_id` UNIQUE constraint) +3. Fields: `access_token`, `refresh_token`, `token_type`, `scope`, `expires_at` See [security/secrets-management.md](secrets-management.md) for encryption details. +See [data/schema.md](../data/schema.md) for the YouTubeToken model definition. ## HTTPS Requirement diff --git a/docs/state/known-issues.md b/docs/state/known-issues.md index 0b838ab..dc463bb 100644 --- a/docs/state/known-issues.md +++ b/docs/state/known-issues.md @@ -1,7 +1,7 @@ # Known Issues > **LIVING DOCUMENT** — Update when bugs are found or resolved. 
-> Last updated: 2026-03-01 +> Last updated: 2026-03-18 ## Active Bugs @@ -15,7 +15,7 @@ | ID | Area | Description | Impact | |----|------|-------------|--------| -| TD-001 | Frontend | ActivityLog and AnalyticsPanel components are stubs with unclear purpose | Low — dashboard renders without them | +| TD-001 | Frontend | ~~ActivityLog and AnalyticsPanel stubs~~ **Resolved** — both are fully implemented | — | | TD-002 | Workers | No built-in monitoring UI for queue depths or DLQ contents | Medium — must use redis-cli to inspect | | TD-003 | Migrations | Some migrations written manually (no Docker in dev environment) | Low — documented in data/migrations.md | | TD-004 | Tests | | | diff --git a/docs/state/phase-status.md b/docs/state/phase-status.md index c7f0746..8a775c1 100644 --- a/docs/state/phase-status.md +++ b/docs/state/phase-status.md @@ -1,7 +1,7 @@ # Phase Status > **LIVING DOCUMENT** — Update this file as phases complete or priorities change. -> Last updated: 2026-03-01 +> Last updated: 2026-03-18 ## Phase Summary @@ -26,8 +26,8 @@ **Delivered:** - Replaced OpenAI with Google Gemini - Classification worker (is_question + confidence_score) -- Embeddings worker (1536-dim pgvector) -- Clustering worker (cosine similarity at 0.8) +- Embeddings worker (768-dim pgvector) +- Clustering worker (cosine similarity at 0.65) - Answer generation worker (RAG + LLM) - Human approval flow (is_posted=False by default) @@ -54,12 +54,18 @@ - Dashboard components: SessionList, YouTubePanel, ManualInput, MetricsCards, QuestionsFeed, ClustersPanel, ActivityLog, AnalyticsPanel - useWebSocket hook (exponential backoff, 100-msg cap) - Vite proxy: `/api` → `:8000`, `/ws` → ws://`:8000` +- DocumentUpload component (PDF/DOCX/TXT upload for RAG) +- QuotaBanner component (YouTube quota alerts) +- ClusterDetailsModal component +- KeyboardShortcutsModal (?, N, A, Ctrl+K shortcuts) +- ThemeContext (dark/light mode with localStorage + cross-tab sync) +- Dark theme with orange accent 
(#FF6B35), Azeret Mono + Outfit fonts +- SettingsPage with ProfileSection, PasswordSection, PreferencesSection ## Current Stubs / Incomplete Items - -- ActivityLog component — purpose unclear, needs population -- AnalyticsPanel component — purpose unclear, needs population +- `workers/runner.py` — unimplemented worker orchestrator. Use `./start_dev.sh` instead. +- `workers/trigger_monitor/worker.py` — empty infinite loop, not started by `start_dev.sh`. ## Next Priorities diff --git a/docs/workers/answer-generation.md b/docs/workers/answer-generation.md index 3855aad..5de1f00 100644 --- a/docs/workers/answer-generation.md +++ b/docs/workers/answer-generation.md @@ -11,7 +11,8 @@ ```json { "cluster_id": "uuid", - "session_id": "uuid" + "session_id": "uuid", + "question_texts": ["string", "..."] } ``` @@ -27,7 +28,10 @@ ## RAG Retrieval - +- Retrieves top 5 nearest RAG documents by centroid embedding (cosine distance) +- Documents are scoped to the teacher who owns the session (teacher_id filter) +- Two-prompt system: with-context prompt when documents exist, without-context when none available +- Answer is moderated via ModerationService before saving ## Answer Record @@ -36,7 +40,7 @@ | `Answer.cluster_id` | FK to cluster | | `Answer.text` | Generated answer text | | `Answer.is_posted` | `False` (pending approval) | -| `Answer.is_approved` | `False` (pending teacher action) | +| `Answer.posted_at` | `null` (set when posted) | For field definitions see [data/schema.md](../data/schema.md). @@ -44,7 +48,7 @@ For field definitions see [data/schema.md](../data/schema.md). After this worker creates the `Answer` record: 1. WebSocket event notifies the teacher dashboard -2. Teacher reviews, optionally edits, then approves via `POST /api/v1/dashboard/approve/{answer_id}` +2. Teacher reviews, optionally edits, then approves via `POST /api/v1/dashboard/answers/{answer_id}/approve` 3. 
youtube_posting worker posts the approved answer See [ADR-004](../architecture/decisions/ADR-004-rag-design.md) for the rationale. diff --git a/docs/workers/classification.md b/docs/workers/classification.md index e7d3115..ddab551 100644 --- a/docs/workers/classification.md +++ b/docs/workers/classification.md @@ -18,10 +18,11 @@ ## Processing -1. Call `GeminiClient.classify_comment(text)` +1. Call `GeminiClient.classify_question(text)` 2. Update `Comment` record: `is_question`, `confidence_score` 3. If `is_question=True`: enqueue `EmbeddingPayload` to `QUEUE_EMBEDDING` 4. If `is_question=False`: no further processing +5. Publish `comment_classified` WebSocket event via Redis pub/sub ## Output (on is_question=True) @@ -36,6 +37,15 @@ Enqueues to `QUEUE_EMBEDDING`. See [workers/embeddings.md](embeddings.md). See [workers/overview.md](overview.md) for retry policy. +## Circuit Breaker + +The classification worker implements a circuit breaker pattern for Gemini API calls: +- **Closed** (normal): calls proceed +- **Open** (failing): calls skipped, task retried later +- **Half-open** (probing): single call attempted to test recovery + +State exported as Prometheus gauge: `gemini_circuit_state{worker_name="classification"}` + ## DB Updates | Field | Value | diff --git a/docs/workers/clustering.md b/docs/workers/clustering.md index 48884ab..7a89c21 100644 --- a/docs/workers/clustering.md +++ b/docs/workers/clustering.md @@ -1,6 +1,6 @@ # Clustering Worker -> Purpose: ClusteringPayload → cosine similarity at threshold 0.8 → Cluster CRUD → enqueue to answer_generation. +> Purpose: ClusteringPayload → cosine similarity at threshold 0.65 → Cluster CRUD → enqueue to answer_generation. @@ -19,19 +19,24 @@ 1. Load `Comment.embedding` from DB 2. Query existing `Cluster` centroids for this session (pgvector cosine similarity) -3. If similarity ≥ 0.8: assign comment to that cluster, update centroid +3. If similarity ≥ 0.65: assign comment to that cluster, update centroid 4. 
If no cluster found: create a new `Cluster` with this comment as seed 5. Enqueue `AnswerGenerationPayload` to `QUEUE_ANSWER_GENERATION` ## Similarity Threshold -Default: `0.8` (cosine similarity). Higher = stricter grouping. - - +Default: `0.65` (cosine similarity, configurable via `clustering_similarity_threshold` in settings). Higher = stricter grouping. ## Centroid Update - +When a comment joins an existing cluster, the centroid is recalculated: + +```python +new_centroid = (old_centroid * comment_count + new_embedding) / (comment_count + 1) +normalized_centroid = new_centroid / np.linalg.norm(new_centroid) # L2 normalize +``` + +The `comment_count` is denormalized on the Cluster model for this formula. ## DB Operations @@ -46,3 +51,11 @@ For field definitions see [data/schema.md](../data/schema.md). ## Output Enqueues to `QUEUE_ANSWER_GENERATION`. See [workers/answer-generation.md](answer-generation.md). + +## Answer Generation Triggers + +Answer generation is NOT triggered for every clustered comment. The clustering worker enqueues to `QUEUE_ANSWER_GENERATION` only when: +- A **new cluster** is created, OR +- A cluster reaches comment count milestones: **3, 10, 25** + +At count 3, the cluster title is also summarized via Gemini. diff --git a/docs/workers/embeddings.md index ee19c95..7d071a4 100644 --- a/docs/workers/embeddings.md +++ b/docs/workers/embeddings.md @@ -1,6 +1,6 @@ # Embeddings Worker -> Purpose: EmbeddingPayload → 1536-dim vector stored in pgvector → enqueue to clustering queue. +> Purpose: EmbeddingPayload → 768-dim vector stored in pgvector → enqueue to clustering queue. @@ -19,14 +19,15 @@ ## Processing 1. Call `GeminiClient.generate_embedding(text)` -2. Store 1536-dim vector in `Comment.embedding` (pgvector column) +2. Store 768-dim vector in `Comment.embedding` (pgvector column) 3. 
Enqueue `ClusteringPayload` to `QUEUE_CLUSTERING` ## Embedding Model -- Provider: Google Gemini embedding API -- Dimensions: 1536 -- Storage: `pgvector` `Vector(1536)` column on `Comment` model +- Provider: Google Gemini (`gemini-embedding-001`) +- Dimensions: 768 +- Storage: `pgvector` `Vector(768)` column on `Comment` model +- Indexing: HNSW index (`vector_cosine_ops`, m=16, ef_construction=64) See [architecture/decisions/ADR-001-pgvector.md](../architecture/decisions/ADR-001-pgvector.md) for the rationale for pgvector. @@ -39,6 +40,6 @@ Enqueues to `QUEUE_CLUSTERING`. See [workers/clustering.md](clustering.md). | Field | Value | |-------|-------| -| `Comment.embedding` | `Vector(1536)` | +| `Comment.embedding` | `Vector(768)` | For field definitions see [data/schema.md](../data/schema.md). diff --git a/docs/workers/overview.md b/docs/workers/overview.md index 39d016b..cad8be3 100644 --- a/docs/workers/overview.md +++ b/docs/workers/overview.md @@ -56,7 +56,25 @@ See [state/runbooks/worker-crash.md](../state/runbooks/worker-crash.md) for DLQ ## runner.py - +> **Stub**: `workers/runner.py` is not implemented. `make run-workers` calls it but +> it does nothing useful. Use `./start_dev.sh` instead, which launches all workers +> via tmux. 
+ +### start_dev.sh Workers + +`./start_dev.sh` starts 9 processes in a tmux session: + +| Pane | Process | +|------|---------| +| 0 | Backend (uvicorn) | +| 1 | Classification worker | +| 2 | Embeddings worker | +| 3 | Clustering worker | +| 4 | Answer generation worker | +| 5 | YouTube polling worker | +| 6 | YouTube posting worker | +| 7 | Scheduler (APScheduler — quota reset + token cleanup) | +| 8 | Frontend (Vite dev server) | ## Priority diff --git a/docs/workers/trigger-monitor.md b/docs/workers/trigger-monitor.md index 6aedded..cc5a4da 100644 --- a/docs/workers/trigger-monitor.md +++ b/docs/workers/trigger-monitor.md @@ -1,44 +1,22 @@ # Trigger Monitor Worker -> Purpose: Count/interval thresholds for active sessions → dispatch ClusteringPayload. +> **Status: STUB** — This worker is not implemented and is not started by `start_dev.sh`. - +## Current State -## Role +`workers/trigger_monitor/worker.py` contains an empty infinite loop. It is not +started by `start_dev.sh` and does not process any tasks. -The trigger monitor watches active sessions and dispatches clustering jobs when -enough new questions have accumulated, or when a time interval has elapsed. -This prevents over-clustering (every new question triggering a re-cluster) while -ensuring timely processing. +Do not build on this worker without a full implementation plan. 
-## Thresholds +## Original Intent -| Threshold | Default | Config Key | -|-----------|---------|-----------| -| Count threshold | comments | | -| Interval threshold | seconds | | - - - -## Algorithm - -``` -For each active session: - pending_count = count of questions with embedding but no cluster assignment - time_since_last_cluster = now - session.last_clustered_at - - if pending_count >= COUNT_THRESHOLD or time_since_last_cluster >= INTERVAL_THRESHOLD: - for each pending comment: - enqueue ClusteringPayload → QUEUE_CLUSTERING - session.last_clustered_at = now -``` - -## Session Tracking - - +The trigger monitor was intended to watch active sessions and dispatch clustering +jobs based on count/interval thresholds, preventing over-clustering while ensuring +timely processing. This responsibility is currently handled inline by the +clustering worker itself, which triggers answer generation at milestones {3, 10, 25}. ## Relationship to Clustering Worker -The trigger monitor dispatches to `QUEUE_CLUSTERING`. -The clustering worker processes individual comments. +The clustering worker directly enqueues to `QUEUE_ANSWER_GENERATION` at milestones. See [workers/clustering.md](clustering.md). diff --git a/docs/workers/youtube-polling.md b/docs/workers/youtube-polling.md index a7c5f35..ff253ba 100644 --- a/docs/workers/youtube-polling.md +++ b/docs/workers/youtube-polling.md @@ -11,14 +11,14 @@ New comments are persisted to the DB and enqueued for classification. ## Parallelism -`ThreadPoolExecutor` — one thread per active session. Sessions are polled independently. +`ThreadPoolExecutor` — one thread per active session. Sessions are polled independently. Maximum 10 concurrent threads (`max_workers=min(active_sessions, 10)`). 
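The per-session fan-out described above can be sketched as follows. This is a minimal illustration, not the worker's actual code; `poll_cycle` and `poll_session` are hypothetical names standing in for the real polling logic.

```python
from concurrent.futures import ThreadPoolExecutor

def poll_cycle(active_sessions, poll_session):
    """Poll every active session in parallel, capped at 10 threads."""
    if not active_sessions:
        return []
    # Mirrors the documented cap: max_workers = min(active_sessions, 10)
    max_workers = min(len(active_sessions), 10)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves session order; a slow session only blocks
        # its own thread, not the rest of the cycle.
        return list(pool.map(poll_session, active_sessions))
```

With the cap, 50 active sessions still use only 10 threads; each cycle completes when the slowest session's poll returns.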
## chat_id Caching The YouTube `liveChatId` for a video is resolved once (costs 1 quota unit via `YouTubeClient.get_live_chat_id()`) and cached in Redis: -Redis key pattern: +Redis key pattern: `youtube:poll:{session_id}:chat_id` (TTL: 3600s) On cache miss: call YouTube API and cache the result. @@ -32,7 +32,7 @@ For manual comments (not from YouTube): `youtube_comment_id = f"manual:{uuid4()}` ## Polling Interval - +5-second full polling cycle. Each active session is polled in parallel within the cycle. ## Quota Cost @@ -43,6 +43,7 @@ See [data/quota-model.md](../data/quota-model.md) for enforcement. 1. Persist `Comment` record to DB 2. Enqueue `ClassificationPayload` to `QUEUE_CLASSIFICATION` +3. Publish `comment_created` WebSocket event via Redis pub/sub ## Error Handling diff --git a/docs/workers/youtube-posting.md index 0b9724a..9b2b4f5 100644 --- a/docs/workers/youtube-posting.md +++ b/docs/workers/youtube-posting.md @@ -11,16 +11,16 @@ ```json { "answer_id": "uuid", - "session_id": "uuid", - "answer_text": "string", - "chat_id": "string" + "session_id": "uuid" } ``` ## Processing -1. Read `chat_id` from Redis cache (set by youtube_polling worker) -2. Call `YouTubeClient.post_message(chat_id, answer_text)` +1. Fetch Answer, StreamingSession, and YouTubeToken from DB +2. Check YouTube quota before posting (costs 50 units) +3. Get `live_chat_id` from Redis cache (set by youtube_polling worker); retry if not cached +4. Call `YouTubeClient.post_message(chat_id, answer_text)` -3. Update `Answer.is_posted = True` -4. Emit `answer_posted` WebSocket event to session +5. Update `Answer.is_posted = True` +6. Emit `answer_posted` WebSocket event to session @@ -34,7 +34,7 @@ See [data/quota-model.md](../data/quota-model.md) — do not duplicate the quota ## chat_id Source The `chat_id` is cached in Redis by the youtube_polling worker. 
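A get-or-fetch sketch of this cache lookup, under stated assumptions: the `cache` object exposes Redis-style `get`/`setex`, and `resolve_chat_id` is a hypothetical stand-in for the actual YouTube API call.

```python
CHAT_ID_TTL = 3600  # seconds, matching the documented key TTL

def get_chat_id(cache, session_id, resolve_chat_id):
    """Return the liveChatId for a session, resolving it once on a miss."""
    key = f"youtube:poll:{session_id}:chat_id"
    chat_id = cache.get(key)
    if chat_id is None:
        # Cache miss: one YouTube API call (1 quota unit), then cache with TTL
        chat_id = resolve_chat_id(session_id)
        cache.setex(key, CHAT_ID_TTL, chat_id)
    return chat_id
```

Because both the polling and posting workers derive the same key from `session_id`, a value cached by one worker is reused by the other until the TTL expires.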
-Redis key pattern: +Redis key pattern: `youtube:poll:{session_id}:chat_id` (TTL: 3600s) ## Error Handling diff --git a/frontend/index.html b/frontend/index.html index 98e0244..3c63d62 100644 --- a/frontend/index.html +++ b/frontend/index.html @@ -6,7 +6,7 @@ AI Live Doubt Manager - +
diff --git a/frontend/src/index.css b/frontend/src/index.css index c60e880..cc292c3 100644 --- a/frontend/src/index.css +++ b/frontend/src/index.css @@ -1148,48 +1148,122 @@ button:disabled { opacity: 0.4; pointer-events: none; } font-family: var(--font-display); font-size: 10px; color: var(--color-text); } +/* ============================================================ + Landing Page — Keyframe Animations + ============================================================ */ +@keyframes fadeSlideUp { + from { opacity: 0; transform: translateY(24px); } + to { opacity: 1; transform: translateY(0); } +} +@keyframes fadeSlideDown { + from { opacity: 0; transform: translateY(-16px); } + to { opacity: 1; transform: translateY(0); } +} + /* ============================================================ Landing Page ============================================================ */ -.landing { display: flex; flex-direction: column; min-height: 100vh; } +.landing { + display: flex; flex-direction: column; min-height: 100vh; + background-color: var(--color-bg); + background-image: + radial-gradient(ellipse 80% 60% at 50% 0%, rgba(255,107,53,0.08) 0%, transparent 70%), + radial-gradient(circle 1px at center, rgba(255,107,53,0.12) 0%, transparent 1px); + background-size: 100% 100%, 24px 24px; +} +/* --- Navbar --- */ .landing-header { - background: var(--color-surface); border-bottom: 1px solid var(--color-border); - padding: 0 48px; height: 60px; display: flex; align-items: center; justify-content: space-between; + position: sticky; top: 0; z-index: 100; + background: rgba(8,9,13,0.7); + backdrop-filter: blur(12px); -webkit-backdrop-filter: blur(12px); + border-bottom: 1px solid rgba(255,107,53,0.1); + padding: 0 48px; height: 60px; + display: flex; align-items: center; justify-content: space-between; } - .landing-nav { display: flex; gap: 12px; align-items: center; } .landing-nav .btn-primary { width: auto; margin-top: 0; } -.landing-hero { text-align: center; padding: 64px 24px 
48px; max-width: 680px; margin: 0 auto; } +/* --- Staggered entrance animations --- */ +.anim-nav { animation: fadeSlideDown 0.6s ease-out both; } +.anim-1, .anim-2, .anim-3, .anim-4 { opacity: 0; } +.anim-1 { animation: fadeSlideUp 0.7s ease-out 0s both; } +.anim-2 { animation: fadeSlideUp 0.7s ease-out 0.15s both; } +.anim-3 { animation: fadeSlideUp 0.7s ease-out 0.3s both; } +.anim-4 { animation: fadeSlideUp 0.7s ease-out 0.45s both; } + +/* --- Hero --- */ +.landing-hero { + position: relative; + text-align: center; padding: 100px 24px 80px; + max-width: 800px; margin: 0 auto; +} +.landing-hero::before { + content: ''; + position: absolute; inset: -40px -80px; + background: radial-gradient(ellipse 50% 40% at 50% 30%, rgba(255,107,53,0.10) 0%, transparent 70%); + pointer-events: none; z-index: 0; +} +.landing-hero > * { position: relative; z-index: 1; } .landing-hero h1 { - font-family: var(--font-display); font-size: 28px; font-weight: 700; - color: var(--color-accent); margin-bottom: 16px; letter-spacing: 0.04em; + font-family: 'Bebas Neue', sans-serif; font-size: 72px; font-weight: 400; + color: var(--color-accent); margin-bottom: 20px; letter-spacing: 0.03em; + line-height: 1; +} +.landing-hero .tagline { + font-family: 'Syne', sans-serif; font-size: 22px; font-weight: 600; + color: var(--color-text); margin-bottom: 14px; +} +.landing-hero .subtitle { + font-family: 'Syne', sans-serif; font-size: 16px; font-weight: 400; + color: var(--color-muted); margin-bottom: 40px; line-height: 1.7; } -.landing-hero .tagline { font-size: 18px; font-weight: 600; color: var(--color-text); margin-bottom: 12px; } -.landing-hero .subtitle { font-size: 14px; color: var(--color-muted); margin-bottom: 32px; line-height: 1.7; } +.landing-ctas { display: flex; gap: 16px; justify-content: center; flex-wrap: wrap; } +.landing-ctas .btn { padding: 12px 32px; font-size: 13px; } +.landing-ctas .btn-primary { width: auto; margin-top: 0; transition: box-shadow 0.3s ease, background 0.15s 
ease; } +.landing-ctas .btn-primary:hover { box-shadow: 0 0 20px rgba(255,107,53,0.4); } -.landing-ctas { display: flex; gap: 12px; justify-content: center; flex-wrap: wrap; } -.landing-ctas .btn { padding: 10px 28px; font-size: 12px; } -.landing-ctas .btn-primary { width: auto; margin-top: 0; } +/* --- Features --- */ +.landing-features { + padding: 56px 24px; flex: 1; +} +.landing-features h2 { + font-family: 'Syne', sans-serif; text-align: center; + font-size: 11px; font-weight: 700; text-transform: uppercase; + letter-spacing: 0.18em; color: var(--color-muted); + margin-bottom: 48px; +} -.landing-features { background: var(--color-surface); border-top: 1px solid var(--color-border); padding: 48px 24px; flex: 1; } +.features-grid { display: grid; grid-template-columns: repeat(3, 1fr); gap: 56px; max-width: 960px; margin: 0 auto; } -.landing-features h2 { - font-family: var(--font-display); text-align: center; font-size: 14px; font-weight: 700; - margin-bottom: 32px; color: var(--color-text); letter-spacing: 0.06em; text-transform: uppercase; +.feature-card { + position: relative; + background: none; border: none; box-shadow: none; + padding: 0; text-align: left; } -.features-grid { display: grid; grid-template-columns: repeat(3, 1fr); gap: 20px; max-width: 900px; margin: 0 auto; } +.card-number { + font-size: 11px; font-weight: 600; + letter-spacing: 0.15em; + color: rgba(255,255,255,0.4); line-height: 1; + margin-bottom: 12px; +} -.feature-card { background: var(--color-bg); border: 1px solid var(--color-border); padding: 24px; text-align: center; } -.feature-emoji { font-size: 28px; margin-bottom: 12px; } -.feature-card h3 { font-family: var(--font-display); font-size: 11px; font-weight: 700; margin-bottom: 8px; letter-spacing: 0.06em; } -.feature-card p { font-size: 12px; color: var(--color-muted); line-height: 1.6; } +.feature-emoji { font-size: 40px; margin-bottom: 16px; } +.feature-card h3 { + font-family: 'Syne', sans-serif; font-size: 12px; font-weight: 
700; + margin-bottom: 10px; letter-spacing: 0.06em; text-transform: uppercase; +} +.feature-card p { font-size: 13px; color: var(--color-muted); line-height: 1.8; } -.landing-footer { background: var(--color-surface); border-top: 1px solid var(--color-border); padding: 20px 24px; text-align: center; font-size: 12px; color: var(--color-muted); } +/* --- Footer --- */ +.landing-footer { + border-top: 1px solid rgba(255,107,53,0.1); + padding: 32px 24px; text-align: center; + font-family: 'Syne', sans-serif; font-size: 12px; color: var(--color-muted); +} /* ============================================================ Custom Scrollbars — thin, accent thumb @@ -1253,7 +1327,9 @@ button:disabled { opacity: 0.4; pointer-events: none; } @media (max-width: 700px) { .landing-header { padding: 0 20px; } - .landing-hero h1 { font-size: 22px; } + .landing-hero { padding: 60px 20px 48px; } + .landing-hero h1 { font-size: 40px; } + .landing-hero .tagline { font-size: 17px; } .features-grid { grid-template-columns: 1fr; } .landing-ctas { flex-direction: column; align-items: center; } } diff --git a/frontend/src/pages/LandingPage.jsx b/frontend/src/pages/LandingPage.jsx index 5db739b..ae301b6 100644 --- a/frontend/src/pages/LandingPage.jsx +++ b/frontend/src/pages/LandingPage.jsx @@ -8,7 +8,7 @@ export default function LandingPage() { return (
{/* Header */} -
+
AI Doubt Manager