Binary file added .playwright-mcp/current-ui.png
Binary file added .playwright-mcp/final-result.png
Binary file added .playwright-mcp/updated-ui.png
234 changes: 234 additions & 0 deletions CLAUDE.md
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Running Commands

**IMPORTANT:** This project uses `uv` as the package manager. **Always use `uv` commands - never use `pip` directly.**

### Start the Application
```bash
./run.sh
```
This starts the FastAPI server on port 8000 with auto-reload enabled. The application will:
1. Load course documents from `docs/` folder
2. Process them into 800-char chunks with 100-char overlap
3. Create/load ChromaDB embeddings (the first run downloads the ~90 MB embedding model)
4. Serve the web interface at http://localhost:8000

The `run.sh` script uses `uv run` internally.

### Manual Start (Development)
```bash
cd backend
uv run uvicorn app:app --reload --port 8000
```

### Install Dependencies
```bash
uv sync
```

### Add New Dependencies
```bash
# Add a new package
uv add package-name

# Add a dev dependency
uv add --dev package-name
```

### Run Python Scripts
```bash
# Always use uv run to execute Python code
uv run python script.py

# NOT: python script.py
# NOT: pip install ...
```

### Environment Setup
Create `.env` file with:
```
ANTHROPIC_API_KEY=sk-ant-api03-...
```

## Architecture Overview

### RAG System Flow (Tool-Based Architecture)

This is a **tool-based RAG system** where Claude decides when to search, not a traditional "always search" RAG.

**Query Processing Flow:**
1. User query → FastAPI endpoint (`/api/query`)
2. RAG System orchestrates the flow
3. **First Claude API call**: Claude receives query + tool definition, decides if search is needed
4. If search needed: Tool execution → Vector search → Format results
5. **Second Claude API call**: Claude receives search results, synthesizes final answer
6. Response + sources returned to frontend

**Key Insight:** There are **two Claude API calls per query** - one for decision-making, one for synthesis.
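The two-call flow above can be sketched as follows. This is an illustrative sketch, not the actual `ai_generator.py` code: the stub classes stand in for the real Anthropic client, and names like `answer_query` and `search_course_content` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class StubResponse:
    stop_reason: str          # "tool_use" or "end_turn"
    text: str
    tool_name: str = ""
    tool_input: dict = field(default_factory=dict)

class StubClient:
    """Stands in for the real Anthropic client so the flow runs offline."""
    def __init__(self):
        self.calls = 0

    def create(self, messages, tools):
        self.calls += 1
        if self.calls == 1:
            # Call 1: Claude decides the query needs a course search.
            return StubResponse("tool_use", "", "search_course_content",
                                {"query": "What is MCP?"})
        # Call 2: Claude synthesizes an answer from the tool results.
        return StubResponse("end_turn", "MCP lets models call external tools.")

def answer_query(client, query, tools, execute_tool):
    # First call: query + tool definitions; Claude decides whether to search.
    first = client.create([{"role": "user", "content": query}], tools)
    if first.stop_reason != "tool_use":
        return first.text                     # general knowledge: one call total
    # Tool execution: run the search Claude requested.
    results = execute_tool(first.tool_name, first.tool_input)
    # Second call: Claude sees the search results and writes the final answer.
    second = client.create(
        [{"role": "user", "content": query},
         {"role": "user", "content": f"Tool results: {results}"}],
        tools)
    return second.text
```

Note that when Claude answers from general knowledge, only one API call is made; the second call happens only on the search path.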

### Component Architecture

**Frontend** (`frontend/`)
- Vanilla JS (no framework)
- Uses `marked.js` for markdown rendering
- Session-based conversation tracking
- Displays collapsible source citations

**Backend** (`backend/`)
- **app.py**: FastAPI server, REST endpoints, startup document loading
- **rag_system.py**: Main orchestrator - coordinates all components
- **ai_generator.py**: Claude API wrapper with tool calling support
- System prompt defines search behavior (one search max, no meta-commentary)
- Handles two-phase tool execution (request → execute → synthesize)
- **vector_store.py**: ChromaDB interface with two collections
- `course_catalog`: For fuzzy course name matching (e.g., "MCP" → full title)
- `course_content`: Actual content chunks for semantic search
- **document_processor.py**: Parses structured course documents into chunks
- Sentence-based chunking (preserves semantic boundaries)
- Adds context prefixes: "Course X Lesson Y content: ..."
- **search_tools.py**: Tool abstraction layer
- `CourseSearchTool`: Implements search with course/lesson filtering
- `ToolManager`: Registers and routes tool calls from Claude
- **session_manager.py**: Conversation history (max 2 exchanges by default)
- **config.py**: Centralized configuration (see below)
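The sentence-based chunking done by `document_processor.py` can be sketched roughly like this. The function name and the exact overlap mechanics are illustrative assumptions; only the 800/100 character sizes come from the config.

```python
import re

def chunk_text(text, chunk_size=800, overlap=100):
    """Greedy sentence-based chunking sketch: pack whole sentences up to
    chunk_size characters, then start the next chunk with the trailing
    `overlap` characters of the previous one for continuity."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > chunk_size:
            chunks.append(current)
            # Carry the tail of the finished chunk into the next one.
            current = current[-overlap:] + " " + sentence
        else:
            current = (current + " " + sentence).strip()
    if current:
        chunks.append(current)
    return chunks
```

In the real pipeline each chunk would additionally get a context prefix such as "Course X Lesson Y content: ..." before embedding.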

### Data Models (`models.py`)

**Important:** `Course.title` is used as the unique identifier throughout the system.

- **Course**: Contains title (ID), instructor, link, and list of Lessons
- **Lesson**: Contains lesson_number, title, and link
- **CourseChunk**: Contains content, course_title (FK), lesson_number, chunk_index
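The three models can be pictured as plain dataclasses; the real `models.py` may use Pydantic instead, and optional fields here are assumptions beyond the field list above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Lesson:
    lesson_number: int
    title: str
    link: Optional[str] = None

@dataclass
class Course:
    title: str                        # unique identifier across the system
    instructor: str
    link: Optional[str] = None
    lessons: List[Lesson] = field(default_factory=list)

@dataclass
class CourseChunk:
    content: str
    course_title: str                 # foreign key back to Course.title
    lesson_number: Optional[int]
    chunk_index: int
```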

### Vector Store Design

**Two-Collection Architecture:**
1. **course_catalog** collection:
- Purpose: Fuzzy course name resolution
- Documents: "Course: {title} taught by {instructor}" + lesson entries
- Used when user says "MCP course" → resolves to full title

2. **course_content** collection:
- Purpose: Semantic search of actual content
- Documents: Text chunks with context prefixes
- Metadata: course_title, lesson_number, chunk_index, links
- Filtering: Can filter by exact course_title AND/OR lesson_number

**Search Flow:**
1. If course_name provided: Query `course_catalog` to resolve fuzzy name
2. Build ChromaDB filter: `{"$and": [{"course_title": "X"}, {"lesson_number": Y}]}`
3. Query `course_content` with semantic search + filters
4. Return top 5 chunks by cosine similarity
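The filter-building step can be sketched as a small helper; the helper name is hypothetical, but the output matches the ChromaDB `where`-clause shape shown above, including the cases where only one or neither filter is present.

```python
def build_content_filter(course_title=None, lesson_number=None):
    """Build the ChromaDB `where` clause for the course_content query."""
    clauses = []
    if course_title is not None:
        clauses.append({"course_title": course_title})
    if lesson_number is not None:
        clauses.append({"lesson_number": lesson_number})
    if not clauses:
        return None                   # unfiltered semantic search
    if len(clauses) == 1:
        return clauses[0]             # ChromaDB rejects $and with one clause
    return {"$and": clauses}
```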

### Document Format

Course documents in `docs/` must follow this structure:
```
Course Title: [title]
Course Link: [url]
Course Instructor: [name]

Lesson 0: [title]
Lesson Link: [url]
[content...]

Lesson 1: [title]
Lesson Link: [url]
[content...]
```

The parser (`document_processor.py`) extracts this metadata and creates chunks with context.

### Configuration (`backend/config.py`)

Key settings to be aware of:
- `ANTHROPIC_MODEL`: "claude-sonnet-4-20250514" (Claude Sonnet 4)
- `EMBEDDING_MODEL`: "all-MiniLM-L6-v2" (384-dim vectors)
- `CHUNK_SIZE`: 800 chars (with CHUNK_OVERLAP: 100 chars)
- `MAX_RESULTS`: 5 search results returned to Claude
- `MAX_HISTORY`: 2 conversation exchanges kept in context
- `CHROMA_PATH`: "./chroma_db" (persistent vector storage)
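Mirroring the settings above as a dataclass (the real `backend/config.py` may be structured differently):

```python
from dataclasses import dataclass

@dataclass
class Config:
    ANTHROPIC_MODEL: str = "claude-sonnet-4-20250514"
    EMBEDDING_MODEL: str = "all-MiniLM-L6-v2"   # 384-dim vectors
    CHUNK_SIZE: int = 800
    CHUNK_OVERLAP: int = 100
    MAX_RESULTS: int = 5
    MAX_HISTORY: int = 2
    CHROMA_PATH: str = "./chroma_db"
```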

### AI System Prompt Behavior

The system prompt in `ai_generator.py` defines critical behavior:
- **Use search tool ONLY for course-specific questions**
- **One search per query maximum** (prevents chained or repeated tool calls in a single turn)
- **No meta-commentary** (no "based on the search results" phrases)
- Responses must be: brief, educational, clear, example-supported

### Session Management

Sessions track conversation history:
- Session ID created on first query (e.g., "session_1")
- Stores last `MAX_HISTORY * 2` messages (user + assistant pairs)
- History formatted as: "User: ...\nAssistant: ...\n..." for context
- Appended to system prompt on subsequent queries in same session
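The trimming and formatting behavior can be sketched like this; class and method names are illustrative, not the actual `session_manager.py` API.

```python
class SessionManager:
    def __init__(self, max_history=2):
        self.max_history = max_history
        self.sessions = {}            # session_id -> list of (role, text)
        self._counter = 0

    def create_session(self):
        self._counter += 1
        sid = f"session_{self._counter}"
        self.sessions[sid] = []
        return sid

    def add_exchange(self, sid, user_msg, assistant_msg):
        history = self.sessions[sid]
        history.append(("User", user_msg))
        history.append(("Assistant", assistant_msg))
        # Keep only the last MAX_HISTORY exchanges (2 messages each).
        del history[:-self.max_history * 2]

    def format_history(self, sid):
        # String form that gets appended to the system prompt.
        return "\n".join(f"{role}: {text}" for role, text in self.sessions[sid])
```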

### API Endpoints

**POST /api/query**
- Request: `{ "query": "...", "session_id": "session_1" (optional) }`
- Response: `{ "answer": "...", "sources": ["..."], "session_id": "..." }`
- Creates session if not provided

**GET /api/courses**
- Response: `{ "total_courses": 4, "course_titles": ["..."] }`
- Used by frontend sidebar

### ChromaDB Persistence

- First run: Downloads embedding model, creates collections, processes documents (~30-60 seconds)
- Subsequent runs: Loads existing ChromaDB from `./chroma_db` (fast startup)
- Documents are reprocessed only if their course title is not already in the catalog
- To rebuild: Delete `./chroma_db` folder and restart

### Development Notes

**Adding New Documents:**
1. Place `.txt`, `.pdf`, or `.docx` files in `docs/` folder
2. Follow the document format structure above
3. Restart server - documents auto-loaded on startup
4. Check logs for: "Added new course: X (Y chunks)"

**Modifying Chunk Size:**
- Edit `config.py`: `CHUNK_SIZE` and `CHUNK_OVERLAP`
- Delete `./chroma_db` folder to force reprocessing
- Restart application

**Debugging Search:**
- Search tool tracks sources in `last_sources` attribute
- Sources shown in UI as collapsible section
- Check `vector_store.py` for filter logic

**Conversation Context:**
- Modify `MAX_HISTORY` in `config.py` to change context window
- History is string-formatted and prepended to system prompt
- Trade-off: More history = more context but higher token usage

### Tool-Based vs Traditional RAG

**This system is NOT a traditional RAG** where every query triggers a search. Instead:
- Claude analyzes each query and decides if search is warranted
- General knowledge questions answered without search
- Course-specific questions trigger tool use
- This reduces unnecessary vector searches and improves response quality

### Frontend-Backend Contract

**Frontend maintains:**
- Current session_id in memory
- Sends with each query for conversation continuity

**Backend returns:**
- answer: The synthesized response from Claude
- sources: List of "Course Title - Lesson N" strings for UI
- session_id: Same or newly created session ID

**Source Tracking:**
- Search tool stores sources during execution
- RAG system retrieves after AI generation completes
- Sources reset after each query to prevent leakage
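The track/collect/reset pattern can be sketched as follows. Class and function names and the stand-in search result are illustrative; the actual implementation lives in `search_tools.py` and `rag_system.py`.

```python
class CourseSearchTool:
    def __init__(self):
        self.last_sources = []        # populated during each execution

    def execute(self, query, course_title=None, lesson_number=None):
        # ... vector search would happen here; stand-in result below ...
        hits = [("Example Course", 1)]
        self.last_sources = [f"{title} - Lesson {n}" for title, n in hits]
        return hits

def collect_and_reset(tool):
    """RAG system reads sources after AI generation completes, then clears
    them so they cannot leak into the next query's response."""
    sources = tool.last_sources
    tool.last_sources = []
    return sources
```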
135 changes: 135 additions & 0 deletions DOCUMENTATION_STRUCTURE.md
# RAG Chatbot Documentation Structure

## Generated Files

### Standalone Mermaid Diagrams
1. **architecture-diagram.mermaid** - 4-layer system architecture (high-level)
2. **sequence-diagram.mermaid** - 21-step end-to-end user flow
3. **rag-deep-dive.mermaid** - RAG & Storage component architecture
4. **rag-mid-level-sequence.mermaid** - 17-step RAG processing flow with decision branching

### Interactive HTML Documentation
**architecture-diagram.html** - Complete 4-tab interactive documentation

## Tab Organization

### Tab 1: System Architecture (High-Level)
- **Purpose**: Overall system structure
- **Diagram**: 4-layer vertical architecture
- Layer 1: Frontend (Vanilla HTML/CSS/JS)
- Layer 2: API (FastAPI)
- Layer 3: RAG/AI (Claude + Tools)
- Layer 4: Database/Storage (ChromaDB)
- **Overview**: Component descriptions for each layer

### Tab 2: System User Flow (High-Level)
- **Purpose**: End-to-end user journey
- **Diagram**: 21-step sequence diagram
- Steps 1-3: User interaction
- Steps 4-6: Session & context
- Steps 7-16: RAG processing
- Steps 17-21: Response & display
- **Overview**: Flow breakdown by phase

### Tab 3: RAG Components (Deep Dive)
- **Purpose**: Internal RAG & Storage architecture
- **Diagram**: Component architecture
- RAG/AI Layer: 5 components (RAG System, AI Generator, Tool Manager, Search Tool, Session Manager)
- Storage Layer: 6 components (ChromaDB, 2 Collections, Document Processor, Chunking, Files)
- Shows data flows and cross-layer connections
- **Overview**: Component descriptions and internal data flows

### Tab 4: RAG Processing Flow (Deep Dive)
- **Purpose**: Detailed RAG internal processing
- **Diagram**: 17-step mid-level sequence with decision branching
- Steps 1-4: Request & context loading
- Steps 5-6: AI decision point (search vs. direct response)
- Steps 7-13: Conditional search path
- Steps 14-17: Response & session management
- **Overview**:
- Flow breakdown by phase
- Search mechanics
- Data structures
- Processing pipeline
- Configuration details

## Key Features

### Diagram Characteristics
- **Vertical stacking**: Forced top-to-bottom layout using explicit layer connections
- **Color coding**: Consistent across all diagrams
- Blue (#e3f2fd): Frontend
- Orange (#fff3e0): API
- Green (#e8f5e9): RAG/AI
- Purple (#f3e5f5): Database/Storage
- **Readable text**: Single-line labels, no overlapping
- **Emojis**: Consistent visual markers for each component type

### UX Design
- **Overview-first layout**: Legend/breakdown appears ABOVE diagrams in all tabs
- **Tabbed interface**: Smooth transitions between perspectives
- **Responsive design**: Mobile-friendly layout
- **Interactive navigation**: Easy switching between high-level and deep-dive views

## Abstraction Levels

### Level 1: System Overview (Tabs 1 & 2)
- **Audience**: Stakeholders, product managers, new team members
- **Focus**: What the system does and how users interact with it
- **Diagrams**: 4-layer architecture + 21-step user flow

### Level 2: RAG Deep Dive (Tabs 3 & 4)
- **Audience**: Developers, architects, AI engineers
- **Focus**: How RAG and storage layers work internally
- **Diagrams**: Component architecture + 17-step processing flow with decision logic

## Documentation Prompt Template

**prompts/system-documentation-prompt.md** - Reusable template for future projects

Contains:
- Master prompt
- Step-by-step execution guide
- Quality checklist
- Common pitfalls
- Usage examples
- Version history

## Usage

### View Documentation
```bash
# Open in browser
open architecture-diagram.html
```

### Test Diagrams
1. Visit https://mermaid.live/
2. Paste contents of any .mermaid file
3. Verify rendering

### Reuse for Other Projects
1. Read prompts/system-documentation-prompt.md
2. Adapt master prompt to your codebase
3. Follow 4-phase workflow:
- Phase 1: Codebase exploration
- Phase 2: Architecture diagram
- Phase 3: Sequence diagram
- Phase 4: HTML documentation

## Success Criteria Met

✅ All 4 layers visible in top-to-bottom stack
✅ No overlapping text in diagrams
✅ Component overview appears before diagrams
✅ Diagrams reflect actual codebase architecture
✅ Tabs work correctly with smooth transitions
✅ HTML renders properly in all modern browsers
✅ Mid-level abstraction provides balance between overview and details
✅ Decision points (AI search logic) clearly visible

---

**Generated**: 2025-11-09
**Tool**: Claude Code + Mermaid.js
**Pattern**: 4-layer vertical architecture with RAG
Binary file added RAG Chatbot System Architecture.pdf