A production-grade POC demonstrating a voice-guided UI workflow for an e-commerce application.
Key Principle: Voice is an alternative input method, NOT a chatbot. It controls the same UI, calls the same APIs, and behaves exactly like typing or clicking.
| Layer | Technology |
|---|---|
| Frontend | React (JavaScript, no TypeScript) |
| Backend | Python (FastAPI) |
| Voice Transport | WebSocket (OpenAI Realtime API) |
| API Transport | HTTP REST |
| Styling | Tailwind CSS |
┌─────────────────────────────────────────────────────────────────────┐
│                          FRONTEND (React)                           │
├─────────────────────────────────────────────────────────────────────┤
│  ┌─────────┐  ┌─────────┐  ┌─────────┐    ┌──────────────────────┐  │
│  │  Home   │  │Products │  │ Profile │    │   VoiceController    │  │
│  │  Page   │  │  Page   │  │  Page   │    │   (Global, Sticky)   │  │
│  └─────────┘  └────┬────┘  └─────────┘    └──────────┬───────────┘  │
│                    │                                 │              │
│              ┌─────┴─────┐                     ┌─────┴─────┐        │
│              │  Filters  │                     │ WebSocket │        │
│              │  + Grid   │                     │  Client   │        │
│              └─────┬─────┘                     └─────┬─────┘        │
│                    │                                 │              │
└────────────────────┼─────────────────────────────────┼──────────────┘
                     │ HTTP                            │ WebSocket
                     │ /api/products                   │ /ws
                     ▼                                 ▼
┌─────────────────────────────────────────────────────────────────────┐
│                          BACKEND (FastAPI)                          │
├─────────────────────────────────────────────────────────────────────┤
│  ┌────────────────────┐            ┌────────────────────────────┐   │
│  │   REST Endpoints   │            │     WebSocket Handler      │   │
│  │ GET /api/products  │            │            /ws             │   │
│  └─────────┬──────────┘            └──────────┬─────────────────┘   │
│            │                                  │                     │
│            │          ┌───────────────────────┤                     │
│            │          │                       │                     │
│            ▼          ▼                       ▼                     │
│  ┌───────────────────────┐      ┌───────────────────────────┐       │
│  │   search_products()   │      │      RealtimeClient       │       │
│  │  (Single Source API)  │◄─────│    (OpenAI WebSocket)     │       │
│  └───────────────────────┘      └─────────────┬─────────────┘       │
│              │                                │                     │
│              ▼                                │                     │
│  ┌───────────────────────┐                    │                     │
│  │     Product Data      │                    │                     │
│  │   (In-memory/JSON)    │                    │                     │
│  └───────────────────────┘                    │                     │
└───────────────────────────────────────────────┼─────────────────────┘
                                                │
                                                ▼
                                    ┌────────────────────────┐
                                    │  OpenAI Realtime API   │
                                    │   (STT + LLM + TTS)    │
                                    └────────────────────────┘
User clicks filter → Frontend state updates → HTTP call to /api/products → UI renders
User speaks → Audio → WebSocket → OpenAI transcribes →
OpenAI calls search_products function → Backend executes →
Result sent to:
  1. OpenAI (for voice response generation)
  2. Frontend via WebSocket (ui_update event for UI rendering)
→ OpenAI speaks response + Frontend updates UI simultaneously
Both flows end up calling the same search_products() function. Voice has no special APIs of its own.
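The fan-out at the end of the voice flow can be sketched as follows. This is a minimal sketch, not the project's implementation: `search_products` is a two-filter stub over invented sample rows (`LAP900` is a made-up ID), `handle_tool_call` is a hypothetical helper name, and the OpenAI-bound message assumes the Realtime API's `conversation.item.create` event carrying a `function_call_output` item.

```python
import json

# Stub catalogue; the real data would live in products.py (contents assumed).
PRODUCTS = [
    {"id": "MOB001", "category": "mobiles", "price": 11999},
    {"id": "LAP900", "category": "laptops", "price": 45999},  # invented row
]


def search_products(category=None, max_price=None):
    """Stand-in for the shared search function (only two filters shown)."""
    results = PRODUCTS
    if category is not None:
        results = [p for p in results if p["category"] == category]
    if max_price is not None:
        results = [p for p in results if p["price"] <= max_price]
    return results


def handle_tool_call(call_id, arguments_json):
    """Run one search_products tool call and build both outgoing messages."""
    filters = json.loads(arguments_json)
    products = search_products(**filters)
    payload = {"products": products, "total": len(products)}

    # Message 1: tool result for OpenAI, so it can generate the spoken reply.
    to_openai = {
        "type": "conversation.item.create",
        "item": {
            "type": "function_call_output",
            "call_id": call_id,
            "output": json.dumps(payload),
        },
    }
    # Message 2: ui_update event for the frontend, sent over /ws.
    to_frontend = {
        "type": "ui_update",
        "action": "SHOW_PRODUCTS",
        "navigate_to": "/products",
        "filters": filters,
        "data": payload,
    }
    return to_openai, to_frontend
```

Both messages are built from the same `payload`, which is what keeps the spoken answer and the rendered grid in agreement.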
BitComm/
├── PRD.md                      # This document
├── backend/
│   ├── main.py                 # FastAPI app, HTTP + WebSocket endpoints
│   ├── realtime_client.py      # OpenAI Realtime API WebSocket client
│   ├── tools.py                # Function definitions for OpenAI
│   ├── products.py             # Product data + search_products function
│   ├── requirements.txt        # Python dependencies
│   └── .env                    # Environment variables (OPENAI_API_KEY)
└── frontend/
    ├── package.json
    ├── public/
    │   └── index.html
    └── src/
        ├── index.js
        ├── App.js
        ├── pages/
        │   ├── Home.js
        │   ├── Products.js
        │   └── Profile.js
        ├── components/
        │   ├── Navbar.js
        │   ├── ProductGrid.js
        │   ├── ProductCard.js
        │   ├── FilterSidebar.js
        │   ├── SearchBar.js
        │   └── VoiceController.js
        ├── hooks/
        │   ├── useProducts.js
        │   └── useVoice.js
        ├── context/
        │   └── VoiceContext.js
        └── styles/
            └── index.css
GET /api/products: Search and filter products.
Query Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | null | Text search in name/description |
| category | string | null | Filter by category |
| min_price | int | null | Minimum price filter |
| max_price | int | null | Maximum price filter |
| brand | string | null | Filter by brand |
| sort_by | string | "relevance" | Sort: price_asc, price_desc, rating, relevance |
| limit | int | 20 | Max products to return |
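The parameters in the table above map onto a filter-then-sort function along these lines. This is a sketch under assumptions: the sample rows other than `MOB001` are invented, brand matching is assumed to be case-insensitive, and `relevance` is assumed to mean catalogue order because no ranking rule is specified.

```python
# Sample rows; field names follow the product schema. MOB900 and LAP900 are
# invented for illustration only.
PRODUCTS = [
    {"id": "MOB001", "name": "Samsung Galaxy M14 5G", "category": "mobiles",
     "brand": "Samsung", "price": 11999, "rating": 4.2,
     "description": "Budget 5G smartphone with massive battery"},
    {"id": "MOB900", "name": "Demo Phone Lite", "category": "mobiles",
     "brand": "DemoBrand", "price": 7999, "rating": 3.9,
     "description": "Entry-level demo handset"},
    {"id": "LAP900", "name": "Demo Laptop Pro", "category": "laptops",
     "brand": "DemoBrand", "price": 54999, "rating": 4.5,
     "description": "Demo laptop for testing"},
]


def search_products(query=None, category=None, min_price=None,
                    max_price=None, brand=None, sort_by="relevance", limit=20):
    """Apply each optional filter in turn, then sort and truncate."""
    results = list(PRODUCTS)
    if query:
        needle = query.lower()
        results = [p for p in results
                   if needle in p["name"].lower()
                   or needle in p["description"].lower()]
    if category:
        results = [p for p in results if p["category"] == category]
    if min_price is not None:
        results = [p for p in results if p["price"] >= min_price]
    if max_price is not None:
        results = [p for p in results if p["price"] <= max_price]
    if brand:
        results = [p for p in results if p["brand"].lower() == brand.lower()]

    if sort_by == "price_asc":
        results.sort(key=lambda p: p["price"])
    elif sort_by == "price_desc":
        results.sort(key=lambda p: p["price"], reverse=True)
    elif sort_by == "rating":
        results.sort(key=lambda p: p["rating"], reverse=True)
    # "relevance": keep catalogue order (assumption, see lead-in).
    return results[:limit]
```

For example, `search_products(category="mobiles", sort_by="price_asc")` returns the two mobile rows, cheapest first.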
Response:
{
  "success": true,
  "data": {
    "products": [...],
    "total": 15,
    "filters_applied": {
      "category": "mobiles",
      "max_price": 10000
    }
  },
  "metadata": {
    "available_categories": ["mobiles", "laptops", "accessories"],
    "available_brands": ["Samsung", "Apple", "OnePlus", ...],
    "price_range": {"min": 199, "max": 149999}
  }
}
WebSocket /ws: Real-time voice communication.
Client → Server:
- Binary audio data (PCM16, 24 kHz)
- JSON control messages (optional)
Server → Client:
- Binary audio data (AI response)
- JSON events:
  - ui_update: Update UI based on voice command
  - transcript_update: Show transcription
  - clear_audio_queue: Handle interruption
  - error: Error messages
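Because the socket carries both binary audio and JSON text, a receiver has to branch on the frame type. The sketch below shows one way to do that and how to convert a PCM16 chunk size into playback time; the helper names are hypothetical and mono audio is an assumption, since the channel count is not stated.

```python
import json

SAMPLE_RATE_HZ = 24_000  # from the protocol description
BYTES_PER_SAMPLE = 2     # PCM16 means 16-bit (2-byte) samples
CHANNELS = 1             # assumption: mono audio


def audio_duration_ms(chunk: bytes) -> float:
    """Playback time of a raw PCM16 chunk in milliseconds."""
    bytes_per_second = BYTES_PER_SAMPLE * CHANNELS * SAMPLE_RATE_HZ
    return len(chunk) * 1000 / bytes_per_second


def classify_frame(frame):
    """Return ("audio", bytes) for binary frames, ("event", dict) for JSON text."""
    if isinstance(frame, (bytes, bytearray)):
        return "audio", bytes(frame)
    return "event", json.loads(frame)
```

Under these assumptions a 4800-byte chunk is 100 ms of audio, which is a convenient unit when deciding how much queued audio to drop on `clear_audio_queue`.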
ui_update Event Format:
{
  "type": "ui_update",
  "action": "SHOW_PRODUCTS",
  "navigate_to": "/products",
  "filters": {
    "category": "mobiles",
    "max_price": 10000
  },
  "data": {
    "products": [...],
    "total": 7
  },
  "assistant_message": "Here are 7 mobile phones under ₹10,000"
}
{
  "id": "MOB001",
  "name": "Samsung Galaxy M14 5G",
  "category": "mobiles",
  "brand": "Samsung",
  "price": 11999,
  "rating": 4.2,
  "thumbnail": "/images/mob001.jpg",
  "specs": {
    "display": "6.6 inch FHD+",
    "processor": "Exynos 1330",
    "ram": "4GB",
    "storage": "64GB",
    "battery": "6000mAh",
    "camera": "50MP Triple"
  },
  "in_stock": true,
  "description": "Budget 5G smartphone with massive battery"
}
| Category | Products | Price Range |
|---|---|---|
| Mobiles | 12 | ₹6,999 - ₹1,49,999 |
| Laptops | 10 | ₹29,999 - ₹2,49,999 |
| Accessories | 12 | ₹199 - ₹24,999 |
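One way to hold the product record in the backend is a small dataclass that mirrors the JSON fields one to one. This is only a sketch: the `Product` class and its `from_dict` helper are hypothetical, and the category check assumes the three categories in the table above are the complete set.

```python
from dataclasses import dataclass

CATEGORIES = {"mobiles", "laptops", "accessories"}


@dataclass
class Product:
    id: str
    name: str
    category: str
    brand: str
    price: int        # INR, whole rupees
    rating: float
    thumbnail: str
    specs: dict       # free-form key/value pairs, varies by category
    in_stock: bool
    description: str

    @classmethod
    def from_dict(cls, raw: dict) -> "Product":
        """Build a Product from a JSON-decoded dict, rejecting unknown categories."""
        if raw["category"] not in CATEGORIES:
            raise ValueError(f"unknown category: {raw['category']}")
        return cls(**raw)


SAMPLE = {
    "id": "MOB001", "name": "Samsung Galaxy M14 5G", "category": "mobiles",
    "brand": "Samsung", "price": 11999, "rating": 4.2,
    "thumbnail": "/images/mob001.jpg",
    "specs": {"display": "6.6 inch FHD+", "processor": "Exynos 1330",
              "ram": "4GB", "storage": "64GB", "battery": "6000mAh",
              "camera": "50MP Triple"},
    "in_stock": True,
    "description": "Budget 5G smartphone with massive battery",
}
product = Product.from_dict(SAMPLE)
```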
| Voice Command | Action | Parameters |
|---|---|---|
| "Show me mobile phones" | Navigate + Filter | category: mobiles |
| "Show laptops under 50000" | Navigate + Filter | category: laptops, max_price: 50000 |
| "Filter by Samsung" | Apply Filter | brand: Samsung |
| "Sort by price low to high" | Apply Sort | sort_by: price_asc |
| "Show me everything" | Clear Filters | (none) |
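The table leaves open how a follow-up command combines with filters that are already active. The sketch below assumes the simplest rule: new parameters are merged over the current ones, and a call with no parameters clears everything. `next_filters` is a hypothetical helper, and the merge rule is an assumption rather than specified behavior.

```python
def next_filters(current: dict, params: dict) -> dict:
    """Combine active filters with the parameters of a new voice command."""
    if not params:
        # "Show me everything": no parameters means clear all filters.
        return {}
    # Later parameters override earlier ones with the same key.
    return {**current, **params}


# Walk through the commands from the table in sequence.
state = {}
state = next_filters(state, {"category": "laptops", "max_price": 50000})
state = next_filters(state, {"brand": "Samsung"})
state = next_filters(state, {"sort_by": "price_asc"})
combined = state
cleared = next_filters(state, {})
```

With this rule, "Filter by Samsung" after "Show laptops under 50000" narrows the laptop results instead of replacing them.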
| Voice Command | Action |
|---|---|
| "Tell me about the first product" | Show details of product at index 0 |
| "What are the specs of the second one" | Show specs of product at index 1 |
| "Compare first and third" | Show comparison view |
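Follow-up commands such as "the first product" only make sense relative to the last result list. A minimal sketch of that lookup is shown below; the `ORDINALS` table and `resolve_reference` are hypothetical names, and returning `None` for an unknown or out-of-range reference is an assumption about how the caller would trigger a clarification.

```python
# Spoken ordinal -> zero-based index into the last result list.
ORDINALS = {"first": 0, "second": 1, "third": 2, "fourth": 3, "fifth": 4}


def resolve_reference(word: str, last_products: list):
    """Return the product a spoken ordinal refers to, or None if it cannot."""
    index = ORDINALS.get(word.lower())
    if index is None or index >= len(last_products):
        return None
    return last_products[index]
```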
| Voice Command | Action |
|---|---|
| "Go to home page" | Navigate to / |
| "Show all products" | Navigate to /products |
| "Open my profile" | Navigate to /profile |
{
  "type": "function",
  "name": "search_products",
  "description": "Search and filter products in the e-commerce store. Use this when user wants to find, filter, or browse products.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search text to find in product names"
      },
      "category": {
        "type": "string",
        "enum": ["mobiles", "laptops", "accessories"],
        "description": "Product category to filter by"
      },
      "min_price": {
        "type": "integer",
        "description": "Minimum price in INR"
      },
      "max_price": {
        "type": "integer",
        "description": "Maximum price in INR"
      },
      "brand": {
        "type": "string",
        "description": "Brand name to filter by"
      },
      "sort_by": {
        "type": "string",
        "enum": ["price_asc", "price_desc", "rating", "relevance"],
        "description": "Sort order for results"
      }
    }
  }
}
- voiceSession: WebSocket connection state
- isListening: Whether voice is active
- lastProducts: Products from last query (for follow-ups)

- products: Current product list
- filters: Active filters
- sortBy: Current sort order
- loading: Loading state
When a ui_update event is received:
- If navigate_to is present → Router navigates
- Update filters state
- Update products from data.products
- Display assistant_message in voice transcript area
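The four steps above amount to a pure state transition. It is sketched here in Python for consistency with the other examples; in the application this logic would live in the React frontend. The state keys `route` and `transcript`, and the function name `apply_ui_update`, are assumptions made for illustration.

```python
def apply_ui_update(state: dict, event: dict) -> dict:
    """Return the next UI state for a ui_update event, leaving the input untouched."""
    if event.get("type") != "ui_update":
        return state  # other event types are handled elsewhere

    new_state = dict(state)
    if event.get("navigate_to"):
        new_state["route"] = event["navigate_to"]
    new_state["filters"] = event.get("filters", {})
    new_state["products"] = event.get("data", {}).get("products", [])
    if "assistant_message" in event:
        # Append to a new list so the previous state is not mutated.
        new_state["transcript"] = state.get("transcript", []) + [event["assistant_message"]]
    return new_state
```

Treating the handler as a pure function means a malformed event can be rejected before it touches the rendered state.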
- Floating button (bottom-right, z-50)
- Persistent across all pages
- Visual states: idle, listening, processing, speaking
- Shows real-time transcript
- Waveform animation when active
- Category checkboxes
- Price range slider (₹0 - ₹250,000)
- Brand multi-select
- Clear all filters button
- Responsive grid (1-4 columns)
- ProductCard with image, name, price, rating
- Loading skeleton
- Empty state
| Scenario | Behavior |
|---|---|
| OpenAI connection fails | Show error toast, disable voice button |
| Invalid voice command | AI asks for clarification |
| No products match | Show empty state + AI explains |
| Network error | Retry with exponential backoff |
| Audio permission denied | Show permission request modal |
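The "retry with exponential backoff" row could be implemented roughly as below. The base delay, growth factor, cap, and attempt count are placeholder values, not figures from this document, and `retry` is a hypothetical helper; the `sleep` parameter is injectable so the logic can be exercised without waiting.

```python
import time


def backoff_delays(base=0.5, factor=2.0, cap=8.0, attempts=5):
    """Yield delays that grow geometrically and never exceed the cap."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def retry(operation, delays, sleep=time.sleep,
          retry_on=(ConnectionError, TimeoutError)):
    """Call operation, sleeping between failed attempts; re-raise on the last one."""
    for delay in delays:
        try:
            return operation()
        except retry_on:
            sleep(delay)
    # Final attempt: any error propagates to the caller.
    return operation()
```

With the defaults, the waits are 0.5, 1, 2, 4 and 8 seconds, after which the error surfaces and the UI can show its error state.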
- Functional Parity: Voice search returns identical results to typed search
- UI Stability: UI never breaks due to LLM errors
- Single API: The same search_products() powers all interactions
- Responsiveness: UI updates in under 200 ms after the voice command is processed
- Professional UX: Demo feels like a real e-commerce site
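One way to support the UI Stability criterion is to sanitize the arguments the model supplies before they reach search_products(). The sketch below checks them against the enums and types in the tool definition; `sanitize_arguments` is a hypothetical helper, and silently dropping invalid values (rather than raising) is an assumed policy.

```python
ALLOWED_ENUMS = {
    "category": {"mobiles", "laptops", "accessories"},
    "sort_by": {"price_asc", "price_desc", "rating", "relevance"},
}
INT_FIELDS = {"min_price", "max_price"}
STR_FIELDS = {"query", "brand"}


def sanitize_arguments(raw: dict) -> dict:
    """Keep only arguments that match the tool schema; drop everything else."""
    clean = {}
    for key, value in raw.items():
        if key in ALLOWED_ENUMS:
            if isinstance(value, str) and value in ALLOWED_ENUMS[key]:
                clean[key] = value
        elif key in INT_FIELDS:
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
                clean[key] = value
        elif key in STR_FIELDS:
            if isinstance(value, str) and value.strip():
                clean[key] = value.strip()
        # Keys not in the schema are ignored.
    return clean
```

A hallucinated category or a non-numeric price then degrades to a broader search instead of an exception in the request path.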
- PRD documentation
- Backend product data
- search_products API
- Basic React app with routing
- Product grid and cards
- Filter sidebar
- Navigation
- WebSocket endpoint
- OpenAI Realtime client
- VoiceController component
- ui_update event handling
- Animations and transitions
- Error handling
- Loading states
- Mobile responsiveness