Example implementations for building real-time voice AI applications with Sayna.
Sayna is a real-time voice processing platform that provides:
- Speech-to-Text (STT) - Real-time transcription
- Text-to-Speech (TTS) - Natural voice synthesis
- LiveKit Integration - Multi-participant voice rooms
┌─────────────────────────────────────────────────────────────────────────────┐
│                               COMPLETE SYSTEM                               │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                           Your Application                            │  │
│  │                                                                       │  │
│  │   ┌─────────────────────┐          ┌─────────────────────┐            │  │
│  │   │   React Frontend    │          │   NestJS Backend    │            │  │
│  │   │      (Browser)      │          │      (Server)       │            │  │
│  │   │                     │          │                     │            │  │
│  │   │  @sayna-ai/js-sdk   │          │ @sayna-ai/node-sdk  │            │  │
│  │   └──────────┬──────────┘          └──────────┬──────────┘            │  │
│  │              │                                │                       │  │
│  └──────────────┼────────────────────────────────┼───────────────────────┘  │
│                 │                                │                          │
│  ┌──────────────┼────────────────────────────────┼───────────────────────┐  │
│  │              │         Sayna Platform         │                       │  │
│  │              │                                │                       │  │
│  │              ▼                                ▼                       │  │
│  │   ┌─────────────────────┐          ┌─────────────────────┐            │  │
│  │   │   LiveKit Server    │          │      Sayna API      │            │  │
│  │   │    (audio/text)     │<────────>│      (STT/TTS)      │            │  │
│  │   └─────────────────────┘          └─────────────────────┘            │  │
│  │                                                                       │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
| Example | Description | Tech Stack |
|---|---|---|
| react-vite-ui | Voice chat frontend | React, TypeScript, Vite, Sayna JS SDK |
| nestjs-ai-sdk-server | Voice AI backend | NestJS, Vercel AI SDK, Google Gemini, Sayna Node SDK |
┌────────────────────────────────────────────────────────────────┐
│                    GET RUNNING IN 5 MINUTES                    │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  Terminal 1 (Backend)          Terminal 2 (Frontend)           │
│  ────────────────────          ─────────────────────           │
│                                                                │
│  cd nestjs-ai-sdk-server       cd react-vite-ui                │
│  cp .env.example .env          cp .env.example .env            │
│  # Edit .env with your keys    # API_ENDPOINT is pre-set       │
│  npm install                   npm install                     │
│  npm run start:dev             npm run dev                     │
│                                                                │
│  Server: localhost:4000        App: localhost:5173             │
│                                                                │
└────────────────────────────────────────────────────────────────┘
Backend (nestjs-ai-sdk-server/.env):
SAYNA_URL=https://api.sayna.ai            # Your Sayna API URL
SAYNA_API_KEY=your-key                    # Optional: API key
GOOGLE_GENERATIVE_AI_API_KEY=your-key     # Required: Google AI key
PORT=4000

Frontend (react-vite-ui/.env):
API_ENDPOINT=http://localhost:4000/start
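
The backend variables above can be checked once at startup so that a missing key fails fast instead of surfacing mid-call. The following is only a minimal sketch: `loadBackendConfig` and `BackendConfig` are hypothetical names, not part of the example server or the Sayna SDKs, and treating `SAYNA_URL` as required and defaulting `PORT` to 4000 are assumptions based on the comments in the `.env` listing.

```typescript
// Hypothetical helper for illustration; not taken from the example repository.
interface BackendConfig {
  saynaUrl: string;
  saynaApiKey?: string; // optional, per the .env comments
  googleApiKey: string; // required, per the .env comments
  port: number;
}

// Takes the environment as a plain object so it can be exercised without Node.
function loadBackendConfig(env: Record<string, string | undefined>): BackendConfig {
  const saynaUrl = env["SAYNA_URL"];
  const googleApiKey = env["GOOGLE_GENERATIVE_AI_API_KEY"];

  // Assumption: the server cannot run without a Sayna URL.
  if (!saynaUrl || !/^https?:\/\//.test(saynaUrl)) {
    throw new Error("SAYNA_URL must be set to an http(s) URL");
  }
  if (!googleApiKey) {
    throw new Error("GOOGLE_GENERATIVE_AI_API_KEY is required");
  }

  // Assumption: fall back to 4000, the port shown in the quick start.
  const port = Number(env["PORT"] ?? "4000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }

  return { saynaUrl, saynaApiKey: env["SAYNA_API_KEY"], googleApiKey, port };
}
```

In a Node process this could be called as `loadBackendConfig(process.env)` before the server starts listening.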
┌─────────────────────────────────────────────────────────────────┐
│                     VOICE CONVERSATION FLOW                     │
└─────────────────────────────────────────────────────────────────┘
  USER                  BROWSER                SERVER               AI
    │                      │                      │                  │
    │  1. Click "Call"     │                      │                  │
    │─────────────────────>│                      │                  │
    │                      │  2. POST /start      │                  │
    │                      │─────────────────────>│                  │
    │                      │  3. Token + URL      │                  │
    │                      │<─────────────────────│                  │
    │                      │                      │                  │
    │                      │  4. Connect LiveKit  │                  │
    │                      │─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ──│                  │
    │                      │                      │                  │
    │  5. Speak "Hello"    │                      │                  │
    │─────────────────────>│  6. Audio stream     │                  │
    │                      │─────────────────────>│                  │
    │                      │                      │ 7. STT: "Hello"  │
    │                      │                      │─────────────────>│
    │                      │                      │ 8. AI response   │
    │                      │                      │<─────────────────│
    │                      │  9. TTS audio        │                  │
    │                      │<─────────────────────│                  │
    │  10. Hear response   │                      │                  │
    │<─────────────────────│                      │                  │
    │                      │                      │                  │
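
Steps 2 and 3 of the flow (the browser posts to `/start` and receives a token and URL) could be sketched roughly as below. This is an illustration under assumptions, not the example app's code: the response field names `token` and `liveKitUrl`, and the names `StartResponse`, `PostJson`, `parseStartResponse`, and `startCall`, are all hypothetical. The HTTP transport is injected so the sketch does not depend on any particular client or on the Sayna JS SDK.

```typescript
// Assumed shape of the POST /start response; the real field names may differ.
interface StartResponse {
  token: string;
  liveKitUrl: string;
}

// Injected transport: performs the POST and resolves with the parsed JSON body.
type PostJson = (url: string) => Promise<unknown>;

// Validate the untyped JSON before the token is handed to a room connection.
function parseStartResponse(data: unknown): StartResponse {
  if (typeof data !== "object" || data === null) {
    throw new Error("POST /start returned a non-object body");
  }
  const record = data as Record<string, unknown>;
  const token = record["token"];
  const liveKitUrl = record["liveKitUrl"];
  if (typeof token !== "string" || typeof liveKitUrl !== "string") {
    throw new Error("POST /start response is missing token or liveKitUrl");
  }
  return { token, liveKitUrl };
}

// Step 2 (request) and step 3 (token + URL) of the flow.
async function startCall(endpoint: string, post: PostJson): Promise<StartResponse> {
  const data = await post(endpoint);
  return parseStartResponse(data);
}
```

In the browser, `post` would typically wrap `fetch` with `method: "POST"` and return `response.json()`, and `endpoint` would be the `API_ENDPOINT` value from the frontend `.env`. Step 4, connecting to LiveKit, is left to the SDK and is not shown here.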
┌─────────────────────────────────────────────────────────────────────────────┐
│                         COMPONENT RESPONSIBILITIES                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │  REACT FRONTEND (react-vite-ui)                                     │   │
│   │  ──────────────────────────────                                     │   │
│   │  - Renders voice chat UI                                            │   │
│   │  - Captures microphone audio                                        │   │
│   │  - Displays transcriptions and AI responses                         │   │
│   │  - Plays AI voice responses                                         │   │
│   │  - Manages connection state                                         │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │  NESTJS BACKEND (nestjs-ai-sdk-server)                              │   │
│   │  ─────────────────────────────────────                              │   │
│   │  - Generates LiveKit tokens                                         │   │
│   │  - Runs VoiceAgent (STT listener + AI responder + TTS speaker)      │   │
│   │  - Integrates Google Gemini for AI responses                        │   │
│   │  - Manages conversation history per room                            │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │  SAYNA PLATFORM (external)                                          │   │
│   │  ─────────────────────────                                          │   │
│   │  - Provides STT (Deepgram, Google, etc.)                            │   │
│   │  - Provides TTS (ElevenLabs, Google, etc.)                          │   │
│   │  - Manages LiveKit rooms                                            │   │
│   │  - Handles real-time audio streaming                                │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
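
The backend's "conversation history per room" responsibility can be pictured as a map keyed by room name. The sketch below is an assumption about one reasonable shape, not the example server's implementation: `RoomHistoryStore`, `Turn`, and the `maxTurns` cap are hypothetical names and choices.

```typescript
// Hypothetical types for illustration; the example server may model turns differently.
type Role = "user" | "assistant";

interface Turn {
  role: Role;
  text: string;
}

class RoomHistoryStore {
  private readonly rooms = new Map<string, Turn[]>();
  private readonly maxTurns: number;

  // Assumption: cap the history so the prompt sent to the model stays bounded.
  constructor(maxTurns: number = 20) {
    this.maxTurns = maxTurns;
  }

  // Append a turn to a room, dropping the oldest turns beyond the cap.
  append(room: string, turn: Turn): void {
    const turns = this.rooms.get(room) ?? [];
    turns.push(turn);
    if (turns.length > this.maxTurns) {
      turns.splice(0, turns.length - this.maxTurns);
    }
    this.rooms.set(room, turns);
  }

  // Return the turns for a room, oldest first (empty if the room is unknown).
  get(room: string): readonly Turn[] {
    return this.rooms.get(room) ?? [];
  }

  // Forget a room, for example when the call ends.
  clear(room: string): void {
    this.rooms.delete(room);
  }
}
```

Keying by room keeps concurrent calls isolated from each other. An in-memory map is lost on restart, so a longer-lived deployment would likely need external storage.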
| SDK | Package | Used In | Purpose |
|---|---|---|---|
| JS SDK | @sayna-ai/js-sdk | Frontend | Browser voice rooms |
| Node SDK | @sayna-ai/node-sdk | Backend | Server voice processing |
┌──────────────────────────────────────────────────────────────────────────┐
│  Best for: Web apps with AI-powered voice assistants                     │
│                                                                          │
│  ┌───────────┐      REST      ┌───────────┐   WebSocket    ┌────────┐    │
│  │  Browser  │─────/start────>│  Server   │───────────────>│ Sayna  │    │
│  │           │                │           │                │        │    │
│  │  JS SDK   │────LiveKit────>│ Node SDK  │<──────────────>│        │    │
│  │           │    (audio)     │           │   (STT/TTS)    │        │    │
│  └───────────┘                └─────┬─────┘                └────────┘    │
│                                     │                                    │
│                                     ▼                                    │
│                               ┌───────────┐                              │
│                               │ AI Model  │                              │
│                               │ (Gemini)  │                              │
│                               └───────────┘                              │
└──────────────────────────────────────────────────────────────────────────┘
| Resource | Link |
|---|---|
| Sayna Docs | docs.sayna.ai |
| Sayna Repository | github.com/saynaai/sayna |
| SDK Repository | github.com/saynaai/saysdk |