nandinigthub/VoiceAssistant


RAG Voice Assistant with VideoSDK

A voice assistant that integrates VideoSDK, a RAG pipeline, and two custom APIs for document handling: one to upload PDFs to a vector database and another to search relevant content.

The system provides a full voice flow:

STT → RAG (docs) → LLM → TTS

with a custom VideoSDK plugin for seamless integration.
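The flow above can be sketched as a chain of four callables. This is a minimal illustration only, not the repository's plugin: `CascadePipeline` and the stub stages are hypothetical names, and the real stages would call Deepgram, Qdrant, an LLM, and ElevenLabs instead of the lambdas used here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CascadePipeline:
    """Chains the four stages; every callable is a hypothetical stand-in."""

    stt: Callable[[bytes], str]            # audio -> transcript
    retrieve: Callable[[str], List[str]]   # transcript -> relevant document chunks
    llm: Callable[[str, List[str]], str]   # transcript + chunks -> answer text
    tts: Callable[[str], bytes]            # answer text -> audio

    def run(self, audio: bytes) -> bytes:
        transcript = self.stt(audio)
        chunks = self.retrieve(transcript)
        answer = self.llm(transcript, chunks)
        return self.tts(answer)


# Stub stages so the sketch runs without any API keys.
pipeline = CascadePipeline(
    stt=lambda audio: audio.decode("utf-8"),
    retrieve=lambda query: [f"chunk about {query}"],
    llm=lambda query, chunks: f"Answer to '{query}' using {len(chunks)} chunk(s)",
    tts=lambda text: text.encode("utf-8"),
)

reply = pipeline.run(b"videosdk integration")
print(reply.decode("utf-8"))
```

Keeping each stage behind a plain callable makes it easy to swap a provider or to test the chain with stubs, as done here.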

Features

  • Voice Interaction: Real-time speech recognition using Deepgram STT.
  • Document Retrieval: Upload PDFs to a Qdrant vector database and retrieve context for LLM responses.
  • RAG Pipeline: Retrieves top-k relevant chunks from uploaded documents for more accurate answers.
  • Text-to-Speech: ElevenLabs TTS reads out LLM responses.
  • Custom VideoSDK Plugin: Integrates STT, RAG, LLM, and TTS in a single cascading pipeline.
  • Interruptible Conversation: User speech interrupts ongoing TTS or LLM generation.
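To make the top-k retrieval step concrete, here is a small pure-Python sketch that ranks chunks by cosine similarity. It is an assumption-laden illustration: in the project the ranking would presumably be done by Qdrant, and the `cosine` and `top_k` helpers and the two-dimensional toy vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[str]:
    """Return the text of the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy embeddings (hypothetical); real embeddings have hundreds of dimensions.
chunks = [
    ("VideoSDK agents setup", [1.0, 0.0]),
    ("Heart anatomy", [0.0, 1.0]),
    ("RAG pipeline overview", [0.8, 0.6]),
]

print(top_k([1.0, 0.1], chunks, k=2))
```

The returned chunk texts are what would be passed to the LLM as context.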

Setup

  1. Create a .env file with the following keys:
     OPENAI_API_KEY=<your_openai_api_key>
     DEEPGRAM_API_KEY=<your_deepgram_api_key>
     ELEVENLABS_API=<your_elevenlabs_api_key>
     ROOM_ID=<videosdk_room_id>
     AUTH_TOKEN=<videosdk_auth_token>
     VECTOR_DB_URL=<qdrant_url>
     VECTOR_DB_API_KEY=<qdrant_api_key>
  2. Run Qdrant locally (requires Docker; 6333 is Qdrant's default HTTP port):
     docker pull qdrant/qdrant
     docker run -p 6333:6333 qdrant/qdrant
  3. Create and activate a virtual environment (Windows commands shown; on macOS/Linux use python3 -m venv venv and source venv/bin/activate):
     py -m venv venv
     venv\Scripts\activate
  4. Install the requirements:
     pip install -r requirements.txt
  5. Start the document upload and retrieval API:
     cd backend/
     uvicorn main:app --reload --port 8000
  6. Start the voice agent in console mode (in a separate terminal):
     python voice_agent.py console
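Before starting the API or the agent, it can help to confirm that every key from the sample .env is set. The sketch below is not part of the repository: `REQUIRED_VARS` simply mirrors the keys listed in the setup steps, `missing_vars` is a hypothetical helper, and the script assumes the variables have already been loaded into the process environment.

```python
import os
from typing import List, Mapping

# Keys taken from the sample .env in the setup steps.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "ELEVENLABS_API",
    "ROOM_ID",
    "AUTH_TOKEN",
    "VECTOR_DB_URL",
    "VECTOR_DB_API_KEY",
]


def missing_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required keys that are absent or blank in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_vars(os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Passing the environment in as a mapping keeps the check easy to test with a plain dictionary.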

Sample Queries

user_input: Guide to VideoSDK integration with RAG pipeline?
agent_output: Build an AI Agent with RAG using VideoSDK Agents SDK Goal Your task is to build a voice AI agent using the VideoSDK Agents SDK. T... (from the uploaded document)

user_input: Explain human heart?
agent_output: The human heart is a muscular organ, roughly the size of a fist, located in the chest that pumps blood throughout the body... (LLM response)
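The two samples show one answer grounded in an uploaded document and one answered by the LLM alone. One way such a fallback could work is sketched below; this is an assumption, not the repository's logic. `build_prompt`, the `(text, score)` pairs, and the `min_score` threshold of 0.5 are all hypothetical.

```python
from typing import List, Tuple


def build_prompt(
    question: str,
    scored_chunks: List[Tuple[str, float]],
    min_score: float = 0.5,
) -> str:
    """Attach retrieved context only when a chunk is similar enough."""
    relevant = [text for text, score in scored_chunks if score >= min_score]
    if not relevant:
        # Nothing relevant was retrieved: let the LLM answer on its own.
        return question
    context = "\n".join(f"- {text}" for text in relevant)
    return (
        "Answer using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# A document-backed question keeps its context; an unrelated one does not.
print(build_prompt("Guide to VideoSDK integration?", [("VideoSDK Agents SDK guide", 0.9)]))
print(build_prompt("Explain human heart?", [("VideoSDK Agents SDK guide", 0.1)]))
```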
