Mubarraqqq/STS_Robot
🗣️ Speech-to-Speech Retrieval-Augmented Assistant (Bruce)



Bruce is an intelligent voice assistant pipeline that listens, understands, retrieves, reasons, and speaks — enabling seamless speech-to-speech interaction powered by Retrieval-Augmented Generation (RAG), Whisper-based transcription, and wake-word activation via Porcupine.


🚀 Features

  • 🎙️ Wake Word Detection: Trigger interaction with the phrase “Hey Bruce” using Picovoice Porcupine.
  • 🧠 RAG-based Reasoning: Context-aware generation using an embedded vector database (faiss_index.idx) with chunked document retrieval.
  • 🗣️ Speech-to-Text + Text-to-Speech: Uses Whisper for transcription and pyttsx3 (or compatible engine) for natural voice synthesis.
  • Real-time Interaction: Listens for user input, queries a document corpus, and returns spoken responses in a human-like loop.

🧱 Project Structure

├── S2S_v8.py / S2S_v9.py # Main assistant logic
├── .env # Environment variables (API keys, config)
├── requirements.txt # Python dependencies
├── doc_chunks.pkl # Pickled document chunks for retrieval
├── embeddings.npy # Pre-computed document embeddings
├── faiss_index.idx # FAISS index for fast similarity search
├── Hey-Bruce_en_windows_v3_0_0.ppn # Porcupine wake word model
├── info.txt # Text corpus the RAG system retrieves from


🛠️ Setup Instructions

1. Set up Python environment

pip install -r requirements.txt

2. Add .env file

Create a .env file and add required environment variables:

TOGETHER_API_KEY=your_api_key
PORCUPINE_API_KEY=your_api_key
GROQ_API_KEY=your_api_key
OPEN_API=your_api_key

3. Run the assistant

python S2S_v9.py

🧠 How It Works

🔊 Wake Word Detection

Listens for the wake word “Hey Bruce” using a .ppn model via Porcupine.
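A minimal sketch of this loop, assuming the Picovoice access key lives in the PORCUPINE_API_KEY variable from the .env example above; the pvporcupine and sounddevice calls are those libraries' real APIs, but the frames helper and audio plumbing are illustrative, not the repo's exact code:

```python
# Hedged sketch of the wake-word loop, not the repo's exact implementation.
import os
import struct

def frames(pcm, frame_length):
    """Split a flat list of PCM samples into fixed-size frames, dropping any tail."""
    for i in range(0, len(pcm) - frame_length + 1, frame_length):
        yield pcm[i:i + frame_length]

def wait_for_wake_word():  # requires a microphone and a Picovoice access key
    import pvporcupine
    import sounddevice as sd
    porcupine = pvporcupine.create(
        access_key=os.environ["PORCUPINE_API_KEY"],
        keyword_paths=["Hey-Bruce_en_windows_v3_0_0.ppn"],
    )
    stream = sd.RawInputStream(samplerate=porcupine.sample_rate, channels=1,
                               dtype="int16", blocksize=porcupine.frame_length)
    with stream:
        while True:
            buf, _ = stream.read(porcupine.frame_length)
            pcm = struct.unpack_from("h" * porcupine.frame_length, buf)
            if porcupine.process(pcm) >= 0:
                porcupine.delete()
                return  # "Hey Bruce" detected
```

Porcupine consumes fixed-size 16-bit mono frames at porcupine.sample_rate, which is why the stream's blocksize is tied to porcupine.frame_length.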

🗣️ Speech-to-Text (STT)

Captures and transcribes user speech using OpenAI’s Whisper.
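The capture-then-transcribe step could look like the sketch below; the 5-second recording window and the "base" Whisper model size are assumptions, not settings taken from the repo:

```python
# Hedged sketch: record a fixed-length utterance, save it, transcribe with Whisper.
import wave
import numpy as np

SAMPLE_RATE = 16_000  # Whisper works on 16 kHz mono audio

def save_wav(samples, path, sample_rate=SAMPLE_RATE):
    """Write mono int16 samples to a WAV file that Whisper can read."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)            # 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(np.asarray(samples, dtype=np.int16).tobytes())

def transcribe_once(seconds=5):  # needs a microphone and openai-whisper installed
    import sounddevice as sd
    import whisper
    audio = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=1, dtype="int16")
    sd.wait()                         # block until recording finishes
    save_wav(audio.ravel(), "utterance.wav")
    model = whisper.load_model("base")
    return model.transcribe("utterance.wav")["text"]
```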

🔍 Contextual Retrieval

Searches document embeddings using FAISS to find the most relevant context from doc_chunks.pkl.

🧠 Language Generation

Sends the query plus retrieved context to a language model (e.g., a LLaMA model hosted on Together AI) and returns the response.
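The query + context step can be sketched as below; the prompt template is an assumption, and the endpoint and model name merely illustrate Together AI's OpenAI-compatible chat API, not the repo's exact call:

```python
# Hedged sketch of the RAG generation step.
import os

def build_prompt(question, context_chunks):
    """Combine the retrieved chunks and the user's question into one prompt."""
    context = "\n\n".join(context_chunks)
    return ("Answer using only the context below.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}\nAnswer:")

def generate_answer(question, context_chunks):  # needs TOGETHER_API_KEY set
    import requests
    resp = requests.post(
        "https://api.together.xyz/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"},
        json={"model": "meta-llama/Llama-3-8b-chat-hf",  # illustrative model name
              "messages": [{"role": "user",
                            "content": build_prompt(question, context_chunks)}]},
        timeout=60,
    )
    return resp.json()["choices"][0]["message"]["content"]
```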

🔁 Text-to-Speech (TTS)

Converts the generated response back to voice using pyttsx3.
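The pyttsx3 calls below are that library's real API; the sentence splitter is an illustrative helper for smoother playback, not code from the repo:

```python
# Hedged TTS sketch: speak the response sentence by sentence.
import re

def split_sentences(text):
    """Break a long response into sentences so playback is easier to follow."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def speak(text):  # needs a working audio backend (SAPI5, espeak, or NSSpeech)
    import pyttsx3
    engine = pyttsx3.init()
    engine.setProperty("rate", 175)   # words per minute; default varies by engine
    for sentence in split_sentences(text):
        engine.say(sentence)
    engine.runAndWait()               # blocks until speech finishes
```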


📦 Dependencies

Key libraries (see requirements.txt):

  • openai-whisper
  • pyttsx3
  • faiss-cpu
  • python-dotenv
  • numpy
  • pickle (Python standard library; no install needed)
  • sounddevice
  • pvporcupine (Porcupine wake word detection)

🧪 Example Use Case

You: "Hey Bruce, what is retrieval-augmented generation?"
Assistant: (searches documents)
Response: "Retrieval-Augmented Generation (RAG) is a method that improves language models by providing external context from a document database..."


📍 Roadmap

  • Add multilingual support
  • Improve response latency via local quantized models
  • Add GUI interface
  • Deploy as web API for Raspberry Pi/Edge use

🔐 Disclaimer

  • Ensure you set up your own personalized Porcupine wake-word model (.ppn file) for the phrase "Hey Bruce".
  • Also make sure an info.txt file is present; it serves as the text-corpus knowledge base for the RAG system.

🤝 Acknowledgements


📜 License

This project is licensed under the MIT License. See LICENSE for more information.
