Multi-character AI voice agent with Knowledge Graph memory.
Self-contained project with all models and dependencies bundled. Designed for Hybrid environments: Run the heavy AI core (IndexTTS2) in WSL for Linux performance, while keeping your tools (LM Studio, Microphone) on Windows.
- Multi-Character: Isolated memory graphs and voices (Hermione, Lisbeth, Assistant).
- Graph Memory: SQLite-based Knowledge Graph with GraphRAG community detection.
- Agent Skills: MCP tools, Everything search, web search, file access, and more.
- Dual TTS: IndexTTS2 (voice cloning) or Kokoro (fast ONNX, ~80ms latency).
- Emotion Detection: wav2vec2-based speech emotion recognition.
- Hybrid Architecture: Seamlessly bridges WSL (AI) and Windows (Audio/Tools).
- Flexible AI: Supports OpenRouter (Cloud) and LM Studio (Local).
New User? Read the Installation Guide for detailed setup.
Right-click VoiceChat.bat → Run as administrator
Note: Administrator privileges are required for WSL2 installation and keyboard hooks (PTT).
Select Option [5] Install Dependencies.
- Sets up Python venv in WSL (for AI).
- Sets up Audio tools in Windows.
Edit config.env:
OPENROUTER_API_KEY=sk-or-v1-your-key...
# LM_STUDIO_HOST= <-- Leave commented for Auto-Detection!Select Option [1] Start Voice Chat.
Moving to a new PC? This project is now fully self-contained!
Quick summary:
- Copy
C:\AI\tts2-voice-agentto external drive (~9GB) - On new PC: Copy folder and run
python setup_new_pc.py
All models and dependencies are bundled in the models/ and lib/ directories.
WINDOWS HOST
┌──────────────┐ ┌──────────────┐ ┌─────────────┐
│ PTT / VAD │ │ LM Studio │ │ Microphone │
│ (Python) │ │ (Local AI) │ │ (Input) │
└──────┬───────┘ └──────▲───────┘ └──────┬──────┘
│ Audio │ HTTP │ Audio
▼ ▼ ▼
═══════╪══════════════════╪══════════════════╪═══════════
│ │ │
│ WSL (UBUNTU) │
┌──────▼───────┐ ┌──────┴───────┐ ┌──────▼──────┐
│ App Server │◄─►│ LLM Client │ │ Whisper │
│ (Gradio UI) │ │ (Logic) │ │ STT │
└──────┬───────┘ └──────┬───────┘ └─────────────┘
│ │
▼ ▼
┌──────────────┐ ┌──────────────┐ ┌─────────────┐
│ Services │ │ Graph Memory │ │ Emotion │
│ Layer ◆ │ │ + GraphRAG │ │ Detection │
└──────┬───────┘ └──────────────┘ └─────────────┘
│
▼
┌──────────────┐ ┌──────────────┐
│ IndexTTS2 │ │ Kokoro │
│ (High Fid) │ │ (Fast) │
└──────────────┘ └──────────────┘
◆ Phase 1 Complete: Service layer extracted for better maintainability
Access the voice agent from your phone or any device on your network.
- Right-click
VoiceChat.bat→ Run as administrator - Select Option [2] Mobile/Remote Access
- A public HTTPS URL will be generated (e.g.,
https://xxxxx.gradio.live) - Open the URL on your phone
- Mobile PTT Button - Large touch-friendly "HOLD TO TALK" button
- PWA Support - Install as a webapp on your phone (Add to Home Screen)
- HTTPS Required - Microphone access requires secure connection (handled automatically)
- The Gradio share URL is temporary (72 hours max)
- For permanent access, consider Cloudflare Tunnel
- PWA install: On mobile browser, tap Share > "Add to Home Screen"
| Action | Key / Mode |
|---|---|
| Talk (PTT) | Hold Right Shift (Release to send) |
| Mobile PTT | Hold the "HOLD TO TALK" button |
| Hands-Free | Toggle "Hands-Free Mode" in UI |
| Stop Audio | Click "Stop" in UI |
| Backend | Speed | Quality | VRAM | Voice Clone |
|---|---|---|---|---|
| IndexTTS2 | ~800ms | Excellent | 4-6GB | Yes (5s sample) |
| Kokoro | ~80ms | Good | ~500MB | Preset voices |
Switch in Settings > Audio Settings > TTS Backend.
| Tool | Description |
|---|---|
everything_search |
PC-wide file search (requires Everything) |
web_search |
DuckDuckGo search (no API key) |
wikipedia |
Wikipedia lookups |
read_file / write_file |
Sandboxed file access |
create_skill |
Agent creates new skills |
| MCP Tools | Via mcp_config.json |
- Installation Guide - Step-by-step setup.
- User Manual - Complete user guide.
- Architecture Visuals - Diagrams of Hybrid System & Memory.
- Technical Reference - Deep dive into Graph Memory & MCP.
- Migration Guide - Moving to a new PC.
- Disclaimer - Legal & Ethical guidelines.
| Issue | Solution |
|---|---|
| LM Studio Orange/Red | Run as Admin, Port 1235. See Docs. |
| PTT Not Working | Use Right Shift. Ensure VoiceChat.bat is running. |
| Connection Refused | Check config.env. Comment out LM_STUDIO_HOST. |
| Emotion always "sad" | Likely audio issue - check mic levels |
MIT License. See LICENSE for details.