
Conversation

@Robinbinu
Contributor

Summary

Implements a WebSocket endpoint for real-time text-to-speech streaming, enabling bidirectional communication and support for multiple concurrent users. Includes a complete demo client and an enhanced web UI with mode switching.

Key Features

WebSocket Endpoint (/ws)

  • Real-time TTS streaming with bidirectional communication
  • Multi-user support - each connection has a dedicated request queue
  • Base64-encoded audio chunks with WAV headers
  • Graceful error handling for unsupported engines
  • Auto-queuing - processes text requests sequentially per connection (see the sketch after this list)
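
A minimal sketch of the idea behind the endpoint, not the actual implementation in async_server.py - the message fields and the synthesize_to_wav_chunks() helper are illustrative placeholders:

```python
import asyncio
import base64

from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

async def synthesize_to_wav_chunks(text: str):
    # Placeholder: in the real server this is where the selected TTS engine
    # would stream WAV-framed audio chunks for the given text.
    yield b"\x00" * 3200

@app.websocket("/ws")
async def tts_websocket(websocket: WebSocket):
    await websocket.accept()
    queue: asyncio.Queue[str] = asyncio.Queue()  # dedicated request queue per connection

    async def worker():
        # Process queued texts sequentially for this connection.
        while True:
            text = await queue.get()
            async for chunk in synthesize_to_wav_chunks(text):
                await websocket.send_json({
                    "type": "audio",
                    "data": base64.b64encode(chunk).decode("ascii"),
                })
            await websocket.send_json({"type": "done"})

    task = asyncio.create_task(worker())
    try:
        while True:
            message = await websocket.receive_json()
            if message.get("engine") == "system":
                # pyttsx3 is not thread-safe; reject it with a clear error instead.
                await websocket.send_json({
                    "type": "error",
                    "message": "System engine (pyttsx3) is not supported over WebSocket",
                })
                continue
            await queue.put(message["text"])  # auto-queue incoming requests
    except WebSocketDisconnect:
        pass
    finally:
        task.cancel()
```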

Enhanced Web Interface

  • Mode toggle - switch between HTTP and WebSocket modes
  • Auto-send on typing pause - sends text after 1 second of inactivity
  • Dynamic UI - hides "Speak" button in WebSocket mode
  • Text field auto-clear after sending

WebSocket Client Demo (websocket_client.py)

  • Real-time audio playback using PyAudio
  • Audio file export - save received audio to WAV files
  • CLI support - custom test messages via command line arguments
  • Progress indicators with visual feedback
  • Graceful connection handling and cleanup (a stripped-down sketch follows below)
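
For reference, a stripped-down sketch of what such a client can look like. This is not the shipped websocket_client.py - the server URL, message format, and audio parameters are assumptions, and WAV-header handling and file export are omitted:

```python
import asyncio
import base64
import json

import pyaudio
import websockets

async def main():
    audio = pyaudio.PyAudio()
    # Assumed output format: 16-bit mono PCM at 24 kHz.
    stream = audio.open(format=pyaudio.paInt16, channels=1, rate=24000, output=True)

    async with websockets.connect("ws://localhost:8000/ws") as ws:
        await ws.send(json.dumps({"text": "Hello from the WebSocket demo client."}))
        async for raw in ws:
            message = json.loads(raw)
            if message.get("type") == "audio":
                # Chunks arrive base64-encoded; decode and play them as they come in.
                stream.write(base64.b64decode(message["data"]))
            elif message.get("type") == "done":
                break

    stream.stop_stream()
    stream.close()
    audio.terminate()

if __name__ == "__main__":
    asyncio.run(main())
```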

Engine Improvements

  • Dynamic engine initialization - auto-selects the first available engine (sketched after this list)
  • Graceful credential handling - skips engines with missing API keys
  • Engine name tracking - fixes voice retrieval errors
  • System engine protection - prevents WebSocket use with pyttsx3 (not thread-safe)
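
A hedged sketch of the selection logic: the engine class names come from RealtimeTTS, but the constructor arguments and environment variable names below are assumptions, and the real code in async_server.py may differ:

```python
import os

from RealtimeTTS import OpenAIEngine, KokoroEngine, AzureEngine, ElevenlabsEngine, SystemEngine

def init_available_engines():
    """Try each engine in order, keeping only the ones that initialize cleanly."""
    candidates = [
        ("openai", OpenAIEngine, "OPENAI_API_KEY"),
        ("kokoro", KokoroEngine, None),
        ("azure", AzureEngine, "AZURE_SPEECH_KEY"),
        ("elevenlabs", ElevenlabsEngine, "ELEVENLABS_API_KEY"),
        ("system", SystemEngine, None),
    ]
    engines = {}
    for name, engine_cls, env_var in candidates:
        if env_var and not os.environ.get(env_var):
            continue  # graceful credential handling: skip engines with missing API keys
        try:
            engines[name] = engine_cls()
        except Exception:
            continue  # engine not usable in this environment
    return engines

engines = init_available_engines()
# Track engine names so voices can be retrieved for the right engine later,
# and auto-select the first available one as the default.
default_engine_name = next(iter(engines), None)
```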

Technical Details

Engine Compatibility

WebSocket-compatible: OpenAI, Kokoro, Azure, ElevenLabs
Not compatible: System engine (pyttsx3) - displays a clear error message

Dependencies

  • websockets - WebSocket client/server
  • pyaudio - Audio playback in demo client

Files Changed

  • async_server.py - WebSocket endpoint, engine tracking, UI enhancements
  • static/tts.js - WebSocket client logic, mode switching, auto-send
  • websocket_client.py - Python demo client (new)
  • README.md - Updated documentation

Breaking Changes

None - all changes are additive and backward compatible.

Testing

Tested with OpenAI and Kokoro engines. WebSocket mode successfully handles:

  • Multiple concurrent connections
  • Rapid text input with auto-send
  • Engine switching mid-session
  • Connection interruption and recovery

Robinbinu marked this pull request as ready for review on January 11, 2026, 15:37
@Robinbinu
Contributor Author

Robinbinu commented Jan 11, 2026

Hi @KoljaB
Just added WebSocket support to the FastAPI server - I saw in the README that this was a pending feature for handling text chunk by chunk for LLM integration. It's working now!

What's new:

  • /ws endpoint for real-time streaming text input
  • Multiple concurrent users supported
  • Web UI toggle between HTTP/WebSocket modes
  • Auto-send after 1 sec typing pause
  • Demo client included (websocket_client.py)

Perfect for LLM integrations - each user gets their own queue so multiple conversations can run simultaneously. Works with OpenAI, Kokoro, Azure, and ElevenLabs engines.
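
To illustrate the LLM use case, here's a hedged sketch of feeding a streamed LLM response into the endpoint chunk by chunk. The /ws message format and the sentence-boundary heuristic are assumptions, the LLM stream is a stand-in, and reading the audio back is omitted (see the demo client sketch above):

```python
import asyncio
import json

import websockets

async def speak_llm_stream(text_chunks):
    """Forward text chunks (e.g. from a streaming LLM response) to the TTS WebSocket."""
    async with websockets.connect("ws://localhost:8000/ws") as ws:
        buffer = ""
        async for chunk in text_chunks:
            buffer += chunk
            # Send whenever a sentence looks complete; the server queues requests
            # per connection, so playback order is preserved.
            if buffer.rstrip().endswith((".", "!", "?")):
                await ws.send(json.dumps({"text": buffer.strip()}))
                buffer = ""
        if buffer.strip():
            await ws.send(json.dumps({"text": buffer.strip()}))

async def fake_llm_stream():
    # Stand-in for a real streaming LLM response.
    for token in ["Hello! ", "This is ", "a streamed ", "answer."]:
        yield token
        await asyncio.sleep(0.05)

asyncio.run(speak_llm_stream(fake_llm_stream()))
```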

Everything's backward compatible.

@Robinbinu
Contributor Author

Hi @KoljaB ,
Would love your feedback when you get a chance, and feel free to merge whenever you have time!

@KoljaB
Owner

KoljaB commented Jan 12, 2026

Thank you so much. This looks like a great PR. I'm in Spain for the next two months, so can't really look into it for a while.

@Robinbinu
Contributor Author

Hi @KoljaB, please take your time. While testing, I found a few bugs related to handling the different audio formats from different providers. They are fixed in the latest commit #57bbfcb.

