This repo contains an OpenAI-compatible server for Coqui-TTS, adapted from
TTS/server/server.py.
Watch the video showcase and tutorial on YouTube
This server allows you to use XTTS2 local TTS models as a drop-in replacement for OpenAI TTS models.
The primary use case is integration with WingmanAI (wingman-ai.com), offering:
- Local voice cloning
- Additional TTS options
- No reliance on paid services like ElevenLabs
- `--lowvram` mode: moves the TTS model to CPU when idle (saves ~1.5 GB of VRAM with XTTS2)
- Ensures the correct language segmenter is used for splitting long text
- (Planned) Support for pre-made XTTS2 latents in generation
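Once any of the servers below is running on port 5002, clients can request speech much as they would from OpenAI. The sketch below is only an illustration and assumes the server mirrors OpenAI's `/v1/audio/speech` route; `XTTS2` is just the placeholder model name used later in the WingmanAI configuration, and `Ana Florence` is one of the built-in voices listed further down.

```python
# Minimal sketch of an OpenAI-style TTS request against the local server.
# Assumes the /v1/audio/speech route and a binary audio response; adjust
# if your server version expects different fields.
import requests

resp = requests.post(
    "http://localhost:5002/v1/audio/speech",
    json={
        "model": "XTTS2",                  # placeholder model name
        "input": "Hello from your local XTTS2 server.",
        "voice": "Ana Florence",           # built-in voice, or path to a cloning .wav
    },
)
resp.raise_for_status()
with open("hello.wav", "wb") as f:
    f.write(resp.content)
```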
You have three installation options:
- ✅ Premade `.exe` for Windows (Experimental)
- 🛠️ Use the custom server from this repo with Python
- ⚙️ Use the original `idiap/coqui-ai-TTS` server with Python
Option 1: Premade `.exe` (Windows, Experimental)

Pros:
- No Python/coding knowledge needed
- Mostly pre-packaged
- Quickest setup
Cons:
- Antivirus may flag the `.exe`
- Minimal testing
- No auto-updates
- Windows only
- Trust required for the download
Installation Steps:
- Download the ZIP (~5 GB) via the mega.nz download or the mediafire download link
- Unzip anywhere (avoid OneDrive-controlled folders)
- If warned, click “Keep Anyway”
- Double-click `run_server.bat`
- Allow network access when prompted
- Follow menu to select language and GPU/CPU
You’re now running! 🎉 Proceed to WingmanAI Configuration.
Option 2: Custom server from this repo (Python)

Pros:
- Custom WingmanAI features (e.g., `lowvram` mode)
- Open source
- Cross-platform support
Cons:
- Not automatically synced with base repo
- Requires more steps
- Trust needed (or read the code)
Installation Steps:
- Unzip (avoid OneDrive-controlled folders)
- Install `pyenv-win`
- Open a terminal in the unzipped folder
- Run:

```
pyenv install 3.11.7
pyenv local 3.11.7
python -m venv venv
.\venv\Scripts\activate

# if using CPU:
pip install torch torchaudio
# or, if using an NVIDIA GPU:
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128

pip install -r requirements.txt
```
- Download the XTTS2 model files from huggingface.co/coqui/XTTS-v2 into the `xtts_model` folder
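If you prefer not to download the model files by hand, the sketch below fetches them with `huggingface_hub` (assumed installed via `pip install huggingface_hub`); accepting the Coqui license on the Hugging Face model page may still be required.

```python
# Download the XTTS-v2 model files from Hugging Face into the xtts_model folder.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="coqui/XTTS-v2", local_dir="xtts_model")
```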
To Run the Server:
- Open the project folder
- Double-click `run_server_with_python.bat`
- Follow prompts to choose language and GPU/CPU
You’re now running! 🎉 Proceed to WingmanAI Configuration.
Option 3: Original `idiap/coqui-ai-TTS` server (Python)

Pros:
- Trusted, long-standing repo
- Open source
- Automatic updates
- Works on all OS
Cons:
- No `lowvram` mode (uses ~3-4 GB VRAM while idle on GPU)
- No support for pre-made latents
Installation Steps:
- Create a folder (e.g. `Coqui-TTS-Server`)
- Install `pyenv-win`
- Open a terminal in that folder
- Run:

```
pyenv install 3.11.7
pyenv local 3.11.7
python -m venv venv
.\venv\Scripts\activate
```
- (Optional, for NVIDIA GPU; see the GPU check sketched after these steps):

```
pip install --pre torch torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
```
- Then:

```
pip install coqui-tts[server,languages]
```
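After installing, a quick check (run inside the activated venv) confirms whether the GPU build of torch is active; this is a generic PyTorch check, not something specific to this server.

```python
# Verify that PyTorch can see the NVIDIA GPU.
# If this prints False, use the CPU install and skip --use_cuda.
import torch

print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```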
To Run the Server:
```
.\venv\Scripts\activate
tts-server --model_name tts_models/multilingual/multi-dataset/xtts_v2
```

Optional flags:
- Add `--use_cuda` to run on GPU
- Add `--language_idx de` (or another language code)

Example:

```
tts-server --model_name tts_models/multilingual/multi-dataset/xtts_v2 --use_cuda --language_idx de
```
On the first run, the program should automatically download the TTS model (XTTS2). You may have to confirm consent to the license during the download.
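As a quick smoke test, you can request a short clip from the running server. The upstream Coqui demo server has historically exposed a simple GET `/api/tts` endpoint with `text`, `speaker_id`, and `language_id` query parameters; this is an assumption about the current version, so if it does not respond, just use the demo UI at http://localhost:5002 instead.

```python
# Hypothetical smoke test against the Coqui demo server's /api/tts endpoint.
import requests

resp = requests.get(
    "http://localhost:5002/api/tts",
    params={"text": "Server check.", "speaker_id": "Ana Florence", "language_id": "en"},
)
resp.raise_for_status()
with open("check.wav", "wb") as f:
    f.write(resp.content)
```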
You’re now running! 🎉 Proceed to WingmanAI Configuration.
WingmanAI Configuration:
- Start the TTS server using any method above
- Open WingmanAI
- Choose a Wingman and click the 🔧 config wrench
- Under Text to Speech, choose `Local OpenAI Compatible TTS`
- Click the ⚙️ configuration wheel
- Enter:
  - URL: `http://localhost:5002/v1`
  - Model: `XTTS2` (or anything else; it is just a placeholder)
- Adjust Speed to control speech rate
- Choose a Voice from the built-ins, or:
  - Use a single `.wav` file (one voice sample) of the speaker whose voice you want to clone (recommended: 5-10 seconds, mono, 22050 Hz)
  - Or use a folder of `.wav` files (multiple samples) of that speaker (recommended: 3-6 samples, each 5-10 seconds, mono, 22050 Hz)
- For the Voice field, enter either the name of a built-in voice (see the list below) or the path to the `.wav` file or speaker folder you made above (use `/`, not `\`). We recommend placing these files in the `cloning_wavs` folder if you are using this repo or the `.exe`, or a similar folder if you are using the coqui-tts repo. A sketch for converting samples to the recommended format follows the voice table.
| Built-in voices | | | |
|---|---|---|---|
| Claribel Dervla | Dervla Studious | Gracie Wise | Tammie Ema |
| Alison Dietlinde | Ana Florence | Annmarie Nele | Asya Anara |
| Brenda Stern | Gitta Nikolina | Henriette Usha | Sofia Hellen |
| Tammy Grit | Tanja Adelina | Vjollca Johnnie | Andrew Chipper |
| Badr Odhiambo | Dionisio Schuyler | Royston Min | Viktor Eka |
| Abrahan Mackdde Michal | Baldur Sanjin | Craig Gutsy | Damien Black |
| Gilberto Mathias | Ilkin Urbano | Kazuhiko Atallah | Ludvig Milivoj |
| Suad Qasim | Torcull Diarmuid | Viktor Menelaos | Zacharie Aimilios |
| Nova Hogarth | Maja Ruoho | Uta Obando | Lidiya Szekeres |
| Chandra MacFarland | Szofi Granger | Camilla Holmström | Lilya Stainthorpe |
| Zofija Kendrick | Narelle Moon | Barbora MacLean | Alexandra Hisakawa |
| Alma María | Rosemary Okafor | Ige Behringer | Filip Traverse |
| Damjan Chapman | Wulf Carlevaro | Aaron Dreschner | Kumar Dahl |
| Eugenio Mataracı | Ferran Simen | Xavier Hayasaka | Luis Moray |
| Marcos Rudaski |
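To get cloning samples into the recommended format (mono, 22050 Hz), something like the sketch below works. It uses `torchaudio`, which the Python setups above already install; the file names are only examples.

```python
# Convert a voice sample to mono, 22050 Hz for voice cloning.
import torchaudio
import torchaudio.functional as F

waveform, sample_rate = torchaudio.load("raw_sample.wav")      # (channels, samples)
waveform = waveform.mean(dim=0, keepdim=True)                  # downmix to mono
waveform = F.resample(waveform, orig_freq=sample_rate, new_freq=22050)
torchaudio.save("cloning_wavs/my_voice.wav", waveform, 22050)
```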
✅ With the server running, open http://localhost:5002 in your browser to try out all the voices in a demo UI.
- Save your Wingman
Your Wingman now speaks with XTTS2! 🗣️✨
