# TTS Proxy

A multi-service Text-to-Speech proxy with intelligent voice routing, supporting multiple TTS engines and providing a unified REST API for speech synthesis.
## Features
- Multi-Service Architecture: Intelligent routing between different TTS engines
- Smart Voice Selection: Automatic service selection based on voice preferences
- High-Quality German Voices: Thorsten (male) and CSS10 (female) voices
- Docker Support: Complete containerized setup optimized for CPU
- RESTful API: Clean, simple API compatible with various frontends
- Authentication: API key-based security
- Health Monitoring: Built-in health check endpoints
- CPU Optimized: Efficient processing on standard hardware
## Prerequisites
- Docker and Docker Compose
- Multi-core CPU recommended for optimal performance
- Node.js 18+ (for local development)
## Quick Start

1. Clone the repository:

   ```bash
   git clone https://github.com/loonylabs-dev/tts-proxy.git
   cd tts-proxy
   ```

2. Set up environment:

   ```bash
   cp .env.example .env
   # Edit .env with your API key
   ```

3. Start the services:

   ```bash
   docker compose up -d
   ```

4. Test the service:

   ```bash
   curl -X POST "http://localhost:3000/api/tts" \
     -H "x-api-key: your_api_key_here" \
     -H "Content-Type: application/json" \
     -d '{"text": "Hello World", "voice": "thorsten_male"}' \
     --output test.wav
   ```
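The same request can be made programmatically. The following is a minimal sketch for Node.js 18+ (which ships a global `fetch`); the `buildTtsRequest` and `synthesize` helper names and the `TTS_PROXY_URL` environment variable are illustrative, not part of the proxy itself:

```typescript
// Minimal Node 18+ client sketch for the proxy's POST /api/tts endpoint.
// Helper names and TTS_PROXY_URL are assumptions; adjust for your setup.
import { writeFile } from "node:fs/promises";

const BASE_URL = process.env.TTS_PROXY_URL ?? "http://localhost:3000";

// Build the URL and fetch options for a synthesis request.
export function buildTtsRequest(text: string, voice: string) {
  return {
    url: `${BASE_URL}/api/tts`,
    init: {
      method: "POST",
      headers: {
        "x-api-key": process.env.API_KEY ?? "your_api_key_here",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ text, voice }),
    },
  };
}

// Perform the request and write the WAV response to disk.
export async function synthesize(text: string, voice: string, outFile: string) {
  const { url, init } = buildTtsRequest(text, voice);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`TTS request failed: ${res.status}`);
  await writeFile(outFile, Buffer.from(await res.arrayBuffer()));
}
```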
## Local Development
1. Install dependencies:

   ```bash
   cd proxy
   npm install
   ```

2. Set up environment:

   ```bash
   cp .env.example .env
   # Edit .env with your configuration
   ```

3. Start TTS services (Docker):

   ```bash
   docker compose up -d thorsten css10
   ```

4. Start the proxy:

   ```bash
   npm run dev                   # Development mode
   # or
   npm run build && npm start    # Production mode
   ```
## API Reference

The proxy provides a RESTful API for text-to-speech conversion with intelligent voice routing.

### POST /api/tts

Generate speech from text with automatic service selection.
Request Parameters:
{
"text": "Text to be spoken",
"voice": "voice_id", // See voice options below
"language": "de|en", // Optional, auto-detected
"speed": 1.0, // Optional, default 1.0
"pitch": 1.0 // Optional, default 1.0
}Voice Selection Options:
- **Structured Voice IDs (Recommended):**
  - `"thorsten_male"` - Thorsten German male voice
  - `"css10_female"` - CSS10 German female voice
- **Gender Keywords:**
  - `"male"` / `"männlich"` - Routes to Thorsten
  - `"female"` / `"weiblich"` - Routes to CSS10
- **Service Names:**
  - `"thorsten"` - Direct routing to the Thorsten service
  - `"css10"` - Direct routing to the CSS10 service
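The routing rules above can be sketched as a small lookup. This is a simplified model of the behavior, not the proxy's actual code in `proxy/src/index.ts`; the function name is illustrative:

```typescript
// Simplified sketch of the voice-to-service routing described above.
// The real logic lives in proxy/src/index.ts; names here are illustrative.
type Service = "thorsten" | "css10";

export function resolveService(voice: string): Service {
  const v = voice.toLowerCase();
  // Structured voice IDs ("thorsten_male", "css10_female") and service names
  if (v.startsWith("thorsten")) return "thorsten";
  if (v.startsWith("css10")) return "css10";
  // Gender keywords, English and German
  if (v === "male" || v === "männlich") return "thorsten";
  if (v === "female" || v === "weiblich") return "css10";
  throw new Error(`Unknown voice: ${voice}`);
}
```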
**Examples:**

```bash
# German male voice
curl -X POST "http://localhost:3000/api/tts" \
  -H "x-api-key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{"text": "Guten Tag, wie geht es Ihnen?", "voice": "thorsten_male"}' \
  --output german_male.wav

# German female voice
curl -X POST "http://localhost:3000/api/tts" \
  -H "x-api-key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hallo, ich bin eine weibliche Stimme", "voice": "css10_female"}' \
  --output german_female.wav

# Auto-detection based on gender
curl -X POST "http://localhost:3000/api/tts" \
  -H "x-api-key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello World", "voice": "male"}' \
  --output english_male.wav
```

**Response:** WAV audio file
### GET /api/tts/voices

List available voices and their capabilities.

**Response:**

```json
{
  "voices": [
    {
      "id": "thorsten_male",
      "name": "Thorsten (Male)",
      "description": "Warm German male voice, very natural",
      "gender": "male",
      "quality": "high",
      "languages": ["de", "en"],
      "service": "thorsten",
      "model": "tts_models/de/thorsten/vits"
    },
    {
      "id": "css10_female",
      "name": "CSS10 (Female)",
      "description": "Clear German female voice with good pronunciation",
      "gender": "female",
      "quality": "high",
      "languages": ["de"],
      "service": "css10",
      "model": "tts_models/de/css10/vits-neon"
    }
  ]
}
```

### Other Endpoints

- `GET /health` - Health check for all services
- `GET /debug/coqui` - Debug endpoint for service testing
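A client can use this response to pick a voice at runtime. The sketch below is a hypothetical helper, not part of the proxy; the `Voice` interface mirrors only the documented fields that the selection needs:

```typescript
// Hypothetical client-side helper over the GET /api/tts/voices response.
// The Voice shape mirrors the documented JSON fields used for selection.
interface Voice {
  id: string;
  gender: "male" | "female";
  languages: string[];
}

// Pick the first voice matching the requested gender and language.
export function pickVoice(
  voices: Voice[],
  gender: "male" | "female",
  lang = "de"
): string | undefined {
  return voices.find((v) => v.gender === gender && v.languages.includes(lang))?.id;
}
```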
## Environment Variables
| Variable | Default | Description |
|---|---|---|
| `API_KEY` | *Required* | Authentication key for API access |
| `TTS_TYPE` | `coqui` | TTS engine type |
| `TTS_THORSTEN_URL` | `http://thorsten:5002` | Thorsten service URL |
| `TTS_CSS10_URL` | `http://css10:5003` | CSS10 service URL |
| `PORT` | `3000` | Proxy server port (local dev only) |
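Configuration loading with these defaults might look like the sketch below. This is an illustration of the table, not the proxy's actual implementation; the `Config` shape and `loadConfig` name are assumptions:

```typescript
// Sketch of config resolution using the defaults from the table above.
// Names are illustrative; the proxy's actual code may differ.
export interface Config {
  apiKey: string;
  ttsType: string;
  thorstenUrl: string;
  css10Url: string;
  port: number;
}

export function loadConfig(env: Record<string, string | undefined>): Config {
  const apiKey = env.API_KEY;
  if (!apiKey) throw new Error("API_KEY is required"); // no default: auth is mandatory
  return {
    apiKey,
    ttsType: env.TTS_TYPE ?? "coqui",
    thorstenUrl: env.TTS_THORSTEN_URL ?? "http://thorsten:5002",
    css10Url: env.TTS_CSS10_URL ?? "http://css10:5003",
    port: Number(env.PORT ?? 3000),
  };
}
```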
## Docker Configuration
The Docker setup includes:
- Thorsten container: German male voice (Coqui TTS)
- CSS10 container: German female voice (Coqui TTS)
- Proxy container: API proxy with intelligent routing
- Optional Cloudflare tunnel: For external access
## CPU Configuration
The setup is optimized for CPU processing:
- 6GB memory limit per TTS service
- Multi-core CPU utilization
- No GPU dependencies required
- Runs on standard Docker setups
**Note:** For GPU acceleration, see `docker-compose.gpu-backup.yml` for a reference configuration.
## Project Structure

```
tts-proxy/
├── proxy/                  # TypeScript proxy server
│   ├── src/index.ts        # Main proxy logic
│   ├── dist/               # Compiled JavaScript
│   └── package.json        # Dependencies
├── f5-tts-service/         # Thorsten TTS service
│   └── Dockerfile          # Coqui TTS container
├── css10/                  # CSS10 TTS service
│   └── Dockerfile          # Coqui TTS container
├── docker-compose.yml      # Service orchestration
├── .env.example            # Environment template
└── README.md               # This file
```
## Security

- Keep your API key secure and never commit it to version control
- Use the provided `.env.example` as a template for configuration
- Cloudflare tunnel credentials are excluded from git tracking
- The proxy requires valid API keys for all requests (except health checks)
- Internal communication uses Docker networks for security
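The API-key rule can be expressed as a small, framework-agnostic check. This is a sketch of the behavior described above, not the proxy's actual middleware; the function and parameter names are illustrative:

```typescript
// Sketch of the API-key rule above: every request must carry a valid
// x-api-key header, except health checks. Names are illustrative.
export function isAuthorized(
  path: string,
  headers: Record<string, string | undefined>,
  expectedKey: string
): boolean {
  if (path === "/health") return true; // health checks skip authentication
  return headers["x-api-key"] === expectedKey;
}
```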
## Troubleshooting

**401 Unauthorized:**
- Check `API_KEY` in the `.env` file
- Ensure the `x-api-key` header is correct in requests
- The health endpoint (`/health`) doesn't require authentication

**TTS Service Unreachable:**
- Check Docker containers: `docker compose ps`
- Check service logs: `docker compose logs thorsten`
- Test the service directly: `curl http://localhost:5002/`

**Model Not Found:**
- Check available voices: `GET /api/tts/voices`
- Check TTS service logs: `docker compose logs css10`
- Ensure containers are fully started (the first startup takes longer)

**Audio Quality Issues:**
- Adjust the speed/pitch parameters
- Try different voice options
- Check that the input text language matches the voice's capabilities

**High CPU Usage:**
- TTS processing is CPU-intensive by design
- Monitor CPU usage: `docker stats`
- Consider a CPU with more cores for better performance

**Slow Response Times:**
- The first request per container is slower (model loading)
- Consider keeping containers warm with health checks
- Monitor memory usage: `docker stats`

**Container Won't Start:**
- Check logs: `docker compose logs`
- Verify the `.env` file exists and contains `API_KEY`
- Ensure sufficient disk space for Docker images

**Port Conflicts:**
- The setup uses internal Docker networking by default
- Modify `docker-compose.yml` if external access is needed
- Check that no other services use ports 5002 and 5003
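The keep-warm tip above can be implemented with a periodic health ping. This is a hedged sketch assuming the proxy's `/health` endpoint; the helper names and the 60-second interval are assumptions:

```typescript
// Sketch of a keep-warm helper: ping /health on an interval so the TTS
// containers stay loaded. URL, interval, and names are assumptions.
export async function pingHealth(
  url: string,
  fetchFn: (url: string) => Promise<{ ok: boolean }> = fetch
): Promise<boolean> {
  try {
    return (await fetchFn(url)).ok;
  } catch {
    return false; // a failed ping should not crash the keep-warm loop
  }
}

export function startKeepWarm(
  url = "http://localhost:3000/health",
  intervalMs = 60_000
): ReturnType<typeof setInterval> {
  return setInterval(() => void pingHealth(url), intervalMs);
}
```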
## License

MIT License - see the LICENSE file for details.
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request

For questions or support, please open an issue on GitHub.
**Commercial Use:** All TTS models used are commercially licensed (CC0/Apache 2.0). See CLAUDE.md for detailed licensing information.