# Mac Speech Services

Speech-to-Text & Text-to-Speech servers optimized for Apple Silicon

Features • Installation • Usage • API • License

Lightweight FastAPI servers for on-device speech processing using MLX-optimized models. Runs entirely on your Mac with no cloud dependencies.
## Features

### 🎤 Speech-to-Text (STT)

- Parakeet-MLX model (110M parameters)
- OpenAI-compatible `/v1/audio/transcriptions` endpoint
- Real-time transcription on Apple Silicon
- ~115MB RAM usage
### 🔊 Text-to-Speech (TTS)
- Kokoro-MLX model (82M parameters)
- 43+ high-quality voices
- Simple HTTP API
- ~400MB RAM usage
### ⚡ Apple Silicon Optimized
- Uses Apple's MLX framework for GPU acceleration
- Unified memory architecture support
- Runs locally - no internet required after model download
## Installation

```bash
# Clone the repository
git clone https://github.com/[your-username]/mac-speech-services.git
cd mac-speech-services

# Create virtual environment
python3.12 -m venv venv
source venv/bin/activate

# Install dependencies for both services
pip install -r stt-service/requirements.txt
pip install -r tts-service/requirements.txt
```

## Usage

```bash
# Start both STT and TTS
./start-all.sh
```

Or run each service in its own terminal:

```bash
# Terminal 1 - STT on port 8001
./start-stt.sh

# Terminal 2 - TTS on port 8002
./start-tts.sh
```

```bash
# Check STT health
curl http://localhost:8001/health

# Check TTS health
curl http://localhost:8002/health
```
```bash
# Transcribe audio (STT)
curl -X POST http://localhost:8001/v1/audio/transcriptions \
  -F file=@audio.wav

# Generate speech (TTS)
curl -X POST http://localhost:8002/tts \
  -F text="Hello world" \
  -F voice=af_bella \
  --output speech.wav
```

## API

### STT Service (port 8001)

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check |
| `/v1/audio/transcriptions` | POST | Transcribe audio file |

Transcribe Request:

```bash
curl -X POST http://localhost:8001/v1/audio/transcriptions \
  -F file=@audio.wav \
  -F response_format=json
```

Response:

```json
{
  "text": "transcribed text here"
}
```

### TTS Service (port 8002)

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check |
| `/voices` | GET | List available voices |
| `/tts` | POST | Generate speech |
TTS Request:

```bash
curl -X POST http://localhost:8002/tts \
  -F text="Hello world" \
  -F voice=af_bella \
  -F speed=1.0 \
  --output speech.wav
```

Available Voices:

- `af_bella`, `af_heart`, `af_nicole`, `af_sky` (American Female)
- `am_adam`, `am_michael` (American Male)
- `bf_emma`, `bf_isabella` (British Female)
- `bm_george`, `bm_lewis` (British Male)
- And 30+ more...
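The transcription endpoint's JSON response can be consumed directly from the shell with no extra tooling. A minimal sketch, using a hard-coded sample payload in the same shape as the response shown above (in practice you would pipe `curl`'s output into the `python3` one-liner instead):

```shell
# Sample transcription response; replace with `curl -s ... | python3 ...`
# once the STT server is running.
response='{"text": "transcribed text here"}'

# Extract the "text" field using only the Python standard library
text=$(printf '%s' "$response" | python3 -c 'import sys, json; print(json.load(sys.stdin)["text"])')
echo "$text"
```

Using `json.load` rather than grep/sed keeps the extraction correct even when the transcribed text contains quotes or escapes.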
## Requirements

- macOS with Apple Silicon (M1/M2/M3)
- Python 3.12+
- ~520MB RAM for both services
- Models download on first use (~500MB total)
## Network Access

The services bind to `0.0.0.0` and are accessible from:

- Localhost: `http://localhost:8001` / `http://localhost:8002`
- Network: `http://<your-mac-ip>:8001` / `http://<your-mac-ip>:8002`
- VMs/Docker: `http://host.docker.internal:8001`
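When scripting against the services from another machine, it helps to keep the host configurable rather than hard-coding `localhost`. A small sketch; the `STT_HOST`/`TTS_HOST` variables are an illustrative convention, not something the services themselves define:

```shell
# Default to localhost; override STT_HOST/TTS_HOST to reach a remote Mac,
# e.g. STT_HOST=192.168.1.20 ./my-script.sh
STT_HOST="${STT_HOST:-localhost}"
TTS_HOST="${TTS_HOST:-localhost}"
STT_URL="http://${STT_HOST}:8001"
TTS_URL="http://${TTS_HOST}:8002"

echo "$STT_URL"
echo "$TTS_URL"
```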
## Project Structure

```
mac-speech-services/
├── stt-service/
│   ├── stt_server.py       # Parakeet-MLX STT server
│   └── requirements.txt    # STT dependencies
├── tts-service/
│   ├── kokoro_server.py    # Kokoro-MLX TTS server
│   └── requirements.txt    # TTS dependencies
├── start-stt.sh            # Start STT only
├── start-tts.sh            # Start TTS only
├── start-all.sh            # Start both services
├── LICENSE
└── README.md
```
## Models

Models are downloaded automatically on first use from Hugging Face:

- STT: `mlx-community/parakeet-tdt_ctc-110m` (~115MB)
- TTS: `mlx-community/Kokoro-82M-bf16` (~400MB)

Cached at `~/.cache/huggingface/` and `~/.cache/kokoro_mlx/`.
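To check whether the STT model is already cached before going offline, you can look for its directory in the Hugging Face cache. The `hub/models--<org>--<name>` layout is the standard `huggingface_hub` cache convention; treat the exact path as an assumption, since it can vary across library versions:

```shell
# huggingface_hub caches models under hub/models--<org>--<name>
MODEL_DIR="$HOME/.cache/huggingface/hub/models--mlx-community--parakeet-tdt_ctc-110m"

if [ -d "$MODEL_DIR" ]; then
  echo "STT model already cached"
else
  echo "STT model will be downloaded on first use"
fi
```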
## Troubleshooting

```bash
# Check if ports are in use
lsof -ti:8001 8002

# Kill existing processes
pkill -9 -f uvicorn
```

```bash
# Verify MLX installation
python -c "import mlx; print(mlx.__version__)"

# Check Hugging Face cache
ls ~/.cache/huggingface/
```

Memory usage:

- STT: ~115MB
- TTS: ~400MB
- Total: ~520MB (fits comfortably on any Apple Silicon Mac)
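As an alternative to `lsof` for checking whether the service ports are in use, a small helper can probe a TCP port directly. A sketch using `python3`'s standard `socket` module (no root or extra tools needed):

```shell
# Succeeds (exit 0) when something is listening on localhost:<port>
port_open() {
  python3 - "$1" <<'EOF'
import socket, sys

s = socket.socket()
s.settimeout(0.5)
rc = s.connect_ex(("127.0.0.1", int(sys.argv[1])))  # 0 means connected
s.close()
sys.exit(0 if rc == 0 else 1)
EOF
}

port_open 8001 && echo "STT port 8001: listening" || echo "STT port 8001: not listening"
```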
## Dependencies

- `mlx` - Apple's ML framework
- `parakeet-mlx` - STT models
- `kokoro-mlx` - TTS models
## License

MIT License - see the LICENSE file.

## Contributing

Contributions welcome! Please read CONTRIBUTING.md first.
## Acknowledgments

- Models by mlx-community on Hugging Face
- Parakeet by NVIDIA, adapted for MLX by riedemannai
- Kokoro by hexgrad, adapted for MLX by nicholaslor
## Memory Management

To prevent memory leaks (Kokoro-MLX v0.1.0 can accumulate memory over time), the TTS service can unload and reload its model on demand:

```bash
# Unload model to free RAM (~350-400MB freed)
curl -X POST http://localhost:8002/unload

# Reload when needed
curl -X POST http://localhost:8002/reload
```

For clients that send JSON instead of form-data:

```bash
curl -X POST http://localhost:8002/tts-json \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello","voice":"af_bella","speed":1.0}' \
  --output speech.wav
```

Auto-restart if memory exceeds 4GB:

```bash
./memory_watchdog.sh &
```
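A minimal sketch of what such a watchdog could look like. The actual `memory_watchdog.sh` ships with the repository and may differ; the process name, the 4GB threshold, and the restart command below are assumptions for illustration:

```shell
# Hypothetical watchdog sketch - NOT the repository's memory_watchdog.sh
LIMIT_KB=$((4 * 1024 * 1024))   # 4GB expressed in kilobytes

# Combined resident set size (KB) of all uvicorn workers; prints 0 if none
rss_kb() {
  ps -axo rss=,comm= | awk '/uvicorn/ {sum += $1} END {print sum + 0}'
}

# Succeeds (exit 0) when usage exceeds the limit and a restart is warranted
check_limit() {
  [ "$1" -gt "$LIMIT_KB" ]
}

# Main loop, commented out so this sketch is safe to source:
# while sleep 60; do
#   if check_limit "$(rss_kb)"; then
#     pkill -f uvicorn && ./start-all.sh
#   fi
# done
```

Polling `ps` rather than instrumenting the servers keeps the watchdog independent of the Python processes it restarts.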