This repo has been retired and is no longer updated or maintained. I recommend F5-TTS as a suitable alternative. If you wish to use the code below, you can still fork the repo or download the files locally, but the code should not be used in production settings until it has been sanity-checked for issues.
This is a Python command-line front end for the MARS5 TTS repo that lets you clone a voice.
Try the online demo here for a quick start, or follow the instructions below to run this project locally/offline.
- Easy-to-use front end
- Enable PyTorch CUDA (supported devices only)
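As a rough illustration of the CUDA option, the sketch below picks a device using PyTorch's standard `torch.cuda.is_available()` check and falls back to the CPU otherwise. The helper name `pick_device` is hypothetical and is not taken from `tts.py`; how the script actually selects its device may differ.

```python
def pick_device(prefer_cuda=True):
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu".

    Hypothetical helper for illustration; not part of tts.py.
    """
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    if prefer_cuda and torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print(f"Running on: {device}")
```

On a machine without a supported GPU this simply reports `cpu`, which matches the "supported devices only" caveat.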
Install Python
Clone the repo or download the .zip file
Open CMD/PowerShell in the directory and run
pip install -r requirements.txt
Download/clone this repo and open the folder in Visual Studio Code
Open the terminal (Terminal -> New Terminal) and run
python tts.py # where tts.py is the file name of the Python script
You'll be taken through the steps to start cloning. Fill in your own values or press ENTER to accept the default.
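The "press ENTER for default" behaviour can be sketched roughly as follows. This is a minimal illustration, not the code in `tts.py`: the function name `prompt_with_default`, the prompt labels, the file name `sample.wav`, and the default values are all assumptions made for the example.

```python
def prompt_with_default(label, default, cast=str, input_fn=input):
    """Ask for a value; an empty answer (just ENTER) keeps the default.

    `cast` converts the typed text (e.g. float); invalid input falls back
    to the default. `input_fn` is injectable so the prompt can be scripted.
    """
    raw = input_fn(f"{label} [{default}]: ").strip()
    if not raw:
        return default
    try:
        return cast(raw)
    except ValueError:
        print(f"Invalid value {raw!r}; using default {default!r}.")
        return default


# Demo with scripted answers instead of a live keyboard:
# the first answer is empty (ENTER), the second is a typed value.
answers = iter(["", "0.7"])
fake_input = lambda prompt: next(answers)

reference = prompt_with_default("Reference audio", "sample.wav", input_fn=fake_input)
temperature = prompt_with_default("Temperature", 0.5, cast=float, input_fn=fake_input)
print(reference, temperature)
```

In interactive use you would leave `input_fn` at its default so the built-in `input()` reads from the keyboard.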
Download/clone this repo and open Command Line/PowerShell from the folder (File -> Open PowerShell Window)
It makes no difference whether you use Command Line or PowerShell; input and output look the same in both.
python tts.py
- Keep the reference voice clip between 6 and 10 seconds long
- Tune the config to get optimal output
- Provide a transcript of the reference audio when possible
- Use the provided sample to get started quickly
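To check the 6 to 10 second guideline before starting a run, a small sketch like the one below may help. It uses only Python's standard `wave` module, so it assumes the reference clip is an uncompressed PCM `.wav` file; the function names and the file name in the usage comment are hypothetical and not part of this repo.

```python
import wave


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_reference_length(path, min_s=6.0, max_s=10.0):
    """Return (within_range, duration) for the suggested 6-10 second window."""
    duration = wav_duration_seconds(path)
    return min_s <= duration <= max_s, duration


# Example usage (hypothetical file name):
# ok, seconds = check_reference_length("reference.wav")
# print(f"{seconds:.1f}s", "OK" if ok else "outside the 6-10 second range")
```

Other formats such as MP3 would need a different reader, since `wave` only handles WAV files.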
- The system won't lock up, but CPU usage stays at 100% throughout the processing stage
- Results will vary. This is not a one-shot model, but it seems far more impressive than ElevenLabs and Speechify, at the cost of processing time