If the architecture is different enough, this could be its own repository. Just thought I'd mention that this open source TTS model is pretty cool: https://github.com/SWivid/F5-TTS. It's already designed to be efficient, so porting it to ggml could improve it even further. Someone has already done an MLX implementation: https://github.com/lucasnewman/f5-tts-mlx.