Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!
Updated Apr 27, 2022 - Jupyter Notebook
CEP is a software platform for users who want to learn or rapidly prototype with standard A.I. components.
《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》
A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.
[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer
A collection of papers in the speech and audio field. Currently covers: ICLR2023-2025, ICML2023-2024, NeurIPS2023-2024, ACMMM2024, AAAI2024, ACL2024, EMNLP2024, NAACL2025, AAAI2025, IJCAI2024
Self-supervised Generative LM-based Voice Conversion
Streamlit frontend for Coqui-tts
A conversational speech model (CSM) that generates natural-sounding, context-aware speech. Supports multi-speaker conversations and maintains context across turns, keeping audio quality consistent throughout the conversation.
Text-to-speech generator that supports multiple accents.
A simple Discord bot that synthesizes speech directly to a voice channel via text commands with support for sound effects.
PyOrator: A Python-based Speech Generator
An autonomous AI agent for real-time information retrieval and speech generation, leveraging LLMs, RAG, and multi-agent collaboration.