SpeechTTModels is a speech-to-text tool that transcribes audio files into text. It uses automatic speech recognition (ASR) models to convert audio in formats such as MP3 and WAV into readable text.
- Audio Transcription: Converts audio files into accurate text.
- Multiple Audio Formats: Supports MP3, WAV, FLAC, and more.
- Fast and Reliable: Uses advanced ASR models for high-quality transcriptions.
- Customizable Output: Allows saving the transcribed text in different formats (e.g., plain text, JSON).
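
The customizable output could be sketched roughly as follows. This is only an illustration: the helper name `save_transcript`, its `fmt` parameter, and the `{"text": ...}` JSON layout are assumptions, not names or a schema taken from this repository.

```python
import json
from pathlib import Path


def save_transcript(text, out_path, fmt="txt"):
    """Write a transcript as plain text or JSON (hypothetical helper)."""
    path = Path(out_path)
    if fmt == "txt":
        path.write_text(text + "\n", encoding="utf-8")
    elif fmt == "json":
        # The {"text": ...} layout is an assumption, not the project's schema.
        payload = json.dumps({"text": text}, ensure_ascii=False, indent=2)
        path.write_text(payload, encoding="utf-8")
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return path
```

For example, `save_transcript("hello world", "out.json", fmt="json")` would write a small JSON file next to the script.
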
- Clone the repository:
git clone https://github.com/Fluentez/SpeechTTModels.git
cd SpeechTTModels
- Install the Vosk library:
pip install vosk
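
Once Vosk is installed, transcribing a file might look like the sketch below. It assumes Vosk's Python API (`Model`, `KaldiRecognizer`) and a 16-bit mono PCM WAV input. The function names `transcribe_wav` and `join_results` and the `model_dir` argument are hypothetical and not taken from this repository. Other formats such as MP3 would likely need converting to WAV first.

```python
import json
import wave


def join_results(raw_results):
    """Combine Vosk JSON result strings into one transcript string."""
    parts = []
    for raw in raw_results:
        text = json.loads(raw).get("text", "")
        if text:
            parts.append(text)
    return " ".join(parts)


def transcribe_wav(wav_path, model_dir):
    """Transcribe a 16-bit mono PCM WAV file (hypothetical helper)."""
    # Imported here so join_results works even without Vosk installed.
    from vosk import Model, KaldiRecognizer

    model = Model(model_dir)  # assumed: path to an unpacked Vosk model
    results = []
    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM WAV")
        rec = KaldiRecognizer(model, wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True when an utterance is complete.
            if rec.AcceptWaveform(data):
                results.append(rec.Result())
        # Flush whatever audio is still buffered in the recognizer.
        results.append(rec.FinalResult())
    return join_results(results)
```

A call such as `transcribe_wav("audio.wav", "model")` would return the transcript as a single string, provided a Vosk model has been downloaded and unpacked into the `model` directory.
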