ArsenalRX/voxmorph

VoxMorph

Real-time AI voice translator that converts your speech into another language using a synthetic voice. Speak English and your teammates hear fluent German, Japanese, or any of 15+ supported languages, all in real time.

Features

  • Real-time voice translation — Speak English, output translated speech instantly
  • 322 AI voices — Pick from Microsoft Edge TTS voices across 74 languages
  • 3 mic modes — Push-to-Talk, Toggle, or Open Mic (auto-detect speech)
  • Live voice switching — Change voice, language, or mic mode on the fly
  • Modern GUI — Dark-themed app with voice preview, key rebinding, and activity log
  • Auto Docker startup — Whisper container starts automatically when you launch the app
  • Subtitle overlay — Optional real-time subtitles for incoming foreign speech
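
The three mic modes differ only in how they decide whether the microphone is live. Here is a minimal sketch of that decision. It assumes a hypothetical `MicGate` helper, key press/release events, and an RMS threshold for Open Mic; none of these names come from the VoxMorph source.

```python
from enum import Enum


class MicMode(Enum):
    PUSH_TO_TALK = "push_to_talk"
    TOGGLE = "toggle"
    OPEN_MIC = "open_mic"


class MicGate:
    """Decides whether captured audio should be forwarded (hypothetical helper)."""

    def __init__(self, mode: MicMode, threshold: float = 0.02):
        self.mode = mode            # can be reassigned at runtime to switch modes
        self.threshold = threshold  # assumed RMS level that counts as speech
        self._key_down = False
        self._toggled_on = False

    def on_key(self, pressed: bool) -> None:
        # Toggle flips only on the press edge, not on release or key repeat.
        if pressed and not self._key_down:
            self._toggled_on = not self._toggled_on
        self._key_down = pressed

    def is_live(self, rms_level: float = 0.0) -> bool:
        if self.mode is MicMode.PUSH_TO_TALK:
            return self._key_down
        if self.mode is MicMode.TOGGLE:
            return self._toggled_on
        # Open Mic: treat anything at or above the threshold as speech.
        return rms_level >= self.threshold
```

A real open-mic detector would normally also hold the gate open for a short time after the level drops, so that pauses between words do not cut an utterance in half.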

How It Works

Your voice → Whisper AI (transcribe) → Translate → Edge TTS (synthesize) → App mic input
  1. Record your English speech
  2. Whisper AI transcribes it to text
  3. Google Translate (or DeepL) translates the text into the target language
  4. Edge TTS generates speech in your chosen AI voice
  5. Output plays to your app via virtual audio cable
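
The steps above can be sketched as a chain of four stages. This is an illustration of the data flow only, not the project's actual code. The `Pipeline` class and its stage callables are hypothetical, and each stage is injected so that the real Whisper, translation, and Edge TTS calls can be swapped for stubs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Hypothetical wiring of the stages; each callable stands in for a real backend."""
    transcribe: Callable[[bytes], str]       # recorded audio -> English text
    translate: Callable[[str, str], str]     # (text, target language) -> translated text
    synthesize: Callable[[str, str], bytes]  # (text, voice name) -> speech audio
    play: Callable[[bytes], None]            # speech audio -> output device(s)

    def run(self, recording: bytes, target_lang: str, voice: str) -> str:
        text = self.transcribe(recording)
        if not text.strip():
            return ""  # nothing recognized, so skip the remaining stages
        translated = self.translate(text, target_lang)
        audio = self.synthesize(translated, voice)
        self.play(audio)
        return translated


if __name__ == "__main__":
    # Stub stages that only show the shape of the data passed along.
    played = []
    demo = Pipeline(
        transcribe=lambda rec: rec.decode(),
        translate=lambda text, lang: f"[{lang}] {text}",
        synthesize=lambda text, voice: f"{voice}:{text}".encode(),
        play=played.append,
    )
    print(demo.run(b"hello team", "de", "de-DE-KatjaNeural"))
```

Keeping the stages as plain callables is one way to let the voice, language, or translation backend change while the app is running.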

Requirements

  • Windows 10/11
  • Python 3.10+
  • Docker Desktop (for Whisper AI)
  • NVIDIA GPU (recommended for Whisper)
  • VoiceMeeter Banana
  • VB-CABLE Virtual Audio Cable

Quick Start

1. Install dependencies

pip install -r requirements.txt

2. Configure audio routing

  1. Set VoiceMeeter Input as your default Windows playback device
  2. In VoiceMeeter Banana, set A1 hardware out to your speakers/headphones
  3. In your target app (Discord, game, etc.):
    • Output → VoiceMeeter Aux Input
    • Input → CABLE Output

3. Set up your .env

cp .env.sample .env

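In Command Prompt, cp is not available, so use copy .env.sample .env instead (cp works in PowerShell and Git Bash).

Run python src/modules/get_audio_device_ids.py from the repository root to find your device IDs, then update .env.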
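
If you want to look up device IDs by name yourself, the sketch below shows one way to do it. It is not the contents of `get_audio_device_ids.py`. The `find_devices` and `list_system_devices` helpers are hypothetical. The only library call used is `sounddevice.query_devices()`, which returns one dict per device with `name`, `max_input_channels`, and `max_output_channels` keys.

```python
def find_devices(devices, name_fragment, want_input):
    """Return (index, name) pairs whose name contains name_fragment.

    `devices` is a sequence of dicts shaped like sounddevice.query_devices() output.
    """
    key = "max_input_channels" if want_input else "max_output_channels"
    fragment = name_fragment.lower()
    return [
        (index, dev["name"])
        for index, dev in enumerate(devices)
        if fragment in dev["name"].lower() and dev[key] > 0
    ]


def list_system_devices():
    """Query the real audio devices (needs: pip install sounddevice)."""
    import sounddevice as sd  # imported lazily so find_devices works without it

    return sd.query_devices()
```

For example, `find_devices(list_system_devices(), "CABLE", want_input=False)` lists playback devices whose name mentions CABLE. The same device can appear more than once, one entry per Windows host API, so check which entry your setup actually needs.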

4. Launch

cd src
python app.py

The app will auto-start Docker and load Whisper. Click Start Translator, hold your push-to-talk key, and speak.

Supported Languages

German, Japanese, French, Spanish, Italian, Portuguese, Russian, Chinese, Korean, Hindi, Arabic, Dutch, Polish, Swedish, Turkish — and any language supported by Edge TTS.
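
To check which voices exist for a language outside this list, you can filter the Edge TTS voice catalogue by locale. This is a rough sketch with hypothetical `voices_for_language` and `fetch_voices` helpers. It assumes the voice entries are dicts with `Locale` and `ShortName` keys, as returned by the `edge-tts` package's `list_voices()` coroutine.

```python
def voices_for_language(voices, language_code):
    """Group voice short names by locale for one language, e.g. "de" -> de-DE, de-AT."""
    prefix = language_code.lower() + "-"
    by_locale = {}
    for voice in voices:
        locale = voice["Locale"]
        if locale.lower().startswith(prefix):
            by_locale.setdefault(locale, []).append(voice["ShortName"])
    return by_locale


async def fetch_voices():
    """Download the live voice list (needs: pip install edge-tts, plus network access)."""
    import edge_tts  # imported lazily so the filter above works without it

    return await edge_tts.list_voices()
```

With the package installed, `voices_for_language(asyncio.run(fetch_voices()), "ja")` would return the Japanese voices grouped by locale.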

Audio Routing Diagram

Your Microphone → VoxMorph app → Whisper → Translate → Edge TTS
                                                          ↓
                                                   [Parallel Output]
                                                   ├→ VoiceMeeter (you hear it)
                                                   └→ VB-CABLE (app hears it)
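
The parallel output step in the diagram can be approximated by writing the same audio buffer to both devices from separate threads. The `play_parallel` function and the sink callables below are hypothetical stand-ins for whatever playback calls the app really uses.

```python
import threading


def play_parallel(audio, sinks):
    """Send the same audio buffer to every sink at once and wait for all to finish.

    `sinks` maps a label to a callable that plays or writes the buffer.
    Returns a list of (label, exception) pairs for sinks that failed.
    """
    errors = []

    def worker(name, write):
        try:
            write(audio)
        except Exception as exc:  # keep one failing device from silencing the other
            errors.append((name, exc))

    threads = [
        threading.Thread(target=worker, args=(name, write))
        for name, write in sinks.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return errors
```

Starting both writes together keeps the monitor copy you hear and the copy sent to the target app roughly in step, which sequential playback would not.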

Tech Stack

  • Whisper AI (faster-whisper) — Speech recognition via Docker
  • Edge TTS — Microsoft neural text-to-speech (free, 322 voices)
  • Google Translate / DeepL — Translation
  • CustomTkinter — Modern GUI
  • PyAudio / SoundDevice — Audio I/O
  • VoiceMeeter + VB-CABLE — Audio routing

Author

Built by ArsenalRX
