Skip to content

kohi9noor/_vox

Repository files navigation

VOX

An open-source, sophisticated multi-model AI audio generation platform

Next.js TypeScript Nx License: MIT


VOX Demo Banner

Watch the Demo


Integrating state-of-the-art voice conversion, SFX generation, and text-to-audio models into a seamless, high-fidelity experience.


Overview

VOX is a modular open-source AI audio platform that brings together state-of-the-art models for:

  • Voice conversion & cloning
  • Multilingual text-to-speech
  • Text-to-audio & sound effects generation

Quick Start

One command sets up everything — environments, model weights, dependencies, and database:

chmod +x init.sh
./init.sh

Tech Stack

Frontend

  • Next.js 15 (App Router)
  • TypeScript
  • Tailwind CSS
  • Zustand
  • Tanstack Query

Backend

  • Node.js 20+
  • Drizzle ORM
  • p-queue

AI Models

  • Seed-VC — Zero-shot voice conversion & cloning
  • Make-An-Audio — Text-to-audio generation
  • XTTS-v2 — High-quality multilingual TTS

Automation

  • Bash orchestration
  • Python-based environment & model manager

Project Structure

├── packages/
│   ├── app/          # Next.js frontend
│   └── server/       # Backend API & database
├── models/
│   ├── seed-vc/      # Voice conversion
│   ├── make-an-audio/# Audio generation
│   └── xtts-v2/      # Text-to-speech
├── data/             # Audio assets & outputs
└── init.sh           # One-command setup

System Requirements

  • OS: macOS (MPS) or Linux (CUDA)
  • Python: 3.10+
  • Node.js: 20+
  • GPU: Recommended (CPU supported with reduced performance)

License

MIT — free to use, modify, and distribute.

About

An open-source, sophisticated multi-model AI audio generation platform

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors