DylanZhao123/VoiceKeep

VoiceKeep

Advanced Voice AI Platform with Proprietary Agent Technology

VoiceKeep is a voice AI platform that combines proprietary voice agent technology with ElevenLabs' voice synthesis capabilities. Create, clone, and interact with AI voices through natural conversation.

Key Features

Proprietary Voice Agent Technology

  • Intelligent Voice Agent: Custom-built AI agent that understands context and responds naturally
  • Real-time Voice Processing: Voice analysis and response generation
  • Contextual Conversations: Maintains conversation flow and remembers context
  • Multi-language Support: Voice processing in multiple languages

ElevenLabs Integration

  • Voice Cloning: Create voice replicas using ElevenLabs' models
  • Speech-to-Text: Transcription with multilingual support
  • Text-to-Speech: Voice generation with cloned voices
  • Voice Library Management: Store and manage multiple cloned voices

Seamless Workflow

  1. Record & Clone: Capture your voice and create a digital clone
  2. Generate Speech: Convert any text to speech using your cloned voice
  3. Voice Agent Conversation: Have intelligent conversations with your AI agent
  4. Export & Share: Download audio files and share your creations

Quick Start

Prerequisites

  • Node.js 18+
  • npm or pnpm
  • ElevenLabs API key
  • Supabase account

Installation

# Clone the repository
git clone <repository-url>
cd VoiceKeep

# Install dependencies
npm install

# Set up environment variables
cp .env.example .env.local

Environment Setup

# .env.local
VITE_SUPABASE_URL=your_supabase_url
VITE_SUPABASE_ANON_KEY=your_supabase_anon_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
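
A missing or blank variable tends to surface later as a confusing network error, so it can help to check the configuration at startup. The following is a minimal sketch, not part of the VoiceKeep source: `requireEnv`, `CLIENT_KEYS`, and `SERVER_KEYS` are hypothetical names, and the split into client and server keys is an assumption based on Vite exposing only `VITE_`-prefixed variables to browser code.

```typescript
// Variable names come from the README; the helper itself is a hypothetical sketch.
const CLIENT_KEYS = ["VITE_SUPABASE_URL", "VITE_SUPABASE_ANON_KEY"] as const;
const SERVER_KEYS = ["ELEVENLABS_API_KEY"] as const;

// Returns the requested variables, or throws an error naming every
// variable that is missing or blank.
function requireEnv<K extends string>(
  source: Record<string, string | undefined>,
  keys: readonly K[],
): Record<K, string> {
  const result = {} as Record<K, string>;
  const missing: string[] = [];
  for (const key of keys) {
    const value = source[key];
    if (value === undefined || value.trim() === "") {
      missing.push(key);
    } else {
      result[key] = value;
    }
  }
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return result;
}
```

In the Vite frontend this could be called as `requireEnv(import.meta.env, CLIENT_KEYS)`, and in a Deno-based function as `requireEnv(Deno.env.toObject(), SERVER_KEYS)`.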

Development

# Start development server
npm run dev

# Build for production
npm run build

Architecture

Frontend (React + TypeScript)

  • Modern UI: Built with React, TypeScript, and Tailwind CSS
  • Component Library: shadcn/ui components for consistent design
  • State Management: React hooks for efficient state handling
  • Audio Processing: Real-time audio recording and playback

Backend (Supabase Functions)

  • Voice Agent Engine: Our proprietary AI processing logic
  • ElevenLabs Integration: Seamless API integration for voice services
  • Session Management: Secure voice agent session handling
  • Audio Processing: Real-time audio analysis and response generation

Voice Agent Technology

┌─────────────────────────────────────────────────────────────┐
│                    VoiceKeep Architecture                   │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────┐    ┌──────────────┐    ┌─────────────┐    │
│  │   React     │◄──►│   Supabase   │◄──►│ ElevenLabs  │    │
│  │  Frontend   │    │   Functions  │    │    API      │    │
│  └─────────────┘    └──────────────┘    └─────────────┘    │
│         │                   │                   │          │
│         ▼                   ▼                   ▼          │
│  ┌─────────────┐    ┌──────────────┐    ┌─────────────┐    │
│  │   Voice     │◄──►│   Our AI     │◄──►│   Voice     │    │
│  │ Recording   │    │   Agent      │    │ Synthesis   │    │
│  └─────────────┘    └──────────────┘    └─────────────┘    │
└─────────────────────────────────────────────────────────────┘
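
The React-to-Supabase arrow in the diagram could be realized as a plain HTTP call. Supabase Edge Functions are served under `/functions/v1/<function-name>` on the project URL and accept the anon key as a bearer token. The sketch below builds such a request without sending it; `buildFunctionRequest` and the payload shape are hypothetical, and only the function names are taken from this README.

```typescript
// Function names as listed in this README.
type VoiceAgentFunction =
  | "voice-agent-init"
  | "voice-agent-process"
  | "voice-agent-respond"
  | "voice-agent-end";

interface FunctionRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) a request to a Supabase Edge Function.
// The payload fields are an assumption; the real functions may expect others.
function buildFunctionRequest(
  supabaseUrl: string,
  anonKey: string,
  name: VoiceAgentFunction,
  payload: Record<string, unknown>,
): FunctionRequest {
  const base = supabaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/functions/v1/${name}`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${anonKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}
```

The result can be passed to `fetch(request.url, request)`. Keeping construction separate from sending makes the request easy to unit-test.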

Core Components

Voice Recording & Cloning

  • High-quality audio recording with real-time feedback
  • Automatic voice cloning using ElevenLabs API
  • Multi-language voice input support
  • Voice quality optimization

Text-to-Speech Generation

  • Convert any text to speech using cloned voices
  • Multiple voice models and settings
  • Audio file export and sharing
  • Real-time voice preview

API Integration

ElevenLabs Services

  • Voice Cloning: Create voice replicas from audio samples
  • Speech-to-Text: Convert speech to text with high accuracy
  • Text-to-Speech: Generate natural-sounding speech
  • Voice Library: Manage and organize cloned voices
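
As an illustration of the text-to-speech service, the sketch below builds a request for the ElevenLabs REST endpoint `POST /v1/text-to-speech/{voice_id}`, which authenticates with an `xi-api-key` header. `buildTtsRequest` is a hypothetical helper, not the code in `src/services/elevenlabs.ts`, and the default model ID is an assumption that should be checked against the current ElevenLabs API reference.

```typescript
interface TtsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) a text-to-speech request for a given voice.
function buildTtsRequest(
  apiKey: string,
  voiceId: string,
  text: string,
  modelId: string = "eleven_multilingual_v2", // assumed default; verify before use
): TtsRequest {
  if (text.trim() === "") {
    throw new Error("text must not be empty");
  }
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ text, model_id: modelId }),
  };
}
```

The response body is audio data, not JSON. Because the request carries the API key, it is safer to send it from server-side code than from the browser.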

Our Proprietary Technology

  • AI Agent Processing: Custom AI logic for intelligent responses
  • Voice Analysis: Advanced voice pattern recognition
  • Context Management: Maintains conversation state
  • Response Generation: Natural language response creation
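
The README does not describe how conversation state is kept. One common approach, shown here purely as an assumption, is a bounded history that drops the oldest turns once a limit is reached. `ConversationContext` and `Turn` are hypothetical names.

```typescript
interface Turn {
  role: "user" | "agent";
  text: string;
}

// Keeps at most `maxTurns` of the most recent conversation turns.
class ConversationContext {
  private turns: Turn[] = [];
  private readonly maxTurns: number;

  constructor(maxTurns: number) {
    if (!Number.isInteger(maxTurns) || maxTurns < 1) {
      throw new RangeError("maxTurns must be a positive integer");
    }
    this.maxTurns = maxTurns;
  }

  // Appends a turn and discards the oldest ones beyond the limit.
  add(turn: Turn): void {
    this.turns.push(turn);
    const overflow = this.turns.length - this.maxTurns;
    if (overflow > 0) {
      this.turns.splice(0, overflow);
    }
  }

  // Returns a copy so callers cannot mutate the internal history.
  history(): Turn[] {
    return [...this.turns];
  }
}
```

A fixed turn limit keeps the amount of context sent with each agent request predictable. A token-based or time-based limit would be an equally reasonable choice.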

User Interface

Modern Design

  • Clean, intuitive interface with dark/light themes
  • Responsive design for all devices
  • Real-time audio visualization
  • Smooth animations and transitions

User Experience

  • One-Click Recording: Simple voice capture process
  • Instant Cloning: Fast voice cloning with progress indicators
  • Natural Conversations: Seamless voice agent interactions
  • Audio Controls: Play, pause, and download audio files

Development

Tech Stack

  • Frontend: React 18, TypeScript, Vite
  • Styling: Tailwind CSS, shadcn/ui
  • Backend: Supabase Functions (Deno)
  • Voice AI: ElevenLabs API
  • Database: Supabase PostgreSQL

Key Files

src/
├── components/
│   ├── VoiceAgentInterface.tsx    # Voice agent UI
│   ├── RecordingControls.tsx      # Audio recording
│   └── ui/                        # UI components
├── services/
│   ├── voiceAgent.ts              # Our AI agent logic
│   └── elevenlabs.ts              # ElevenLabs integration
├── pages/
│   ├── VoiceAgentPage.tsx         # Voice agent page
│   ├── ClonePage.tsx              # Voice cloning
│   └── RecordPage.tsx             # Voice recording
└── hooks/
    └── useAudioRecorder.ts        # Audio recording hook
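
A recording hook such as `useAudioRecorder.ts` typically has to choose an audio format the browser supports, because `MediaRecorder` support differs between browsers. The sketch below is an assumption about how that choice could be made, not the hook's actual code. `pickRecordingMimeType` and the candidate list are hypothetical, and the support check is passed in so the function can run outside a browser.

```typescript
// Candidate formats in order of preference (an illustrative list).
const PREFERRED_MIME_TYPES: readonly string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
  "audio/ogg;codecs=opus",
];

// Returns the first candidate accepted by `isSupported`,
// or undefined when none is supported.
function pickRecordingMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = PREFERRED_MIME_TYPES,
): string | undefined {
  return candidates.find((mimeType) => isSupported(mimeType));
}
```

In the browser, the predicate would be `(m) => MediaRecorder.isTypeSupported(m)`, and the result can be passed as the `mimeType` option of `new MediaRecorder(stream, options)`. When the result is `undefined`, omitting the option lets the browser pick its default format.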

Deployment

Production Build

# Build for production
npm run build

# Serve the production build locally to check it
npm run preview

Environment Variables

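  • VITE_SUPABASE_URL: Supabase project URL
  • VITE_SUPABASE_ANON_KEY: Supabase anonymous (public) key
  • ELEVENLABS_API_KEY: ElevenLabs API key. It has no VITE_ prefix, so Vite does not expose it to browser code; any server-side code that calls ElevenLabs, such as the Supabase functions, needs it set in its own environment.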

Supabase Functions

Deploy the voice agent functions to Supabase:

  • voice-agent-init: Initialize voice agent sessions
  • voice-agent-process: Process voice input
  • voice-agent-respond: Generate voice responses
  • voice-agent-end: Cleanup sessions
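
The four functions suggest a session lifecycle: initialize, exchange input and responses, then clean up. The sketch below models that lifecycle as a small state machine. The ordering rules are an assumption inferred from the function names (the README does not specify them), and `nextState` and `runSteps` are hypothetical helpers.

```typescript
type SessionState = "idle" | "active" | "ended";

type SessionStep =
  | "voice-agent-init"
  | "voice-agent-process"
  | "voice-agent-respond"
  | "voice-agent-end";

// Returns the state after applying one step, or throws when the step
// is not valid in the current state.
function nextState(state: SessionState, step: SessionStep): SessionState {
  if (step === "voice-agent-init") {
    if (state !== "idle") {
      throw new Error(`cannot init a session that is ${state}`);
    }
    return "active";
  }
  // process, respond, and end all require an initialized session.
  if (state !== "active") {
    throw new Error(`${step} requires an active session, got ${state}`);
  }
  return step === "voice-agent-end" ? "ended" : "active";
}

// Applies a sequence of steps starting from the idle state.
function runSteps(steps: readonly SessionStep[]): SessionState {
  let state: SessionState = "idle";
  for (const step of steps) {
    state = nextState(state, step);
  }
  return state;
}
```

A client could use such a check to avoid calling `voice-agent-process` before a session exists or after it has been cleaned up.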

Security & Privacy

  • Secure API Keys: API keys are supplied through environment variables, not hard-coded in source
  • Session Management: Secure voice agent session handling
  • Data Privacy: Voice data processed securely and not stored permanently
  • Authentication: Secure user authentication and authorization

Performance

  • Real-time Processing: Low-latency voice processing
  • Optimized Audio: Efficient audio compression and streaming
  • Caching: Smart caching for improved performance
  • Responsive UI: Smooth animations and interactions

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For support and questions:

  • Create an issue on GitHub
  • Check the documentation
  • Contact the development team

VoiceKeep - Where proprietary AI meets cutting-edge voice technology.

About

VoiceKeep lets you preserve the voice of someone you love from existing voice samples, using ElevenLabs.
