A production-ready, full-stack audio transcription platform that uses Faster Whisper (faster-whisper) for high-quality speech-to-text conversion. Built with modern technologies including Spring Boot, FastAPI, and React TypeScript. Powered by CTranslate2, Faster Whisper is up to 4x faster than OpenAI's Whisper while using less memory.
Whisperrr transforms audio content into accurate, searchable text using state-of-the-art AI technology. Upload a file and get instant transcription results - no database setup, no job queuing, no polling required.
- Instant Transcription: Upload and get results immediately
- High Accuracy: Powered by Faster Whisper AI models (tiny to large-v3)
- Fast Performance: Up to 4x faster than OpenAI Whisper with less memory usage
- Multi-Language: Support for 99+ languages with automatic detection
- Multiple Formats: MP3, WAV, M4A, FLAC, OGG, WMA (up to 50MB)
- Segment-Level Timestamping: View transcription results with precise start and end timestamps for each segment
- Stateless Architecture: No database required - simplified deployment
- Modern UI: Responsive React interface with drag-and-drop upload
- Production Ready: Comprehensive error handling and monitoring
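To illustrate the segment-level timestamping feature, here is a small Python sketch that renders segments as timestamped lines. The field names (`start`, `end`, `text`) mirror Faster Whisper's segment objects; the exact response shape of the Whisperrr API is not shown here, so treat this as illustrative only.

```python
# Illustrative segment data; field names mirror Faster Whisper's
# segment objects (start, end, text), not necessarily the Whisperrr API.
segments = [
    {"start": 0.0, "end": 3.2, "text": "Welcome to Whisperrr."},
    {"start": 3.2, "end": 7.85, "text": "Upload a file to get started."},
]

def format_timestamp(seconds: float) -> str:
    """Render seconds as MM:SS.ss."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes):02d}:{secs:05.2f}"

def render_segments(segments: list[dict]) -> str:
    """Produce one '[start -> end] text' line per segment."""
    return "\n".join(
        f"[{format_timestamp(s['start'])} -> {format_timestamp(s['end'])}] {s['text']}"
        for s in segments
    )

print(render_segments(segments))
```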
```
┌──────────────────┐     ┌──────────────────┐     ┌───────────────┐
│  React Frontend  │────▶│ Spring Boot API  │────▶│    Python     │
│   (Port 3737)    │     │   (Port 7331)    │     │    Service    │
│                  │     │                  │     │  (Port 5001)  │
│ • File Upload    │     │ • Validation     │     │ • Whisper AI  │
│ • Results View   │     │ • Proxy/Relay    │     │ • Processing  │
└──────────────────┘     └──────────────────┘     └───────────────┘
```
- React Frontend: User interface with drag-and-drop file upload
- Spring Boot API: Lightweight proxy for validation and error handling
- Python Service: AI-powered transcription using Faster Whisper models (CTranslate2)
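The Spring Boot layer's role as a validation proxy can be summarized with the rules stated earlier (supported formats, 50MB limit). The following Python sketch restates those rules; the real checks live in the Java backend, so this is a behavioral illustration, not the actual implementation.

```python
# Rules taken from the feature list: MP3, WAV, M4A, FLAC, OGG, WMA, up to 50MB.
ALLOWED_EXTENSIONS = {"mp3", "wav", "m4a", "flac", "ogg", "wma"}
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024

def validate_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Return (ok, reason) for an upload, mirroring the backend's validation rules."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"Unsupported format: '{ext or filename}'"
    if size_bytes > MAX_FILE_SIZE_BYTES:
        return False, "File exceeds the 50MB limit"
    return True, "ok"
```

For example, `validate_upload("talk.mp3", 1_000_000)` passes, while a 60MB WAV or a `.txt` file is rejected.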
- Docker - For running the application
- Docker Compose v2 - For orchestrating multiple services
# Clone the repository
git clone <repository-url>
cd Whisperrr
# Start all services with Docker Compose
docker compose up -d
# Access the application
# Frontend: http://localhost:3737
# Backend API: http://localhost:7331
# Python Service: http://localhost:5001

# View logs
docker compose logs -f

# Stop all services
docker compose down

If you prefer to run the services locally without Docker, follow these steps:
Before starting, ensure you have the following installed:
- Java JDK 21 - Required for Spring Boot backend
- Maven 3.6+ - For building Java backend (or use included `mvnw`)
- Node.js 18+ and npm - For React frontend
- Python 3.12 - For FastAPI transcription service (specific version required)
- FFmpeg - For audio processing (required by Python service)
Need help checking versions or installing prerequisites? See the Prerequisites Guide for detailed instructions.
If all services run on localhost with default ports, no environment variable configuration is needed. Simply start each service:
Start services in separate terminals:
Terminal 1 - Python Service:

cd python-service
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
python3 -m uvicorn app.main:app --host 0.0.0.0 --port 5001

Terminal 2 - Backend Service:

cd backend
./mvnw spring-boot:run

Terminal 3 - Frontend Service:

cd frontend
npm install
npm start
Service URLs (localhost):
- Frontend: http://localhost:3737
- Backend API: http://localhost:7331
- Python Service: http://localhost:5001
If services run on different hosts or custom ports, use the setup script to configure environment variables:
# Run the interactive setup script
./setup-env.sh
# Each service automatically reads its .env file at startup
# No need to source any files - just restart services after running setup-env.sh
# Then start services as described above

For Remote Deployment: Use production mode for the frontend:
cd frontend
npm run build
npx serve -s build -l 3737

Note: The setup script checks prerequisites and configures all necessary environment variables. It supports both:
- Simple mode (default): Single host configuration with HTTP (for local development)
- Remote deployment mode: Remote URL configuration with HTTPS (for production/remote deployment)
For detailed setup instructions including remote deployment mode, see the Quick Start Guide.
- Frontend: http://localhost:3737
- Backend API: http://localhost:7331/api/audio/health
- Python Service: http://localhost:5001/health
- Python API Docs: http://localhost:5001/docs
- Model download fails: Check your internet connection. Models are downloaded from Hugging Face on first run.
- Python version error: Ensure Python 3.12 is installed (specific version required). Check with `python3 --version`. See the Prerequisites Guide for installation help.
- FFmpeg not found: Install FFmpeg:
  - macOS: `brew install ffmpeg`
  - Ubuntu/Debian: `sudo apt-get install ffmpeg`
  - Windows: Download from ffmpeg.org
  - See the Prerequisites Guide for detailed instructions.
- Port 7331 already in use: Change the port in `backend/src/main/resources/application.properties`
- Java version error: Ensure Java JDK 21 is installed: `java -version`. The `mvnw` wrapper requires a Java JDK to run.
- mvnw not working: Make sure Java JDK 21 is installed and in your PATH. See the Prerequisites Guide for installation help.
- Port 3737 already in use: Change the port in the `scripts` section of `frontend/package.json`
- npm install fails: Try clearing the cache: `npm cache clean --force`
- Node/npm version issues: Ensure you have the correct Node.js and npm versions. See the Prerequisites Guide for version requirements and installation.
- Frontend calling localhost instead of the configured URL:
  - Verify that `frontend/.env` exists and contains `REACT_APP_API_URL`
  - IMPORTANT: React reads environment variables only at dev server start time
  - You must restart the dev server after creating or updating `frontend/.env`
  - Stop the dev server (Ctrl+C) and run `npm start` again
  - Check the browser console for API configuration debug messages
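For reference, a minimal `frontend/.env` for the default localhost setup might look like this. The variable name `REACT_APP_API_URL` comes from the troubleshooting notes above, and the value shown is the documented default API URL; the setup script normally generates this file for you.

```shell
# frontend/.env — restart the dev server after editing this file
REACT_APP_API_URL=http://localhost:7331/api
```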
Whisperrr/
├── frontend/           # React TypeScript Frontend
├── backend/            # Spring Boot API Proxy
├── python-service/     # FastAPI Transcription Service
├── docs/               # Documentation
└── docker-compose.yml  # Docker Compose configuration
For localhost development (default ports), no environment variable configuration is needed.
For remote development or custom ports, use the setup script:
./setup-env.sh
# Script automatically creates .env files for each service
# Restart services after running setup-env.sh to apply changes

The setup script automatically configures all required environment variables. It supports:
- Simple mode: Single host configuration with HTTP (default, for local development)
- Remote deployment mode: Remote URL configuration with HTTPS (for production/remote deployment)
For detailed information about environment variables, remote deployment configuration, and advanced setup, see the Quick Start Guide and Configuration Guide.
server.port=7331
whisperrr.service.url=http://localhost:5001
cors.allowed-origins=http://localhost:3737,http://localhost:3738
spring.servlet.multipart.max-file-size=50MB

Default configuration (can be overridden via environment variables):
- Model size: `base` (options: tiny, base, small, medium, large, large-v2, large-v3)
- Max file size: `50MB`
- CORS origins: `http://localhost:7331,http://localhost:3737`
- Log level: `INFO`
Default configuration (can be overridden via environment variables):
- Max file size: `50MB`
- API URL: `http://localhost:7331/api`
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/audio/transcribe` | Upload and transcribe an audio file |
| GET | `/api/audio/health` | Service health check |
| Method | Endpoint | Description |
|---|---|---|
| POST | `/transcribe` | Direct audio transcription |
| GET | `/health` | Service health and model status |
| GET | `/model/info` | Current model information |
Interactive API Documentation: http://localhost:5001/docs
# Upload and transcribe audio file
curl -X POST http://localhost:7331/api/audio/transcribe \
  -F "audioFile=@recording.mp3"

- Start the Application: Run `docker compose up -d`
- Open Browser: Navigate to http://localhost:3737
- Upload Audio: Drag and drop or select an audio file
- Get Results: View transcription results immediately
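The `curl` upload shown earlier can also be reproduced from Python with only the standard library. This sketch builds the multipart/form-data body; the `audioFile` field name is taken from the curl example, and the endpoint URL is the documented backend endpoint.

```python
import mimetypes
import uuid

def build_multipart(field_name: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body and Content-Type header,
    equivalent to curl's -F "audioFile=@recording.mp3"."""
    boundary = uuid.uuid4().hex
    ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {ctype}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"

# POST the body to http://localhost:7331/api/audio/transcribe, e.g. with
# urllib.request.Request(url, data=body, headers={"Content-Type": content_type})
```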
| Model | Size | Speed | Accuracy | Best For |
|---|---|---|---|---|
| tiny | 39 MB | ~32x realtime | Basic | Quick drafts |
| base | 74 MB | ~16x realtime | Good | General use (default) |
| small | 244 MB | ~6x realtime | Better | Balanced quality/speed |
| medium | 769 MB | ~2x realtime | High | Important content |
| large | 1550 MB | ~1x realtime | Highest | Maximum accuracy |
| large-v2 | 1550 MB | ~1x realtime | Highest | Improved large model |
| large-v3 | 1550 MB | ~1x realtime | Highest | Latest large model |
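The speed column translates into a rough planning estimate: at ~Nx realtime, transcribing a clip takes about its duration divided by N. A small sketch based on the approximate figures in the table (actual times depend on hardware):

```python
# Approximate realtime factors from the model table above.
REALTIME_FACTOR = {
    "tiny": 32, "base": 16, "small": 6, "medium": 2,
    "large": 1, "large-v2": 1, "large-v3": 1,
}

def estimated_seconds(model: str, audio_seconds: float) -> float:
    """Rough wall-clock estimate: audio duration / the model's realtime factor."""
    return audio_seconds / REALTIME_FACTOR[model]
```

For a 10-minute recording, `estimated_seconds("base", 600)` suggests roughly 37.5 seconds of processing with the default model.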
Performance Note: Faster Whisper is up to 4x faster than OpenAI Whisper with the same accuracy, using less memory. It uses CTranslate2 for optimized inference on both CPU and GPU.
# Backend tests
cd backend && ./mvnw test
# Frontend tests
cd frontend && npm test
# Python service tests
cd python-service && python -m pytest

# Backend formatting
cd backend && ./mvnw spotless:apply
# Frontend linting
cd frontend && npm run lint
# Python formatting
cd python-service && black app/

# Rebuild and start services
docker compose up -d --build
# Check service logs
docker compose logs -f
# Verify services are running
docker compose ps

- Verify the frontend URL is in `cors.allowed-origins` in the backend configuration
- Check that both services are running on the correct ports
- Check Python service health: `curl http://localhost:5001/health`
- Verify the model is loaded: `curl http://localhost:5001/model/info`
- Check available system resources
- Verify file size is under 50MB
- Check file format is supported (MP3, WAV, M4A, FLAC, OGG, WMA)
- Review backend logs for specific errors
- Prerequisites - Check versions and install required software (Java, Python, Node.js)
- Quick Start Guide - Complete setup guide including local and remote development
- Architecture - Technical architecture guide
- Configuration - Configuration guide
- Codebase Guide - Developer guide
- Documentation Index - Complete documentation index
- LICENSE - MIT License
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- SYSTRAN for the Faster Whisper implementation
- OpenAI for the original Whisper models
- CTranslate2 for the fast inference engine
- Spring Boot Team for the Java framework
- FastAPI Team for the Python web framework
- React Team for the frontend library
Built with ❤️ for simplicity and instant results