This project implements an AI-powered audio transcription service using Cloudflare Workers AI and the Whisper-large-v3-turbo model. The application can transcribe audio files of various lengths by supporting chunk-based processing and leveraging Cloudflare's serverless infrastructure.
- Automatic Speech Recognition (ASR) using OpenAI's Whisper model
- Supports large audio file transcription through intelligent chunking
- Cloudflare Workers deployment for scalable, low-latency transcription
- Configurable transcription parameters:
  - Language selection
  - Translation vs. transcription mode
  - Voice activity detection
  - Custom initial prompts
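
The chunking mentioned in the feature list can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the helper names `splitIntoChunks` and `toBase64` and the 1 MiB chunk size are assumptions made for the example. Note that slicing on raw byte boundaries is naive, since compressed formats such as MP3 may be cut mid-frame; a production version would likely split on frame or silence boundaries.

```typescript
// Hypothetical chunk size (1 MiB); the real project may use a different value.
const CHUNK_SIZE_BYTES = 1024 * 1024;

// Split raw audio bytes into consecutive fixed-size chunks.
// The last chunk may be shorter than chunkSize.
function splitIntoChunks(
  audio: Uint8Array,
  chunkSize: number = CHUNK_SIZE_BYTES
): Uint8Array[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < audio.length; offset += chunkSize) {
    // subarray returns a view over the same buffer, so no bytes are copied.
    chunks.push(audio.subarray(offset, offset + chunkSize));
  }
  return chunks;
}

// Base64-encode one chunk so it can be sent as the `audio` field of a request.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}
```

Each encoded chunk can then be transcribed separately and the partial transcripts joined in order.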
- Cloudflare account
- Node.js (v18+ recommended)
- Wrangler CLI
- Basic JavaScript/TypeScript knowledge
- Clone the repository:

      git clone <your-repo-url>
      cd whisper-transcription-worker

- Install dependencies:

      npm install

- Configure Cloudflare credentials:

      npx wrangler login

Update wrangler.toml with:

    compatibility_date = "2024-09-23"
    compatibility_flags = ["nodejs_compat"]

    [ai]
    binding = "AI"

Start the development server:

    npx wrangler dev --remote

Deploy your Worker:

    npx wrangler deploy

Sample API call configuration:

    // Sample API call configuration
    const transcriptionOptions = {
      audio: base64EncodedAudio,
      task: "transcribe",
      language: "en",
      vad_filter: "false"
    };

Supported audio formats:

- MP3
- WAV
- Other formats supported by Cloudflare Workers AI
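
To show how the sample options might be passed to the model, here is a hedged sketch of a helper that transcribes a list of base64-encoded chunks in order and joins the results. The `AiBinding` interface is a deliberately minimal stand-in for the `AI` binding configured in wrangler.toml, and `transcribeChunks` is a hypothetical name, not a function from this project. The sketch assumes the model is addressed as `@cf/openai/whisper-large-v3-turbo` and that each response carries a `text` field.

```typescript
// Minimal stand-in for the Workers AI binding (assumed shape for this sketch).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

// Options mirroring the sample configuration; `audio` is supplied per chunk.
interface TranscriptionOptions {
  task?: "transcribe" | "translate";
  language?: string;
  vad_filter?: string;
  initial_prompt?: string;
}

// Assumed model identifier for Whisper-large-v3-turbo on Workers AI.
const MODEL = "@cf/openai/whisper-large-v3-turbo";

// Transcribe chunks one after another and join the partial transcripts.
async function transcribeChunks(
  ai: AiBinding,
  base64Chunks: string[],
  options: TranscriptionOptions = {}
): Promise<string> {
  const parts: string[] = [];
  for (const audio of base64Chunks) {
    // Assumption: the response object exposes the transcript as `text`.
    const result = (await ai.run(MODEL, { ...options, audio })) as {
      text?: string;
    };
    parts.push((result.text ?? "").trim());
  }
  // Drop empty segments (for example, silent chunks) before joining.
  return parts.filter((part) => part.length > 0).join(" ");
}
```

Inside a Worker's `fetch` handler this could be called as `transcribeChunks(env.AI, chunks, { task: "transcribe", language: "en" })`. Processing chunks sequentially keeps the transcript in order; running them in parallel would be faster but needs care around rate limits.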