daisyfaithauma/whisper-tutorial

Whisper Transcription Cloudflare Worker

Project Overview

This project implements an AI-powered audio transcription service using Cloudflare Workers AI and the whisper-large-v3-turbo model. It handles audio files of varying length by splitting them into chunks and processing each chunk on Cloudflare's serverless infrastructure.

Features

  • Automatic Speech Recognition (ASR) using OpenAI's Whisper model
  • Large audio file transcription through chunked processing
  • Cloudflare Workers deployment for scalable, low-latency transcription
  • Configurable transcription parameters:
    • Language selection
    • Translation vs. transcription mode
    • Voice activity detection
    • Custom initial prompts
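
The chunking mentioned above could be sketched as follows. This is a minimal illustration, not the project's actual implementation: the helper names `splitIntoChunks` and `toBase64` are hypothetical, and the 1 MiB chunk size is an assumption chosen for the example. Note that splitting on byte boundaries does not align with word or audio-frame boundaries, so transcript quality near chunk edges may vary.

```typescript
// Hypothetical helper: split raw audio bytes into fixed-size chunks.
// The last chunk may be shorter than chunkSize.
function splitIntoChunks(buffer: ArrayBuffer, chunkSize: number): ArrayBuffer[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks: ArrayBuffer[] = [];
  for (let offset = 0; offset < buffer.byteLength; offset += chunkSize) {
    // slice() clamps the end index to the buffer length.
    chunks.push(buffer.slice(offset, offset + chunkSize));
  }
  return chunks;
}

// Hypothetical helper: base64-encode one chunk, since the usage example
// in this README passes audio as a base64 string.
function toBase64(chunk: ArrayBuffer): string {
  const bytes = new Uint8Array(chunk);
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Assumed chunk size for illustration only (1 MiB).
const ASSUMED_CHUNK_SIZE = 1024 * 1024;
```

Each encoded chunk can then be sent to the model separately and the resulting text pieces joined in order.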

Prerequisites

  • Cloudflare account
  • Node.js (v18+ recommended)
  • Wrangler CLI
  • Basic JavaScript/TypeScript knowledge

Installation

  1. Clone the repository:
git clone <your-repo-url>
cd whisper-transcription-worker
  2. Install dependencies:
npm install
  3. Configure Cloudflare credentials:
npx wrangler login

Configuration

Update wrangler.toml with:

compatibility_date = "2024-09-23"
compatibility_flags = ["nodejs_compat"]

[ai]
binding = "AI"

Local Development

Start the development server:

npx wrangler dev --remote

Deployment

Deploy your Worker:

npx wrangler deploy

Usage Example

// Sample API call configuration
const transcriptionOptions = {
  audio: base64EncodedAudio,
  task: "transcribe",
  language: "en",
  vad_filter: "false"
}
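
As a rough sketch of how options like these might be passed to the model, the function below sends each base64 chunk to the `AI` binding and joins the returned text. Several things here are assumptions rather than this project's confirmed code: the `AiBinding` interface is a simplified stand-in for the real binding type, `transcribeChunks` is a hypothetical name, the model ID `@cf/openai/whisper-large-v3-turbo` and the `text` response field follow the Workers AI model documentation, and `initial_prompt` is the assumed parameter name for custom initial prompts.

```typescript
// Simplified stand-in for the Workers AI binding (assumption; the real
// type comes from Cloudflare's Workers type definitions).
interface AiBinding {
  run(model: string, input: Record<string, unknown>): Promise<{ text?: string }>;
}

// Option names mirror the sample configuration in this README.
interface TranscriptionOptions {
  task?: "transcribe" | "translate";
  language?: string;
  vad_filter?: string;
  initial_prompt?: string; // assumed parameter name
}

const MODEL = "@cf/openai/whisper-large-v3-turbo";

// Hypothetical helper: transcribe chunks one at a time, in order,
// and join the non-empty results with a single space.
async function transcribeChunks(
  ai: AiBinding,
  base64Chunks: string[],
  options: TranscriptionOptions = {}
): Promise<string> {
  const parts: string[] = [];
  for (const audio of base64Chunks) {
    const result = await ai.run(MODEL, { audio, ...options });
    if (result.text) {
      parts.push(result.text.trim());
    }
  }
  return parts.join(" ");
}
```

Inside a Worker, this would be called with `env.AI` as the first argument, matching the `binding = "AI"` setting in wrangler.toml. Processing chunks sequentially keeps the output in order at the cost of latency.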

Supported Audio Formats

  • MP3
  • WAV
  • Other formats supported by Cloudflare Workers AI
