A client to transcribe audio using the OpenAI Whisper API

Sander-HR/whisper-webui

 
 

Whisper Web UI

This project provides a Streamlit web application (whisper_webui.py) and a command-line interface (whisper_cli.py) for transcribing audio files with the Whisper Large v3 model through the OpenAI, Groq, or Fal API. Both let you supply an audio file, transcribe it, and retrieve the resulting transcript.


Features

  • Automatic compression for files larger than 25MB
  • Support for multiple audio formats (mp3, mp4, mpeg, mpga, m4a, wav, webm)
  • Transcription using the Whisper Large v3 model through the OpenAI, Groq, or Fal API
  • Display of transcription time and results
  • Option to copy transcript to clipboard
  • Ability to save transcript to a file
  • Both web-based and command-line interfaces
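The automatic compression feature can be illustrated with a minimal sketch. It assumes the 25MB limit means 25 * 1024 * 1024 bytes and that compression re-encodes to a low-bitrate mono MP3 with pydub; the helper names `needs_compression` and `compress_audio` and the 64k bitrate are hypothetical and not taken from the project's code.

```python
import os

# Upload limit from the feature list, assumed here to be 25 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def needs_compression(path, limit=MAX_UPLOAD_BYTES):
    """Return True when the file at `path` is larger than `limit` bytes."""
    return os.path.getsize(path) > limit


def compress_audio(src, dst, bitrate="64k"):
    """Re-encode `src` as a mono MP3 at `bitrate` and return `dst`.

    Hypothetical helper: the project's actual settings may differ.
    pydub is imported lazily and needs ffmpeg available on the PATH.
    """
    from pydub import AudioSegment

    audio = AudioSegment.from_file(src)
    audio = audio.set_channels(1)  # down-mix to mono; speech rarely needs stereo
    audio.export(dst, format="mp3", bitrate=bitrate)
    return dst
```

A caller would check `needs_compression(path)` first and only then call `compress_audio`, so small files are sent unchanged.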

Installation

  1. Clone this repository:

    git clone https://github.com/Sander-HR/whisper-webui.git
    cd whisper-webui
    
  2. Install the required dependencies:

    pip install streamlit groq openai pydub pyperclip
    
  3. Set up your API keys as environment variables:

    export OPENAI_API_KEY='your_openai_api_key_here'
    export GROQ_API_KEY='your_groq_api_key_here'
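A missing key is easier to diagnose if it is checked before any request is made. The following is a minimal sketch of such a check, assuming the two variable names shown above; `get_api_key` is a hypothetical helper, not a function from this repository.

```python
import os

# Environment variable per API, as named in the setup step above.
API_KEY_VARS = {"openai": "OPENAI_API_KEY", "groq": "GROQ_API_KEY"}


def get_api_key(api, env=None):
    """Return the key for `api`, or raise a clear error when it is unset or blank."""
    env = os.environ if env is None else env
    var = API_KEY_VARS[api]
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running the app")
    return key
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.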
    

Usage

Streamlit Web Application (whisper_webui.py)

  1. Run the Streamlit app:

    streamlit run whisper_webui.py
    
  2. Open your web browser and navigate to the provided local URL (typically http://localhost:8501).

  3. Use the interface to upload an audio file, process it, and view the transcription results.

Command-Line Interface (whisper_cli.py)

The CLI provides additional options, such as compressing a file without transcribing it or copying the transcript to the clipboard. Usage:

python whisper_cli.py [-h] -i INPUT [-o OUTPUT] [--compress-only] [-c] [--api {openai,groq}]

Options:

  • -i INPUT, --input INPUT: Input audio file (required)
  • -o OUTPUT, --output OUTPUT: Output filename for the transcript
  • --compress-only: Compress the audio file only (no transcription)
  • -c, --clipboard: Copy the transcription text to the system clipboard
  • --api {openai,groq}: Choose API for transcription (default: openai)
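For readers who want to see how these options map onto code, here is a minimal argparse sketch that reproduces the documented interface. It is reconstructed from the usage line and option list above and may not match how whisper_cli.py defines its parser; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser matching the options documented above."""
    parser = argparse.ArgumentParser(
        prog="whisper_cli.py",
        description="Transcribe or compress an audio file",
    )
    parser.add_argument("-i", "--input", required=True, help="Input audio file")
    parser.add_argument("-o", "--output", help="Output filename for the transcript")
    parser.add_argument("--compress-only", action="store_true",
                        help="Compress the audio file only (no transcription)")
    parser.add_argument("-c", "--clipboard", action="store_true",
                        help="Copy the transcription text to the system clipboard")
    parser.add_argument("--api", choices=["openai", "groq"], default="openai",
                        help="Choose API for transcription")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Using `choices` makes argparse reject any value other than `openai` or `groq`, which matches the `{openai,groq}` notation in the usage line.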

Examples:

  1. Transcribe an audio file using OpenAI API:

    python whisper_cli.py -i input.mp3 -o transcript.txt
    
  2. Transcribe using Groq API and copy to clipboard:

    python whisper_cli.py -i input.wav --api groq -c
    
  3. Compress an audio file without transcribing:

    python whisper_cli.py -i large_file.mp3 --compress-only
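Behind the `--api` switch, a transcription request could look roughly like the sketch below. It assumes the official `openai` and `groq` Python SDKs, which expose the same `audio.transcriptions.create` call. The model ids are assumptions (`whisper-1` is the id of OpenAI's hosted Whisper model, `whisper-large-v3` is the id Groq uses), and `make_client` and `transcribe` are hypothetical names rather than functions from this repository.

```python
# Assumed model id per API; adjust if the provider's ids differ.
MODELS = {"openai": "whisper-1", "groq": "whisper-large-v3"}


def make_client(api):
    """Create the SDK client for `api`; both SDKs read their key from the environment."""
    if api == "openai":
        from openai import OpenAI
        return OpenAI()
    if api == "groq":
        from groq import Groq
        return Groq()
    raise ValueError(f"unsupported api: {api!r}")


def transcribe(path, api="openai", client=None):
    """Send the audio file at `path` to the chosen API and return the transcript text."""
    if client is None:
        client = make_client(api)
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=MODELS[api], file=audio_file)
    return result.text
```

The optional `client` parameter allows a stub to be injected, so the function can be exercised without network access or API keys.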
    

Note on Fal API Usage

If you use the Fal.ai API for transcription, the application first uploads your audio file to tmpfiles.org, because the Fal API requires input files to be reachable at a public URL.

Make sure you have the right to upload the audio and make it publicly accessible. Do not use this method for sensitive recordings.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is released under the MIT License.
