
feat: support custom OpenAI-compatible whisper endpoints #65

Open
toanbot wants to merge 2 commits into steipete:main from toanbot:feature/custom-whisper-endpoint

toanbot commented Feb 3, 2026

Summary

  1. Custom Whisper Endpoints: Add support for OPENAI_WHISPER_BASE_URL and OPENAI_BASE_URL env vars to allow custom OpenAI-compatible whisper endpoints
  2. OGG/Media Fix: Add missing audio/video extensions (ogg, opus, aiff, wma, mpeg, mpg, avi, wmv, flv) to isDirectMediaUrl regex

Use Cases

  • Custom Whisper: Users with local whisper servers (GPU machines) or alternative providers can use them directly
  • Telegram Voice Messages: OGG files (Telegram's audio format) now route correctly to media transcription

Changes

  • packages/core/src/transcription/whisper/openai.ts: Support custom base URL
  • packages/core/src/content/url.ts: Add missing media extensions

Example Usage

export OPENAI_WHISPER_BASE_URL=http://192.168.1.100:8080/v1
summarize telegram_voice.ogg

Test Plan

  • Custom whisper endpoint with local server
  • OGG audio transcription
  • Default behavior unchanged
  • Built and tested standalone binary

Add support for OPENAI_WHISPER_BASE_URL and OPENAI_BASE_URL environment
variables to allow using custom OpenAI-compatible whisper endpoints for
audio transcription.

This enables users to use self-hosted whisper servers or alternative
providers that implement the OpenAI whisper API specification.

Priority: OPENAI_WHISPER_BASE_URL > OPENAI_BASE_URL > default (api.openai.com)
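
The priority order above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names `resolveWhisperBaseUrl` and `transcriptionsUrl`, the `/v1` suffix on the default, and the trailing-slash handling are all assumptions made for the example.

```typescript
// Hypothetical sketch of the env var priority described in the commit message.
// Names and details are illustrative; they are not taken from the PR diff.

// Assumption: the default includes the /v1 prefix, matching the example usage.
const DEFAULT_OPENAI_BASE_URL = "https://api.openai.com/v1";

type Env = Record<string, string | undefined>;

function resolveWhisperBaseUrl(env: Env): string {
  // Checked in priority order: the whisper-specific variable wins.
  const candidates = [env["OPENAI_WHISPER_BASE_URL"], env["OPENAI_BASE_URL"]];
  for (const candidate of candidates) {
    const trimmed = candidate?.trim();
    if (trimmed) {
      // Drop trailing slashes so the endpoint path joins cleanly.
      return trimmed.replace(/\/+$/, "");
    }
  }
  return DEFAULT_OPENAI_BASE_URL;
}

function transcriptionsUrl(env: Env): string {
  // /audio/transcriptions is the OpenAI speech-to-text endpoint path.
  return `${resolveWhisperBaseUrl(env)}/audio/transcriptions`;
}
```

In a Node process this would presumably be called as `transcriptionsUrl(process.env)`. Treating blank values as unset is a design choice of the sketch, so an empty `OPENAI_WHISPER_BASE_URL` falls through to `OPENAI_BASE_URL`.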

Add ogg, opus, aiff, wma, mpeg, mpg, avi, wmv, flv to the media URL
detection regex. This fixes OGG audio files (common in Telegram voice
messages) being incorrectly routed through the HTML fetcher instead of
the media transcription handler.
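
As a rough illustration of the extension check, the sketch below matches only the nine extensions added in this PR. The real `isDirectMediaUrl` regex in `packages/core/src/content/url.ts` also covers formats that predate this change and may be structured differently; the function name `hasAddedMediaExtension` is hypothetical.

```typescript
// Hypothetical sketch: covers only the extensions this PR adds, not the full
// list used by the real isDirectMediaUrl implementation.
const ADDED_MEDIA_EXTENSIONS = [
  "ogg", "opus", "aiff", "wma", "mpeg", "mpg", "avi", "wmv", "flv",
];

// Case-insensitive match on a trailing ".ext".
const ADDED_MEDIA_PATTERN = new RegExp(
  `\\.(?:${ADDED_MEDIA_EXTENSIONS.join("|")})$`,
  "i",
);

function hasAddedMediaExtension(url: string): boolean {
  // Ignore any query string or fragment before checking the extension.
  const withoutQuery = url.split(/[?#]/, 1)[0] ?? "";
  return ADDED_MEDIA_PATTERN.test(withoutQuery);
}
```

With a check like this, a URL ending in `telegram_voice.ogg` would be classified as media and sent to transcription rather than fetched as HTML.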