feat: add audio transcription functionality by viniciusventura29 · Pull Request #2268 · decocms/mesh

viniciusventura29 · 2026-01-22T13:06:15Z

What is this contribution about?

Adds voice input capabilities to the chat interface with automatic speech-to-text transcription.

Changes

New API endpoint

POST /:org/transcribe - Accepts audio files or URLs and returns transcribed text

New bindings

TRANSCRIPTION_BINDING - For speech-to-text providers
OBJECT_STORAGE_BINDING - For temporary audio file storage

Frontend

useAudioRecorder hook for browser microphone capture
Chat input now supports voice recording

How it works

1 - User records audio in the browser
2 - Audio is uploaded to Object Storage (temp file)
3 - Transcription service processes the audio URL
4 - Temp file is cleaned up automatically
5 - Transcribed text is returned to the chat

Requirements

Connections implementing:
TRANSCRIPTION_BINDING (e.g., OpenAI Whisper, Deepgram)
OBJECT_STORAGE_BINDING (e.g., S3, R2, GCS) - only needed for file uploads

Screenshots/Demonstration

https://www.loom.com/share/2299fc0160364c6ead724c8d8925d04d

Review Checklist

PR title is clear and descriptive
Changes are tested and working
Documentation is updated (if needed)
No breaking changes

Summary by cubic

Adds voice input to chat with an audio recorder and a new transcription API. Audio is sent as a blob or public URL, transcribed via a TRANSCRIPTION binding, and the text is inserted into the chat input.

New Features
- New POST /api/:org/transcribe route that accepts an audio file or URL, validates up to 25MB and blocks localhost/private IPs, and calls TRANSCRIBE_AUDIO via the TRANSCRIPTION binding.
- Uploaded blobs are converted to base64 server-side for direct transcription (no object storage needed).
- Chat input gets a mic button with recording/transcribing states; sends recorded audio to the API and appends the transcribed text. Button is enabled only when a TRANSCRIPTION provider is available.
- New useAudioRecorder hook and transcription binding types/schemas (supported formats: webm, mp3/mpeg, mp4/m4a, wav, ogg, flac, video/webm).
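The size and format checks described in this list could look roughly like the sketch below. It is not the code from the PR: the function name `checkAudioUpload`, the exact MIME strings, and the byte value chosen for "25MB" are assumptions made for illustration.

```typescript
// Assumed limit: the summary says "up to 25MB"; the exact byte count is a guess.
const MAX_AUDIO_BYTES = 25 * 1024 * 1024;

// Assumed MIME strings for the formats listed above; the real schema may differ.
const SUPPORTED_AUDIO_TYPES = new Set<string>([
  "audio/webm",
  "audio/mp3",
  "audio/mpeg",
  "audio/mp4",
  "audio/m4a",
  "audio/wav",
  "audio/ogg",
  "audio/flac",
  "video/webm",
]);

type UploadCheck = { ok: true } | { ok: false; reason: string };

// Hypothetical helper: validates an uploaded blob's type and size.
function checkAudioUpload(mimeType: string, sizeBytes: number): UploadCheck {
  // Browsers often append codec parameters, e.g. "audio/webm;codecs=opus".
  const baseType = (mimeType.split(";")[0] ?? "").trim().toLowerCase();
  if (!SUPPORTED_AUDIO_TYPES.has(baseType)) {
    return { ok: false, reason: `unsupported type: ${baseType}` };
  }
  if (sizeBytes <= 0) {
    return { ok: false, reason: "empty file" };
  }
  if (sizeBytes > MAX_AUDIO_BYTES) {
    return { ok: false, reason: "file exceeds size limit" };
  }
  return { ok: true };
}
```

Stripping the codec suffix before the lookup matters because `MediaRecorder` output commonly reports a type such as `audio/webm;codecs=opus`.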

Migration
- Connect at least one provider implementing TRANSCRIPTION; the mic button appears only when available.
- Ensure browser mic permissions; recordings up to ~3 minutes are supported.

Written for commit 046cb4b. Summary will update on new commits.

- Introduced a new transcription API route to handle audio-to-text conversion.
- Implemented audio recording capabilities in the chat input component, allowing users to record and transcribe audio messages.
- Added hooks for audio recording management and binding detection for transcription and object storage.
- Updated the chat context to include binding availability for transcription services.
- Enhanced the UI to show recording options based on available bindings.

github-actions · 2026-01-22T13:06:23Z

🧪 Benchmark

Should we run the MCP Gateway benchmark for this PR?

React with 👍 to run the benchmark.

Reaction	Action
👍	Run quick benchmark (10 & 128 tools)

Benchmark will run on the next push after you react.

github-actions · 2026-01-22T13:06:25Z

Release Options

Should a new version be published when this PR is merged?

React with an emoji to vote on the release type:

Reaction	Type	Next Version
👍	Prerelease	`2.28.1-alpha.1`
🎉	Patch	`2.28.1`
❤️	Minor	`2.29.0`
🚀	Major	`3.0.0`

Current version: 2.28.0

Deployment

Deploy to production (triggers ArgoCD sync after Docker image is published)

…t-transcription

cubic-dev-ai

4 issues found across 8 files

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="apps/mesh/src/web/hooks/use-audio-recorder.ts">

<violation number="1" location="apps/mesh/src/web/hooks/use-audio-recorder.ts:245">
P2: stopRecording gates on React state, which can be stale right after startRecording. This can cause stopRecording to return null and leave the MediaRecorder running. Check the recorder's state instead of `isRecording`.</violation>
</file>
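A minimal sketch of the suggested fix, assuming the hook keeps the recorder in a ref. `RecorderLike` and `stopIfActive` are hypothetical names; the three state values are the ones `MediaRecorder.state` defines.

```typescript
// Minimal stand-in for the parts of MediaRecorder used here.
interface RecorderLike {
  state: "inactive" | "recording" | "paused";
  stop(): void;
}

// Hypothetical helper: decide based on the recorder itself, not on React state,
// which may still hold the previous value right after startRecording.
function stopIfActive(recorder: RecorderLike | null): boolean {
  if (!recorder || recorder.state === "inactive") {
    return false; // nothing to stop
  }
  recorder.stop();
  return true;
}
```

Because the recorder object is the source of truth, a stop requested in the same tick as a start still reaches the running recorder.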

<file name="packages/bindings/src/well-known/transcription.ts">

<violation number="1" location="packages/bindings/src/well-known/transcription.ts:37">
P2: TranscriptionInputSchema allows an empty object (both `audio` and `audioUrl` are optional), so callers can submit no audio source. Enforce that at least one of `audio` or `audioUrl` is provided to prevent invalid requests from passing validation.</violation>
</file>
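The rule being asked for can be sketched as a plain predicate. This is a hand-written stand-in, not the actual `TranscriptionInputSchema`; in a schema library the same condition would typically be attached as a refinement. The interface below only mirrors the two fields named in the review.

```typescript
// Mirrors only the two fields named in the review; other fields are omitted.
interface TranscriptionInputLike {
  audio?: string;
  audioUrl?: string;
}

// Hypothetical check: at least one non-empty audio source must be present.
function hasAudioSource(input: TranscriptionInputLike): boolean {
  return Boolean(input.audio) || Boolean(input.audioUrl);
}
```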

<file name="apps/mesh/src/api/routes/transcribe.ts">

<violation number="1" location="apps/mesh/src/api/routes/transcribe.ts:317">
P1: Validate `audioUrl` (scheme/host) before passing it to the transcription service to avoid SSRF or non-HTTP URLs being processed.</violation>
</file>
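A first-pass check along these lines is sketched below. The function name `isAllowedAudioUrl` is hypothetical, and the check covers only the scheme and obvious local hostnames; it does not resolve DNS, so on its own it is not a complete SSRF defense.

```typescript
// Hypothetical first-pass filter: scheme and hostname only, no DNS resolution.
function isAllowedAudioUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return false; // rejects file:, ftp:, data:, and similar schemes
  }
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) {
    return false;
  }
  return true;
}
```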

<file name="apps/mesh/src/web/components/chat/input.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/input.tsx:286">
P2: The stop flow leaves the button enabled until after stopRecording resolves. A second click during this window can call stopRecording twice and overwrite the stored resolver, leaving the first await unresolved. Set the “transcribing/stopping” state before awaiting stopRecording (or disable the button while stopping) to prevent duplicate stop calls.</violation>
</file>
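One way to close that window is to set a guard flag before the first `await`. The sketch below assumes `stopRecording` returns a promise; `createStopHandler` is a hypothetical wrapper, and in the real component the flag would more likely be component state that also disables the button.

```typescript
// Hypothetical wrapper that ignores a second stop request while one is in flight.
function createStopHandler(stopRecording: () => Promise<string | null>) {
  let stopping = false;
  return async function handleStop(): Promise<string | null> {
    if (stopping) {
      return null; // a stop is already in progress; do not call stopRecording again
    }
    stopping = true; // set BEFORE awaiting, so a quick second click sees it
    try {
      return await stopRecording();
    } finally {
      stopping = false;
    }
  };
}
```

Since the flag flips synchronously, two clicks in the same tick produce a single `stopRecording` call, so the stored resolver is never overwritten.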

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

apps/mesh/src/api/routes/transcribe.ts

apps/mesh/src/web/hooks/use-audio-recorder.ts

packages/bindings/src/well-known/transcription.ts

apps/mesh/src/web/components/chat/input.tsx

…essages
- Renamed and consolidated functions for finding connections with specific bindings to enhance code clarity and reusability.
- Updated error messages in the ChatInput component to provide clearer feedback to users regarding audio recording and transcription failures.
- Improved UI text for better user experience during audio recording and transcription processes.

- Added a new function to validate audio URLs, ensuring only HTTP/HTTPS URLs with public hosts are accepted.
- Updated the transcription API route to validate the audioUrl parameter before processing.
- Enhanced the TranscriptionInputSchema to enforce the requirement of either 'audio' or 'audioUrl' for transcription requests.
- Improved the audio recorder hook to check the actual state of the media recorder before stopping it.

cubic-dev-ai

1 issue found across 3 files (changes from recent commits).

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="apps/mesh/src/api/routes/transcribe.ts">

<violation number="1" location="apps/mesh/src/api/routes/transcribe.ts:50">
P1: SSRF validation only checks hostname strings and misses DNS rebinding/private IP resolution. A public hostname that resolves to a private/link-local IP will pass validation and still allow SSRF.</violation>
</file>
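The gap described here is usually closed by resolving the hostname first and then checking every returned address. The sketch below shows only the address check, under the assumption that the caller has already done the DNS lookup; the function names are hypothetical and the range list is not exhaustive.

```typescript
// Hypothetical check for loopback, private, and link-local IPv4 ranges.
// Not exhaustive (for example, 100.64.0.0/10 is not covered).
function isPrivateIpv4(ip: string): boolean {
  const parts = ip.split(".");
  const valid =
    parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) {
    return true; // fail closed on anything that is not a dotted quad
  }
  const [a = -1, b = -1] = parts.map(Number);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Every resolved address must be public; an empty result is rejected too.
function allAddressesPublic(resolved: string[]): boolean {
  return resolved.length > 0 && resolved.every((ip) => !isPrivateIpv4(ip));
}
```

Checking all addresses, rather than just the first, matters because a hostname can return a mix of public and private records.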


apps/mesh/src/api/routes/transcribe.ts

- Added a function to check if an IP address is private, improving the validation of audio URLs.
- Updated the validateAudioUrl function to resolve DNS and ensure that URLs do not resolve to private or internal IP addresses.
- Modified the transcription API route to await the validation of audioUrl, ensuring proper error handling for invalid URLs.

cubic-dev-ai

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="apps/mesh/src/api/routes/transcribe.ts">

<violation number="1" location="apps/mesh/src/api/routes/transcribe.ts:51">
P1: Extend the IPv6 checks to reject IPv4‑mapped IPv6 addresses (e.g., `::ffff:127.0.0.1`). Otherwise an attacker can bypass the SSRF filter by using IPv4‑mapped private IPs.</violation>
</file>
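A sketch of how such a check might unwrap IPv4-mapped addresses before applying the IPv4 rules. All names here are hypothetical and this is not the PR's `isPrivateIp`. Only the compressed `::ffff:` prefix is recognized; a fully expanded form such as `0:0:0:0:0:ffff:7f00:1` would need normalizing first.

```typescript
// Returns the embedded IPv4 address of an IPv4-mapped IPv6 address, or null.
function extractMappedIpv4(ip: string): string | null {
  const lower = ip.toLowerCase();
  const prefix = "::ffff:";
  if (!lower.startsWith(prefix)) return null;
  const rest = lower.slice(prefix.length);
  // Dotted form, e.g. ::ffff:127.0.0.1
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(rest)) return rest;
  // Hex form, e.g. ::ffff:7f00:1
  const hex = /^([0-9a-f]{1,4}):([0-9a-f]{1,4})$/.exec(rest);
  if (!hex) return null;
  const hi = parseInt(hex[1] ?? "0", 16);
  const lo = parseInt(hex[2] ?? "0", 16);
  return [hi >> 8, hi & 0xff, lo >> 8, lo & 0xff].join(".");
}

// Compact IPv4 range check (loopback, private, link-local); not exhaustive.
function isPrivateV4(ip: string): boolean {
  const [a = -1, b = -1] = ip.split(".").map(Number);
  return (
    a === 0 || a === 10 || a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Hypothetical combined check covering IPv4, IPv6, and IPv4-mapped IPv6.
function isPrivateIpSketch(ip: string): boolean {
  const mapped = extractMappedIpv4(ip);
  if (mapped !== null) return isPrivateV4(mapped);
  if (ip.includes(":")) {
    const lower = ip.toLowerCase();
    return (
      lower === "::1" ||            // loopback
      lower === "::" ||             // unspecified
      /^f[cd]/.test(lower) ||       // fc00::/7 unique local
      /^fe[89ab]/.test(lower)       // fe80::/10 link-local
    );
  }
  return isPrivateV4(ip);
}
```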


apps/mesh/src/api/routes/transcribe.ts

Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>

…t-transcription

…admin into feat/chat-transcription

- Updated the isPrivateIp function to safely handle undefined values when checking IPv4-mapped addresses, ensuring robust validation of IP addresses.

…t-transcription

…e preparation logic

…L, removing object storage dependency. Update chat context to eliminate object storage binding checks.

cubic-dev-ai

1 issue found across 3 files (changes from recent commits).

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="apps/mesh/src/api/routes/transcribe.ts">

<violation number="1" location="apps/mesh/src/api/routes/transcribe.ts:249">
P2: `finalAudioUrl` is being set to a base64 data URL and then passed as `audioUrl`. The binding defines `audio` for base64 data and `audioUrl` for fetchable URLs, so this risks provider incompatibility for file uploads. Pass base64 via the `audio` field instead of `audioUrl` when using inline data.</violation>
</file>
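The field routing suggested here can be sketched as follows. `AudioSource`, `TranscribeArgs`, and `toTranscribeArgs` are hypothetical names, and the `mimeType` field is an assumption; only `audio` and `audioUrl` come from the review text.

```typescript
// Hypothetical description of where the audio came from.
type AudioSource =
  | { kind: "upload"; base64: string; mimeType: string }
  | { kind: "url"; url: string };

// Only `audio` and `audioUrl` are named in the review; `mimeType` is assumed.
interface TranscribeArgs {
  audio?: string;
  audioUrl?: string;
  mimeType?: string;
}

// Inline data goes in `audio`; only fetchable URLs go in `audioUrl`.
function toTranscribeArgs(source: AudioSource): TranscribeArgs {
  if (source.kind === "upload") {
    return { audio: source.base64, mimeType: source.mimeType };
  }
  return { audioUrl: source.url };
}
```

Keeping the two fields separate means a provider never has to guess whether `audioUrl` holds something it can fetch.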


apps/mesh/src/api/routes/transcribe.ts

…create new document if empty. Improved handling of last paragraph content for seamless integration of transcriptions.

…string instead of data URL. Update related comments and error handling for improved clarity.

…ok to eliminate unnecessary dependency.

Merge branch 'main' of https://github.com/decocms/admin into feat/cha…

1119e9b

…t-transcription

cubic-dev-ai bot reviewed Jan 22, 2026


viniciusventura29 added 2 commits January 22, 2026 10:26

cubic-dev-ai bot reviewed Jan 22, 2026

apps/mesh/src/api/routes/transcribe.ts

cubic-dev-ai bot reviewed Jan 22, 2026

apps/mesh/src/api/routes/transcribe.ts

viniciusventura29 and others added 8 commits January 22, 2026 12:05

Update apps/mesh/src/api/routes/transcribe.ts

b9bbe4d

Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>

Merge branch 'main' of https://github.com/decocms/admin into feat/cha…

7583bb9

…t-transcription

Merge branch 'feat/chat-transcription' of https://github.com/decocms/…

9ce27de

…admin into feat/chat-transcription

fix: handle potential undefined value in IP address validation

74f27a9

- Updated the isPrivateIp function to safely handle undefined values when checking IPv4-mapped addresses, ensuring robust validation of IP addresses.

Merge branch 'main' of https://github.com/decocms/admin into feat/cha…

344fd9a

…t-transcription

Merge branch 'main' of https://github.com/decocms/admin into feat/cha…

57e4104

…t-transcription

Refactor chat context to use new connections hook and simplify messag…

9829d2c

…e preparation logic

Refactor audio processing in transcription to convert Blob to data UR…

71f2291

…L, removing object storage dependency. Update chat context to eliminate object storage binding checks.

cubic-dev-ai bot reviewed Jan 23, 2026

apps/mesh/src/api/routes/transcribe.ts (outdated)

viniciusventura29 added 3 commits January 23, 2026 19:53

Enhance chat input to append transcribed text to existing content or …

c5549d6

…create new document if empty. Improved handling of last paragraph content for seamless integration of transcriptions.

Refactor audio processing in transcription to convert Blob to base64 …

0f60ab5

…string instead of data URL. Update related comments and error handling for improved clarity.

Remove OBJECT_STORAGE_BINDING from BUILTIN_BINDINGS in use-binding ho…

046cb4b

…ok to eliminate unnecessary dependency.


feat: add audio transcription functionality #2268
viniciusventura29 wants to merge 16 commits into main from feat/chat-transcription


1 participant
