feat: direct browser TTS calls via js-tts-wrapper #89
Replace API-route-proxied TTS with direct browser calls using js-tts-wrapper's AzureTTSClient and ElevenLabsTTSClient. The library now has browser support so we can call synthesize/getVoices directly without the /api/tts proxy. Delete app/api/tts/speak and app/api/tts/voices routes. Closes #88
📝 Walkthrough
Removed two server-side TTS API routes and refactored client-side TTS to use a local js-tts-wrapper client via new helpers.
Sequence Diagram(s)
sequenceDiagram
participant UI as Browser UI
participant Lib as lib/tts.ts
participant Client as js-tts-wrapper Client
participant Provider as External TTS Provider
UI->>Lib: fetchTTSBlob(text, config)
Lib->>Client: createClient(config) / synthToBytes(text)
Client->>Provider: HTTP TTS request (ElevenLabs/Azure)
Provider-->>Client: audio bytes
Client-->>Lib: audio bytes
Lib-->>UI: Blob (audio/mpeg)
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 2
🧹 Nitpick comments (1)
lib/tts.ts (1)
17-26: Avoid applying voice selection during voice catalog fetch.
fetchVoices() only needs provider credentials, but lines 20 and 24 apply config.voiceId for both providers. When switching providers, the persisted settings.voiceId from the previous provider is passed to the new client before getVoices() runs, creating unnecessary coupling between voice lookups and selection state.
Voice selection should be opt-in for synthesis only. Only fetchTTSBlob() (line 29) should apply the voice ID; fetchVoices() (line 35) should fetch without it.
♻️ Suggested refactor
-function createClient(config: TTSConfig) {
+function createClient(config: TTSConfig, applyVoice = false) {
   if (config.provider === 'azure') {
     const client = new AzureTTSClient({ subscriptionKey: config.subscriptionKey, region: config.region });
-    if (config.voiceId) client.setVoice(config.voiceId);
+    if (applyVoice && config.voiceId) client.setVoice(config.voiceId);
     return client;
   }
   const client = new ElevenLabsTTSClient({ apiKey: config.apiKey });
-  if (config.voiceId) client.setVoice(config.voiceId);
+  if (applyVoice && config.voiceId) client.setVoice(config.voiceId);
   return client;
 }
@@
-  const client = createClient(config);
+  const client = createClient(config, true);
@@
   const client = createClient(config);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/tts.ts` around lines 17 - 26, The createClient function is applying config.voiceId to new provider clients, coupling voice selection to voice catalog fetch; remove the voiceId application from createClient so it returns clients created only with credentials (AzureTTSClient in createClient and ElevenLabsTTSClient in createClient). Instead, apply voice selection only when performing synthesis in fetchTTSBlob (call setVoice(config.voiceId) on the client returned for use in fetchTTSBlob), while fetchVoices should call createClient and call getVoices() without setting any voiceId. Update references to createClient, fetchVoices, and fetchTTSBlob accordingly so voice selection is opt-in for synthesis only.
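The variant described in the prompt above (clients built from credentials only, voice applied only at synthesis time) can be sketched roughly as follows. This is a minimal illustration, not the code in lib/tts.ts: TTSClientLike, SketchConfig, FakeClient, fetchVoicesSketch and synthesizeSketch are hypothetical names, the return types of getVoices() and synthToBytes() are assumptions, and FakeClient stands in for AzureTTSClient / ElevenLabsTTSClient so the example runs without the library.

```typescript
// Minimal client surface used in this sketch. The three method names appear in the
// review thread; the return types are assumptions made for illustration only.
interface TTSClientLike {
  setVoice(voiceId: string): void;
  getVoices(): Promise<{ id: string; name: string }[]>;
  synthToBytes(text: string): Promise<Uint8Array>;
}

// Hypothetical config holding only the fields needed to show the pattern.
interface SketchConfig {
  provider: 'azure' | 'elevenlabs';
  voiceId?: string;
}

// Stand-in for the real provider clients; it records which voice was selected.
class FakeClient implements TTSClientLike {
  selectedVoice: string | null = null;
  setVoice(voiceId: string): void {
    this.selectedVoice = voiceId;
  }
  async getVoices(): Promise<{ id: string; name: string }[]> {
    return [{ id: 'v1', name: 'Voice One' }];
  }
  async synthToBytes(text: string): Promise<Uint8Array> {
    return new TextEncoder().encode(text);
  }
}

// A factory that builds a client from credentials only and never sets a voice.
type ClientFactory = (config: SketchConfig) => TTSClientLike;

// Catalog fetch: ignores config.voiceId entirely.
async function fetchVoicesSketch(config: SketchConfig, make: ClientFactory) {
  return make(config).getVoices();
}

// Synthesis: the only place where the persisted voice is applied.
async function synthesizeSketch(
  text: string,
  config: SketchConfig,
  make: ClientFactory,
): Promise<Uint8Array> {
  const client = make(config);
  if (config.voiceId) client.setVoice(config.voiceId);
  return client.synthToBytes(text);
}
```

With this split, a stale voiceId left over from a previous provider can only affect synthesis, never the voice list lookup.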
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/tts.ts`:
- Around line 29-31: The Blob construction uses bytes.buffer which can include
data outside the Uint8Array view; change the return to pass the returned byte
view directly into Blob (i.e., use new Blob([bytes]) instead of new
Blob([bytes.buffer])) after obtaining the bytes from client.synthToBytes(text)
in the function that calls createClient and synthToBytes so sliced views keep
correct byteOffset/byteLength.
In `@package.json`:
- Line 19: The package.json currently depends on a non-portable sibling path
"js-tts-wrapper" via "file:../js-tts-wrapper"; replace that entry for the
"js-tts-wrapper" dependency with a portable spec (for example a published semver
version, a git URL, or a tarball URL / temporary fork) so installs work on clean
checkouts/CI; update the "js-tts-wrapper" value in package.json accordingly and
run npm install / npm run build locally to verify resolution.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c3852e7c-5b23-4253-ac67-39567fee1a9e
⛔ Files ignored due to path filters (1)
- package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (4)
- app/api/tts/speak/route.ts
- app/api/tts/voices/route.ts
- lib/tts.ts
- package.json
💤 Files with no reviewable changes (2)
- app/api/tts/voices/route.ts
- app/api/tts/speak/route.ts
  const client = createClient(config);
  const bytes = await client.synthToBytes(text);
  return new Blob([bytes.buffer as ArrayBuffer]);
Pass the returned byte view directly into Blob.
Line 31 uses bytes.buffer, which ignores byteOffset and byteLength. If synthToBytes() returns a sliced Uint8Array/Buffer, the blob will include bytes outside the real payload and can corrupt playback. Blob already accepts the view itself.
🛠️ Proposed fix
 export async function fetchTTSBlob(text: string, config: TTSConfig): Promise<Blob> {
   const client = createClient(config);
   const bytes = await client.synthToBytes(text);
-  return new Blob([bytes.buffer as ArrayBuffer]);
+  return new Blob([bytes]);
 }
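To see why passing bytes.buffer can matter, here is a small self-contained sketch of the difference between a typed-array view and its backing buffer. It does not call js-tts-wrapper; the 8-byte array is a made-up stand-in for a larger allocation that a sliced payload might point into, and whether synthToBytes() ever returns such a slice is not confirmed by this thread.

```typescript
// An 8-byte backing buffer standing in for a larger pooled allocation.
const backing = new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7]);

// A 3-byte view into the middle of it, like a sliced audio payload.
const payload = backing.subarray(2, 5);

// Wrapping the underlying buffer ignores byteOffset/byteLength and captures all 8 bytes.
const fromBuffer = new Blob([payload.buffer as ArrayBuffer]);

// Wrapping the view itself captures only the 3 payload bytes.
const fromView = new Blob([payload]);

console.log(fromBuffer.size, fromView.size); // logs: 8 3
```

When the returned array owns its whole buffer, both forms produce the same blob, so the bug only shows up with sliced views.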
* fix: fetch audio cache segments sequentially to avoid rate limits (#97)
* fix: bump service worker cache to v4 and evict stale caches on activate (#99)
  After the js-tts-wrapper migration (PR #89), browsers with the service worker installed were serving old cached chunks that still called the removed /api/tts/speak route, causing 404 errors and TTS failure. Bumping CACHE to podium-v4 forces all clients to fetch fresh bundles. Added old-cache cleanup in the activate handler so stale caches are automatically deleted on each SW update going forward.
* fix: use package.json version as service worker cache key
  Instead of a manually-bumped hardcoded version string, the SW cache key is now derived from the app version:
  - next.config.ts exposes npm_package_version as NEXT_PUBLIC_APP_VERSION
  - layout.tsx registers /sw.js?v=<version> so the query string changes automatically on each release
  - sw.js reads self.location.search to build its cache key (podium-<version>)
  Cache busting now happens automatically whenever package.json version is bumped; no manual SW edits required.
---------
Co-authored-by: Owen McGirr <o.a.mcgirr@gmail.com>
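The cache-key derivation and stale-cache selection described in these commits could look roughly like the sketch below. It is written as pure functions so it runs outside a service worker. The `v` query parameter and the `podium-` prefix come from the commit message; the function names, the `unversioned` fallback, the example version `1.4.0`, and the rule of deleting every cache other than the current one are assumptions, not the actual contents of sw.js.

```typescript
const CACHE_PREFIX = 'podium-';

// Build the cache key from the service worker URL's query string, e.g. "?v=1.4.0".
// The "unversioned" fallback is an assumption made for this sketch.
function cacheKeyFromSearch(search: string): string {
  const version = new URLSearchParams(search).get('v') ?? 'unversioned';
  return `${CACHE_PREFIX}${version}`;
}

// Select the caches to delete on activate: every cache name except the current key.
function staleCacheNames(existing: string[], currentKey: string): string[] {
  return existing.filter((name) => name !== currentKey);
}

// Example with a hypothetical version string.
const currentKey = cacheKeyFromSearch('?v=1.4.0');
console.log(currentKey); // logs: podium-1.4.0
console.log(staleCacheNames(['podium-v3', 'podium-v4', currentKey], currentKey));
```

In a real service worker, the input would be self.location.search, and the activate handler would typically feed the result of caches.keys() through the filter and call caches.delete() on each stale name inside event.waitUntil().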
Summary
- Replaces the API-route-proxied TTS (/api/tts/speak, /api/tts/voices) with direct browser calls using AzureTTSClient and ElevenLabsTTSClient from js-tts-wrapper/browser

Notes
The upstream fix is in PR willwade/js-tts-wrapper#33. Once that is merged and published, package.json should be updated to reference the new npm version instead of the local path.

Test plan
- npm run build
- Once the js-tts-wrapper fix is published, update to the npm version

Closes #88