fix: GUI startup with external server + data refresh on server switch#319
fix: GUI startup with external server + data refresh on server switch#319
Conversation
Two fixes for issue #312: 1. GUI stuck on loading screen when backend is already running externally (e.g. via python/uvicorn/Docker): - Rust: add HTTP health check fallback when the process on the port doesn't have 'voicebox' in its name. If /health responds with a valid Voicebox response, reuse the server instead of erroring. - Frontend: when startServer() fails, fall back to polling the health endpoint every 2s instead of permanently blocking. 2. No data refresh when switching server URLs in settings: - serverStore.setServerUrl() now invalidates all React Query caches when the URL actually changes, so profiles/history/models/stories are re-fetched from the new server. - Export queryClient from main.tsx for store-level cache invalidation. Fixes #312
📝 Walkthrough🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/src/App.tsx (1)
121-143:⚠️ Potential issue | 🟠 MajorDon’t swallow real startup failures behind the polling fallback.
This
catchnow handles everystartServer()error. If the failure is a missing sidecar, signing issue, or real port conflict, the timeout only stops polling;serverReadystaysfalse, so the loading screen never exits. Please either gate this fallback to the “external server already running” case or set an explicit startup-error state when the 2-minute window expires.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/App.tsx` around lines 121 - 143, The catch block for startServer() currently always falls back to health-check polling and swallows real startup failures; update it to distinguish "external server already running" errors from genuine startup failures and set an explicit startup-error state if polling times out. Concretely: in the catch for startServer(), inspect the thrown error (from startServer()) and only start the apiClient.getHealth() polling when the error indicates an external/connection-refusal situation (e.g., ECONNREFUSED/connection refused or a well-known status); otherwise set serverStartingRef.current = false and set a new state like startupError (via setStartupError(true)) to surface the error. Additionally, when you do start polling, ensure the 2-minute timeout not only clears the interval but also sets startupError(true) (and serverStartingRef.current = false) so the UI can exit the loading state and show the failure; reference startServer(), serverStartingRef, window.__voiceboxServerStartedByApp, apiClient.getHealth(), setServerReady, and add setStartupError to implement this behavior.
🧹 Nitpick comments (1)
app/src/stores/serverStore.ts (1)
38-42: Move the sharedQueryClientinto a side-effect-free module.Importing
@/mainfrom a Zustand store couples cache invalidation to the React bootstrap entrypoint and is why this has to stay lazy. A tinyqueryClient.tsmodule would keep the same behavior without depending onReactDOM.createRoot(...).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/stores/serverStore.ts` around lines 38 - 42, The store's invalidateAllServerData currently lazy-imports '@/main' which couples the Zustand store to the React bootstrap; instead create a side-effect-free module (e.g., export a shared QueryClient instance from a new queryClient.ts) and use that directly: move the QueryClient creation into the new module (exporting a named queryClient), update invalidateAllServerData to import that queryClient and call queryClient.invalidateQueries(), and remove any dependency on '@/main' so invalidateAllServerData and the store remain side-effect-free and safe to import at module-evaluation time.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src/App.tsx`:
- Around line 131-137: The poll callback in App.tsx currently treats any
successful apiClient.getHealth() as a ready signal; change it to validate the
parsed health payload for Voicebox-specific fields (e.g., a version,
serviceName, or expected capability flag) before calling setServerReady(true)
and clearInterval(pollInterval). After awaiting apiClient.getHealth(), inspect
the returned JSON (from apiClient.getHealth), confirm the required keys/values
are present and match expected formats, only then log success and
setServerReady; otherwise log the mismatch and keep polling. Target symbols: the
setInterval handler using pollInterval, apiClient.getHealth(), and
setServerReady().
In `@tauri/src-tauri/src/main.rs`:
- Around line 73-75: Replace the fragile substring check on resp.text() with
proper JSON deserialization into the expected health shape: parse resp.text()
(or resp.json()) into the HealthResponse-like struct used by the backend
(referencing the HealthResponse shape in backend/models.py) and validate key
fields (e.g., status and any stable version/id fields) before accepting the port
as Voicebox; update the match arm that currently uses body.contains("status") to
attempt deserialization, handle errors by treating non-conforming responses as
invalid, and only return true when the parsed object matches the expected
fields/values.
---
Outside diff comments:
In `@app/src/App.tsx`:
- Around line 121-143: The catch block for startServer() currently always falls
back to health-check polling and swallows real startup failures; update it to
distinguish "external server already running" errors from genuine startup
failures and set an explicit startup-error state if polling times out.
Concretely: in the catch for startServer(), inspect the thrown error (from
startServer()) and only start the apiClient.getHealth() polling when the error
indicates an external/connection-refusal situation (e.g.,
ECONNREFUSED/connection refused or a well-known status); otherwise set
serverStartingRef.current = false and set a new state like startupError (via
setStartupError(true)) to surface the error. Additionally, when you do start
polling, ensure the 2-minute timeout not only clears the interval but also sets
startupError(true) (and serverStartingRef.current = false) so the UI can exit
the loading state and show the failure; reference startServer(),
serverStartingRef, window.__voiceboxServerStartedByApp, apiClient.getHealth(),
setServerReady, and add setStartupError to implement this behavior.
---
Nitpick comments:
In `@app/src/stores/serverStore.ts`:
- Around line 38-42: The store's invalidateAllServerData currently lazy-imports
'@/main' which couples the Zustand store to the React bootstrap; instead create
a side-effect-free module (e.g., export a shared QueryClient instance from a new
queryClient.ts) and use that directly: move the QueryClient creation into the
new module (exporting a named queryClient), update invalidateAllServerData to
import that queryClient and call queryClient.invalidateQueries(), and remove any
dependency on '@/main' so invalidateAllServerData and the store remain
side-effect-free and safe to import at module-evaluation time.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1593773e-6a50-49d2-9952-b6e289c49d11
📒 Files selected for processing (4)
app/src/App.tsxapp/src/main.tsxapp/src/stores/serverStore.tstauri/src-tauri/src/main.rs
failed generations, giving users a way to clean up failed entries without having to retry them first.
…ons-292 fix/allows deletion of failed generations 292
Add Kokoro-82M as a new TTS engine — 82M params, CPU realtime, 8 languages, Apache 2.0. Unlike cloning engines, Kokoro uses pre-built voice styles, which required a new profile type system to support non-cloning engines cleanly. Kokoro engine: - New kokoro_backend.py implementing TTSBackend protocol - 50 built-in voices across en/es/fr/hi/it/pt/ja/zh - KPipeline API with language-aware G2P routing via misaki - PyInstaller bundling for misaki, language_tags, espeakng_loader, en_core_web_sm Voice profile type system: - New voice_type column: 'cloned' | 'preset' | 'designed' (future) - Preset profiles store engine + voice ID instead of audio samples - default_engine field on profiles — auto-selects engine on profile pick - Create Voice dialog: toggle between 'Clone from audio' and 'Built-in voice' - Edit dialog shows preset voice info instead of sample list for preset profiles - Engine selector locks to preset engine when preset profile is selected - Profile grid filters by engine — shows Kokoro voices when Kokoro selected - Custom empty state when no preset profiles exist for selected engine Bug fixes: - Fix relative audio paths in DB causing 404s in production builds - config.set_data_dir() now resolves to absolute paths - Startup migration converts existing relative paths to absolute Also updates PROJECT_STATUS.md and tts-engines.mdx developer guide.
feat: Kokoro 82M TTS engine + voice profile type system
…ing, queryClient decoupling
- Rust: Replace fragile body.contains("status") with proper JSON
deserialization validating status=="healthy", model_loaded (bool),
and gpu_available (bool) to prevent misidentifying non-Voicebox services
- Frontend: Validate health response has Voicebox-specific fields before
marking server as ready during fallback polling
- Frontend: Discriminate port-in-use errors (poll for external server) from
real startup failures (missing sidecar, signing issues) — surface errors
immediately with a startupError state and Retry button in the UI
- Frontend: Set explicit startup-error state when 2-minute polling timeout
expires so the loading screen shows actionable feedback instead of hanging
- Architecture: Extract QueryClient to standalone side-effect-free module
(lib/queryClient.ts) to decouple serverStore from React bootstrap entrypoint
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/services/history.py (1)
281-291:⚠️ Potential issue | 🟠 MajorMirror version cleanup in the profile-wide delete path.
delete_generation()explicitly removes version artifacts first, butdelete_generations_by_profile()only unlinksgeneration.audio_path. Any derived version audio files survive profile deletion, and version-row cleanup is left to FK behavior instead of the explicit path used elsewhere.Suggested change
async def delete_generations_by_profile( profile_id: str, db: Session, ) -> int: @@ + from . import versions as versions_mod generations = db.query(DBGeneration).filter_by(profile_id=profile_id).all() count = 0 for generation in generations: + versions_mod.delete_versions_for_generation(generation.id, db) # Delete audio file audio_path = config.resolve_storage_path(generation.audio_path) if audio_path is not None and audio_path.exists(): audio_path.unlink()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/history.py` around lines 281 - 291, The profile-wide deletion currently only unlinks generation.audio_path and deletes generation rows, leaving derived version audio files and explicit version cleanup to cascade; update delete_generations_by_profile() to perform the same cleanup as delete_generation(): for each generation either call the existing delete_generation(...) helper for that generation or explicitly iterate generation.versions to resolve and unlink each version.audio_path (via config.resolve_storage_path) and delete each version row before deleting the parent generation, ensuring no derived audio files or version rows are left behind (referencing delete_generation, delete_generations_by_profile, generation.audio_path, and generation.versions).backend/services/export_import.py (1)
284-294:⚠️ Potential issue | 🟠 MajorPotential
AttributeErrorifresolve_storage_pathreturnsNone.At line 284,
v_pathcould beNoneif the audio path doesn't resolve. However, line 293 accessesv_path.nameunconditionally during manifest building, which occurs before the existence check at lines 320-321. This will raiseAttributeError: 'NoneType' object has no attribute 'name'.🐛 Proposed fix: Guard against None before accessing attributes
for v in versions: v_path = config.resolve_storage_path(v.audio_path) + if v_path is None: + continue # Skip versions with unresolvable paths effects_chain = None if v.effects_chain: effects_chain = json.loads(v.effects_chain) version_entries.append({ "id": v.id, "label": v.label, "is_default": v.is_default, "effects_chain": effects_chain, "filename": v_path.name, })Note: You may also want to log a warning when skipping a version with an unresolvable path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/export_import.py` around lines 284 - 294, When building version_entries, guard against resolve_storage_path returning None before accessing v_path.name: call config.resolve_storage_path(v.audio_path) into v_path, check if v_path is truthy (or not None) and only then append the dict to version_entries using v_path.name; if v_path is None, skip that version (and optionally log a warning mentioning v.id and v.audio_path). Also preserve the existing effects_chain parsing (json.loads on v.effects_chain) only for versions you will append, so move the effects_chain logic inside the guarded block to avoid unnecessary work or errors.
🧹 Nitpick comments (16)
docs/notes/PROJECT_STATUS.md (1)
207-207: Optional: Minor wording refinement.The static analysis tool suggests considering a stronger verb than "fix" in the commit message. However, "fix" is perfectly acceptable and follows conventional commit message standards for corrections.
If you prefer slightly more specific wording, you could use:
- "docs: correct README grammar"
- "docs: improve README grammar"
But the current wording is fine as-is.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/notes/PROJECT_STATUS.md` at line 207, The commit entry for **#230** currently uses the verb "fix" ("docs: fix README grammar"); if you want a slightly stronger or more specific verb, update that line in PROJECT_STATUS.md to use "docs: correct README grammar" or "docs: improve README grammar" (or leave as-is since "fix" is acceptable); locate the table row containing **#230** and replace the commit message cell with the preferred phrasing.docs/plans/API_REFACTOR_PLAN.md (3)
422-422: Optional: Simplify phrasing."All of the following" can be shortened to "all the following" for conciseness.
✏️ Minor edit
-The refactor can be considered complete when all of the following are true: +The refactor can be considered complete when all the following are true:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/plans/API_REFACTOR_PLAN.md` at line 422, Replace the phrase "The refactor can be considered complete when all of the following are true:" with the more concise "The refactor can be considered complete when all the following are true:" by editing the sentence that currently reads exactly "The refactor can be considered complete when all of the following are true:" in the API_REFACTOR_PLAN.md content.
240-242: Unresolved design decision on sample resource nesting.Line 240 presents two options for sample endpoints without indicating which will be chosen:
- Nested:
GET /profiles/{profile_id}/samples/{sample_id}- Top-level:
GET /samples/{sample_id}This is likely intentional for a planning document, but ensure this decision is resolved before Phase 4 implementation begins.
💡 Suggestion: Document decision criteria
Consider adding a brief note about the trade-offs:
- Nested resources enforce profile ownership in the URL structure
- Top-level resources simplify cross-profile sample operations
-`GET /profiles/{profile_id}/samples/{sample_id}` or `GET /samples/{sample_id}` as a consciously chosen model +`GET /profiles/{profile_id}/samples/{sample_id}` (preferred if samples are tightly coupled to profiles) or `GET /samples/{sample_id}` (if samples can be accessed independently)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/plans/API_REFACTOR_PLAN.md` around lines 240 - 242, The plan currently leaves the sample endpoint nesting undecided between GET /profiles/{profile_id}/samples/{sample_id} and GET /samples/{sample_id}; before Phase 4, choose and document the decision and criteria: update API_REFACTOR_PLAN.md to state the selected pattern (nested or top-level), add a short rationale referencing trade-offs (ownership enforcement vs cross-profile simplicity), and list the chosen aliases mapping (e.g., PUT/DELETE as aliases) so implementers know which handlers/routing (samples resource vs profile-samples resource) to implement.
347-352: Minor: Improve list item variety.Three consecutive bullets begin with "Add" (lines 349-351), which reduces readability slightly.
📝 Suggested rewording
- Add route equivalence tests for old and new paths. - Add schema snapshot tests for OpenAPI generation. -- Add response-shape tests for common mutations and async workflows. -- Add contract tests for `202 Accepted` flows. +- Verify response shapes for common mutations and async workflows. +- Cover `202 Accepted` flows with contract tests.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/plans/API_REFACTOR_PLAN.md` around lines 347 - 352, The three consecutive bullets in the plan all start with "Add" which hurts readability; update the verbs to vary phrasing by keeping the first bullet as-is for "route equivalence tests for old and new paths" and reword the others to use different verbs (for example: "Capture schema snapshots for OpenAPI generation" instead of "Add schema snapshot tests", "Validate response shapes for common mutations and async workflows" instead of "Add response-shape tests", and "Cover `202 Accepted` flows with contract tests" instead of "Add contract tests for `202 Accepted` flows") so the list reads more smoothly while preserving the same intents.backend/database/seed.py (1)
29-36: Write the normalized path into the backfilled version row.
resolved_audio_pathis already the canonical file location. Reusinggen.audio_pathhere preserves whatever legacy format is currently stored onGeneration, so newly createdGenerationVersionrows can still end up mixed with the migrated data.Suggested change
version = GenerationVersion( id=str(uuid.uuid4()), generation_id=gen.id, label="clean", - audio_path=gen.audio_path, + audio_path=config.to_storage_path(resolved_audio_path), effects_chain=None, is_default=True, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/database/seed.py` around lines 29 - 36, The GenerationVersion being created uses the legacy gen.audio_path instead of the canonical path; update the GenerationVersion construction (in the seed routine where resolved_audio_path is computed via config.resolve_storage_path(gen.audio_path)) to write the normalized resolved_audio_path into the new row (e.g., convert resolved_audio_path to a string) instead of gen.audio_path so backfilled GenerationVersion.audio_path contains the migrated/canonical path.app/src/components/Generation/GenerationForm.tsx (1)
30-35: Centralize the engine → default selector mapping.This hard-codes the default model size for
qwen,qwen_custom_voice, andtadain a second place. If the selector defaults change later, selecting a profile here will silently force stale values. Prefer keeping that mapping next toapplyEngineSelection()so both paths share one source of truth.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/Generation/GenerationForm.tsx` around lines 30 - 35, The getEngineSelectValue function duplicates the engine→default model-size mapping currently defined elsewhere (applyEngineSelection); centralize that mapping by extracting a single shared map/constant (e.g., ENGINE_DEFAULT_MAP) and use it from both getEngineSelectValue and applyEngineSelection so changes apply in one place; update getEngineSelectValue to look up engine in ENGINE_DEFAULT_MAP and fall back to the raw engine string, and update applyEngineSelection to reference the same ENGINE_DEFAULT_MAP instead of a hard-coded list.backend/routes/generations.py (2)
243-246: Use exception chaining here as well.Same issue as line 46: the exception should be chained with
from e.♻️ Add exception chaining
try: profiles.validate_profile_engine(profile, engine) except ValueError as e: - raise HTTPException(status_code=400, detail=str(e)) + raise HTTPException(status_code=400, detail=str(e)) from e🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/generations.py` around lines 243 - 246, The except block around profiles.validate_profile_engine(profile, engine) should chain the original ValueError into the raised HTTPException; change the raise HTTPException(...) to include "from e" so the HTTPException is raised as "raise HTTPException(status_code=400, detail=str(e)) from e", preserving the original traceback for debugging while keeping the same status_code and detail.
43-46: Use exception chaining for better debugging.Per static analysis (Ruff B904): When re-raising an exception within an
exceptclause, chain it withfrom eto preserve the original traceback.♻️ Add exception chaining
try: profiles.validate_profile_engine(profile, engine) except ValueError as e: - raise HTTPException(status_code=400, detail=str(e)) + raise HTTPException(status_code=400, detail=str(e)) from e🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/generations.py` around lines 43 - 46, In the except block that catches ValueError from profiles.validate_profile_engine(profile, engine), preserve the original traceback by chaining the exception when re-raising the HTTPException; i.e., change the raise HTTPException(status_code=400, detail=str(e)) to raise HTTPException(status_code=400, detail=str(e)) from e so the original ValueError traceback is retained for debugging.backend/voicebox-server.spec (1)
8-8: Consider splitting thehiddenimportslist for maintainability.The single-line
hiddenimportslist is now extremely long, making it difficult to review, diff, or maintain. Consider splitting into multiple lines or grouping by category.♻️ Example multi-line format
hiddenimports = [ # Core backend 'backend', 'backend.main', 'backend.config', 'backend.database', 'backend.models', # Services 'backend.services.profiles', 'backend.services.history', 'backend.services.tts', # Backends 'backend.backends', 'backend.backends.pytorch_backend', 'backend.backends.qwen_custom_voice_backend', 'backend.backends.kokoro_backend', 'backend.backends.chatterbox_backend', # ... etc ]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/voicebox-server.spec` at line 8, The hiddenimports assignment is a single very long line; split the hiddenimports variable into a multi-line list literal (hiddenimports = [ ... ]) and group entries into logical categories (e.g., "Core backend", "Services", "Backends", "3rd-party libs") while preserving every string value and original order, add trailing commas for each entry, and keep the variable name hiddenimports unchanged so import hooks still see the same entries.app/src/lib/hooks/useGenerationForm.ts (2)
112-120: Deeply nested ternaries reduce readability.The
displayNamederivation now has 7 levels of nested ternary operators. While functionally correct, this is hard to maintain. Consider extracting to a lookup object or helper function.♻️ Example refactor using a lookup
const ENGINE_DISPLAY_NAMES: Record<string, string | ((size?: string) => string)> = { luxtts: 'LuxTTS', chatterbox: 'Chatterbox TTS', chatterbox_turbo: 'Chatterbox Turbo', tada: (size) => (size === '3B' ? 'TADA 3B Multilingual' : 'TADA 1B'), kokoro: 'Kokoro 82M', qwen_custom_voice: (size) => (size === '1.7B' ? 'Qwen CustomVoice 1.7B' : 'Qwen CustomVoice 0.6B'), qwen: (size) => (size === '1.7B' ? 'Qwen TTS 1.7B' : 'Qwen TTS 0.6B'), }; const displayNameEntry = ENGINE_DISPLAY_NAMES[engine] ?? ENGINE_DISPLAY_NAMES.qwen; const displayName = typeof displayNameEntry === 'function' ? displayNameEntry(data.modelSize) : displayNameEntry;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/lib/hooks/useGenerationForm.ts` around lines 112 - 120, The nested ternary used to compute displayName in useGenerationForm.ts is hard to read and maintain; replace it with a lookup mapping or small helper function keyed by engine that returns either a string or a size-aware function, then compute displayName by resolving ENGINE_DISPLAY_NAMES[engine] (falling back to the qwen entry) and invoking it with data.modelSize when needed; update the code around the existing engine and data.modelSize usage to use this new lookup/helper and remove the deeply nested ternaries.
96-100: Consider guarding against undefinedmodelSizeforqwen_custom_voice.When
engine === 'qwen_custom_voice', the model name is constructed as`qwen-custom-voice-${data.modelSize}`. IfmodelSizeisundefined, this produces"qwen-custom-voice-undefined". While the form defaults to'1.7B', if the form is programmatically manipulated or defaults change, this could cause issues.♻️ Defensive fallback
: engine === 'qwen_custom_voice' - ? `qwen-custom-voice-${data.modelSize}` + ? `qwen-custom-voice-${data.modelSize || '1.7B'}` : `qwen-tts-${data.modelSize}`;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/lib/hooks/useGenerationForm.ts` around lines 96 - 100, The model name construction in useGenerationForm (the ternary that builds `qwen-custom-voice-${data.modelSize}` when engine === 'qwen_custom_voice') can produce "qwen-custom-voice-undefined"; update that branch to guard against an undefined `data.modelSize` by using a fallback (e.g. the current default '1.7B' or a shared DEFAULT_MODEL_SIZE constant) so the result is `qwen-custom-voice-<fallback>` when `data.modelSize` is falsy or undefined; ensure the same defensive fallback is applied only for the `qwen_custom_voice` branch and leave other branches unchanged.backend/models.py (1)
18-22: Consider adding pattern validation topreset_engineanddefault_engine.These fields accept any string up to 50 characters, but
GenerationRequest.engine(line 81) restricts engines to a specific set. If a profile is created with an invaliddefault_engine, it could cause runtime issues when the engine is resolved in_resolve_generation_engine.♻️ Add pattern validation to match allowed engines
- preset_engine: Optional[str] = Field(None, max_length=50) + preset_engine: Optional[str] = Field(None, pattern="^(qwen|qwen_custom_voice|luxtts|chatterbox|chatterbox_turbo|tada|kokoro)$") preset_voice_id: Optional[str] = Field(None, max_length=100) design_prompt: Optional[str] = Field(None, max_length=2000) - default_engine: Optional[str] = Field(None, max_length=50) + default_engine: Optional[str] = Field(None, pattern="^(qwen|qwen_custom_voice|luxtts|chatterbox|chatterbox_turbo|tada|kokoro)$")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/models.py` around lines 18 - 22, Add the same allowed-engine pattern validation to the Profile model's preset_engine and default_engine Fields so they reject invalid engine names at creation: update the Field(...) for preset_engine and default_engine (symbols: preset_engine, default_engine) to include a pattern that matches the same allowed values used by GenerationRequest.engine, ensuring validation happens before _resolve_generation_engine runs; keep max_length and optional behavior and ensure the pattern string mirrors the GenerationRequest.engine allowed set exactly.app/src/App.tsx (1)
171-194: Clear the timeout when polling succeeds to avoid orphaned callbacks.When the health check succeeds and
clearInterval(pollInterval)is called, the 2-minute timeout (setTimeoutat line 187) remains scheduled. While it won't cause visible issues (the interval is already cleared), it's cleaner to cancel it.♻️ Track and clear the timeout
console.log('Falling back to health-check polling...'); + let pollTimeoutId: ReturnType<typeof setTimeout>; const pollInterval = setInterval(async () => { try { const health = await apiClient.getHealth(); if (!isVoiceboxHealthResponse(health)) { console.log('Health response is not from a Voicebox server, keep polling...'); return; } console.log('External Voicebox server detected via health check'); clearInterval(pollInterval); + clearTimeout(pollTimeoutId); setServerReady(true); } catch { // Server not ready yet, keep polling } }, 2000); // Stop polling after 2 minutes and surface the failure - setTimeout(() => { + pollTimeoutId = setTimeout(() => { clearInterval(pollInterval); serverStartingRef.current = false; setStartupError( 'Could not connect to a Voicebox server within 2 minutes. ' + 'Please check that the server is running and try again.', ); }, 120_000);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/App.tsx` around lines 171 - 194, The 2-minute timeout started with setTimeout is never cleared when polling succeeds; store its return value in a variable (e.g., startupTimeout) and call clearTimeout(startupTimeout) in the success path where you already call clearInterval(pollInterval) and setServerReady(true). Update the existing timeout creation to assign it to startupTimeout and add clearTimeout(startupTimeout) alongside clearInterval(pollInterval) inside the async poll success branch (the block referencing pollInterval and setServerReady).backend/services/profiles.py (1)
521-535: Minor duplication: preset validation repeated aftervalidate_profile_engine.
validate_profile_engine(line 521) already validates preset profiles (checkingpreset_engineandpreset_voice_idexist and match the requested engine). The checks at lines 525-530 repeat this validation.♻️ Simplified version
validate_profile_engine(profile, engine) # ── Preset profiles: return engine-specific voice reference ── if voice_type == "preset": - if not profile.preset_engine or not profile.preset_voice_id: - raise ValueError(f"Preset profile {profile_id} is missing preset engine metadata") - if profile.preset_engine != engine: - raise ValueError( - f"Preset profile {profile_id} only supports engine '{profile.preset_engine}', not '{engine}'" - ) + # validate_profile_engine already verified preset_engine/preset_voice_id exist and match return { "voice_type": "preset", "preset_engine": profile.preset_engine, "preset_voice_id": profile.preset_voice_id, }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/profiles.py` around lines 521 - 535, The preset validation immediately after calling validate_profile_engine is duplicated; remove the redundant checks that re-verify profile.preset_engine and profile.preset_voice_id and the engine mismatch check, and simply rely on validate_profile_engine(profile, engine) having performed those validations; keep returning the existing dict with keys "voice_type", "preset_engine", and "preset_voice_id" when voice_type == "preset". Ensure you reference the same symbols (validate_profile_engine, voice_type, profile.preset_engine, profile.preset_voice_id) so the function remains concise and correct.app/src/components/Generation/EngineModelSelector.tsx (1)
124-128: Consider removingformfrom dependency array or wrapping inuseCallback.Including
formin the dependency array is typically safe sinceuseFormreturns a stable reference, butapplyEngineSelectionis called withforminside the effect. If the component re-renders frequently, this could cause unnecessary effect executions. Consider memoizing or extracting the selection logic.♻️ Suggested improvement
useEffect(() => { if (!currentEngineAvailable && availableOptions.length > 0) { applyEngineSelection(form, availableOptions[0].value); } - }, [availableOptions, currentEngineAvailable, form]); + }, [availableOptions, currentEngineAvailable]); // form is stable from useForm🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/Generation/EngineModelSelector.tsx` around lines 124 - 128, The effect calling applyEngineSelection is re-running unnecessarily because form is included in the dependency array; either remove form from the dependency list (keeping [availableOptions, currentEngineAvailable]) if form is a stable reference from useForm, or memoize the selection callback by wrapping applyEngineSelection (or the inline selection logic) in useCallback so its identity is stable; update references to useEffect([availableOptions, currentEngineAvailable]) or useCallback for applyEngineSelection to prevent redundant effect executions while still ensuring availableOptions[0].value is applied when currentEngineAvailable becomes false.backend/backends/kokoro_backend.py (1)
281-283: Add logging when returning silence fallback.Returning 1 second of silence when no audio chunks are produced indicates a problem (empty text, pipeline failure, etc.). Adding a warning log would help diagnose issues without masking failures silently.
🔊 Suggested improvement
if not audio_chunks: # Return 1 second of silence as fallback + logger.warning("Kokoro produced no audio chunks for text: %s", text[:100]) return np.zeros(KOKORO_SAMPLE_RATE, dtype=np.float32), KOKORO_SAMPLE_RATE🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/kokoro_backend.py` around lines 281 - 283, Add a warning log before returning the 1-second silence fallback so failures aren't silent: inside the block that checks "if not audio_chunks" (in backend/backends/kokoro_backend.py where audio_chunks is evaluated), call the module or class logger (e.g., logger.warning or self.logger.warning depending on the file's logging pattern) with a brief message including context (e.g., "No audio_chunks produced, returning 1s silence") and any relevant identifiers (such as KOKORO_SAMPLE_RATE or input text id) and then return the np.zeros(...) and KOKORO_SAMPLE_RATE as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src/components/Generation/FloatingGenerateBox.tsx`:
- Around line 132-143: The type assertion for selectedProfile?.default_engine
used when auto-switching the engine (the form.setValue('engine',
selectedProfile.default_engine as ... ) in FloatingGenerateBox) omits
'qwen_custom_voice'; update that union type to include 'qwen_custom_voice' so
the cast matches backend values (i.e., add 'qwen_custom_voice' to the union of
engine literal types used in the form.setValue call).
In `@app/src/components/Generation/GenerationForm.tsx`:
- Around line 43-56: The effect currently depends on the whole selectedProfile
object which causes reruns when the query refetches; update the dependency array
to depend only on selectedProfile?.id (and form) so the effect only runs when
the selected profile identity changes; keep the same logic inside the effect
(calls to form.setValue and applyEngineSelection/getEngineSelectValue) and
reference selectedProfile.id when reading preferred_engine/default_engine and
language to avoid using the full object as a dependency.
In `@app/src/components/VoiceProfiles/ProfileCard.tsx`:
- Around line 106-110: The preset badge is rendered for all profiles with
profile.voice_type === 'preset' even when profile.preset_engine is missing,
producing an empty badge; in ProfileCard update the conditional to also require
a non-empty profile.preset_engine (e.g. profile.voice_type === 'preset' &&
profile.preset_engine) before rendering the Badge and use
ENGINE_DISPLAY_NAMES[profile.preset_engine] ?? profile.preset_engine as the
label so legacy/partially migrated records without preset_engine do not show an
empty badge.
In `@app/src/components/VoiceProfiles/ProfileForm.tsx`:
- Around line 875-881: The code blindly casts voice.language to LanguageCode in
the onClick handler (setSelectedPresetVoiceId / form.setValue), which can inject
invalid values; instead validate voice.language against your canonical
LanguageCode set (e.g., an allowedLanguages array or an isValidLanguageCode
helper) and only call form.setValue('language', validatedValue) when it matches,
otherwise fall back to a safe default or omit setting the field; update the
onClick flow to perform this lookup/validation before setting the form value to
avoid invalid casting.
In `@backend/backends/qwen_custom_voice_backend.py`:
- Around line 186-207: The preset speaker value from
voice_prompt.get("preset_voice_id") may be invalid and the call to
self.model.generate_custom_voice can return an empty wavs list; in
_generate_sync validate the speaker before use (e.g., check that speaker is in
the allowed voices list or matches expected format and if not, set speaker =
QWEN_CV_DEFAULT_SPEAKER) and after calling self.model.generate_custom_voice
check that wavs is non-empty before using wavs[0] — if empty, raise a clear
exception or return a defined fallback (and include context like the speaker and
kwargs in the error). Ensure these checks live inside _generate_sync so
generate_custom_voice is always invoked with a valid speaker and its output is
safely handled.
In `@backend/config.py`:
- Around line 83-90: The current logic in the absolute-path branch (using
stored_path, rebased_path and _path_relative_to_any_data_dir) falls back to
returning the original absolute stored_path when rebasing fails, which can
expose files outside _data_dir; change this so that if rebased_path is None you
do NOT return stored_path even if it exists — instead log a warning/error and
raise an exception (e.g., ValueError or a custom SecurityError) or return None
to force explicit migration, ensuring only candidate paths under _data_dir
(constructed from _data_dir / rebased_path and checked with candidate.exists())
are accepted; update any callers to handle the new failure mode.
- Around line 25-37: The function _path_relative_to_any_data_dir currently
returns Path() when the input path is exactly "data", which can silently produce
a joined path equal to the data dir; change the behavior so that when no tail
exists (i.e., the path equals "data") the function returns None instead of
Path(), and update the docstring to state that a None return means "path refers
to the data directory itself" (also audit callers of
_path_relative_to_any_data_dir to handle None appropriately).
In `@backend/routes/profiles.py`:
- Around line 3-7: The module is missing the datetime import used by the PUT
/profiles/{profile_id}/effects handler (it calls datetime.utcnow()); add back a
direct import like "from datetime import datetime" into the top-level import
block in profiles.py so datetime.utcnow() resolves, keeping the existing imports
(e.g., json as _json) unchanged.
---
Outside diff comments:
In `@backend/services/export_import.py`:
- Around line 284-294: When building version_entries, guard against
resolve_storage_path returning None before accessing v_path.name: call
config.resolve_storage_path(v.audio_path) into v_path, check if v_path is truthy
(or not None) and only then append the dict to version_entries using
v_path.name; if v_path is None, skip that version (and optionally log a warning
mentioning v.id and v.audio_path). Also preserve the existing effects_chain
parsing (json.loads on v.effects_chain) only for versions you will append, so
move the effects_chain logic inside the guarded block to avoid unnecessary work
or errors.
In `@backend/services/history.py`:
- Around line 281-291: The profile-wide deletion currently only unlinks
generation.audio_path and deletes generation rows, leaving derived version audio
files and explicit version cleanup to cascade; update
delete_generations_by_profile() to perform the same cleanup as
delete_generation(): for each generation either call the existing
delete_generation(...) helper for that generation or explicitly iterate
generation.versions to resolve and unlink each version.audio_path (via
config.resolve_storage_path) and delete each version row before deleting the
parent generation, ensuring no derived audio files or version rows are left
behind (referencing delete_generation, delete_generations_by_profile,
generation.audio_path, and generation.versions).
---
Nitpick comments:
In `@app/src/App.tsx`:
- Around line 171-194: The 2-minute timeout started with setTimeout is never
cleared when polling succeeds; store its return value in a variable (e.g.,
startupTimeout) and call clearTimeout(startupTimeout) in the success path where
you already call clearInterval(pollInterval) and setServerReady(true). Update
the existing timeout creation to assign it to startupTimeout and add
clearTimeout(startupTimeout) alongside clearInterval(pollInterval) inside the
async poll success branch (the block referencing pollInterval and
setServerReady).
In `@app/src/components/Generation/EngineModelSelector.tsx`:
- Around line 124-128: The effect calling applyEngineSelection is re-running
unnecessarily because form is included in the dependency array; either remove
form from the dependency list (keeping [availableOptions,
currentEngineAvailable]) if form is a stable reference from useForm, or memoize
the selection callback by wrapping applyEngineSelection (or the inline selection
logic) in useCallback so its identity is stable; update references to
useEffect([availableOptions, currentEngineAvailable]) or useCallback for
applyEngineSelection to prevent redundant effect executions while still ensuring
availableOptions[0].value is applied when currentEngineAvailable becomes false.
In `@app/src/components/Generation/GenerationForm.tsx`:
- Around line 30-35: The getEngineSelectValue function duplicates the
engine→default model-size mapping currently defined elsewhere
(applyEngineSelection); centralize that mapping by extracting a single shared
map/constant (e.g., ENGINE_DEFAULT_MAP) and use it from both
getEngineSelectValue and applyEngineSelection so changes apply in one place;
update getEngineSelectValue to look up engine in ENGINE_DEFAULT_MAP and fall
back to the raw engine string, and update applyEngineSelection to reference the
same ENGINE_DEFAULT_MAP instead of a hard-coded list.
In `@app/src/lib/hooks/useGenerationForm.ts`:
- Around line 112-120: The nested ternary used to compute displayName in
useGenerationForm.ts is hard to read and maintain; replace it with a lookup
mapping or small helper function keyed by engine that returns either a string or
a size-aware function, then compute displayName by resolving
ENGINE_DISPLAY_NAMES[engine] (falling back to the qwen entry) and invoking it
with data.modelSize when needed; update the code around the existing engine and
data.modelSize usage to use this new lookup/helper and remove the deeply nested
ternaries.
- Around line 96-100: The model name construction in useGenerationForm (the
ternary that builds `qwen-custom-voice-${data.modelSize}` when engine ===
'qwen_custom_voice') can produce "qwen-custom-voice-undefined"; update that
branch to guard against an undefined `data.modelSize` by using a fallback (e.g.
the current default '1.7B' or a shared DEFAULT_MODEL_SIZE constant) so the
result is `qwen-custom-voice-<fallback>` when `data.modelSize` is falsy or
undefined; ensure the same defensive fallback is applied only for the
`qwen_custom_voice` branch and leave other branches unchanged.
In `@backend/backends/kokoro_backend.py`:
- Around line 281-283: Add a warning log before returning the 1-second silence
fallback so failures aren't silent: inside the block that checks "if not
audio_chunks" (in backend/backends/kokoro_backend.py where audio_chunks is
evaluated), call the module or class logger (e.g., logger.warning or
self.logger.warning depending on the file's logging pattern) with a brief
message including context (e.g., "No audio_chunks produced, returning 1s
silence") and any relevant identifiers (such as KOKORO_SAMPLE_RATE or input text
id) and then return the np.zeros(...) and KOKORO_SAMPLE_RATE as before.
In `@backend/database/seed.py`:
- Around line 29-36: The GenerationVersion being created uses the legacy
gen.audio_path instead of the canonical path; update the GenerationVersion
construction (in the seed routine where resolved_audio_path is computed via
config.resolve_storage_path(gen.audio_path)) to write the normalized
resolved_audio_path into the new row (e.g., convert resolved_audio_path to a
string) instead of gen.audio_path so backfilled GenerationVersion.audio_path
contains the migrated/canonical path.
In `@backend/models.py`:
- Around line 18-22: Add the same allowed-engine pattern validation to the
Profile model's preset_engine and default_engine Fields so they reject invalid
engine names at creation: update the Field(...) for preset_engine and
default_engine (symbols: preset_engine, default_engine) to include a pattern
that matches the same allowed values used by GenerationRequest.engine, ensuring
validation happens before _resolve_generation_engine runs; keep max_length and
optional behavior and ensure the pattern string mirrors the
GenerationRequest.engine allowed set exactly.
In `@backend/routes/generations.py`:
- Around line 243-246: The except block around
profiles.validate_profile_engine(profile, engine) should chain the original
ValueError into the raised HTTPException; change the raise HTTPException(...) to
include "from e" so the HTTPException is raised as "raise
HTTPException(status_code=400, detail=str(e)) from e", preserving the original
traceback for debugging while keeping the same status_code and detail.
- Around line 43-46: In the except block that catches ValueError from
profiles.validate_profile_engine(profile, engine), preserve the original
traceback by chaining the exception when re-raising the HTTPException; i.e.,
change the raise HTTPException(status_code=400, detail=str(e)) to raise
HTTPException(status_code=400, detail=str(e)) from e so the original ValueError
traceback is retained for debugging.
In `@backend/services/profiles.py`:
- Around line 521-535: The preset validation immediately after calling
validate_profile_engine is duplicated; remove the redundant checks that
re-verify profile.preset_engine and profile.preset_voice_id and the engine
mismatch check, and simply rely on validate_profile_engine(profile, engine)
having performed those validations; keep returning the existing dict with keys
"voice_type", "preset_engine", and "preset_voice_id" when voice_type ==
"preset". Ensure you reference the same symbols (validate_profile_engine,
voice_type, profile.preset_engine, profile.preset_voice_id) so the function
remains concise and correct.
In `@backend/voicebox-server.spec`:
- Line 8: The hiddenimports assignment is a single very long line; split the
hiddenimports variable into a multi-line list literal (hiddenimports = [ ... ])
and group entries into logical categories (e.g., "Core backend", "Services",
"Backends", "3rd-party libs") while preserving every string value and original
order, add trailing commas for each entry, and keep the variable name
hiddenimports unchanged so import hooks still see the same entries.
In `@docs/notes/PROJECT_STATUS.md`:
- Line 207: The commit entry for **#230** currently uses the verb "fix" ("docs:
fix README grammar"); if you want a slightly stronger or more specific verb,
update that line in PROJECT_STATUS.md to use "docs: correct README grammar" or
"docs: improve README grammar" (or leave as-is since "fix" is acceptable);
locate the table row containing **#230** and replace the commit message cell
with the preferred phrasing.
In `@docs/plans/API_REFACTOR_PLAN.md`:
- Line 422: Replace the phrase "The refactor can be considered complete when all
of the following are true:" with the more concise "The refactor can be
considered complete when all the following are true:" by editing the sentence
that currently reads exactly "The refactor can be considered complete when all
of the following are true:" in the API_REFACTOR_PLAN.md content.
- Around line 240-242: The plan currently leaves the sample endpoint nesting
undecided between GET /profiles/{profile_id}/samples/{sample_id} and GET
/samples/{sample_id}; before Phase 4, choose and document the decision and
criteria: update API_REFACTOR_PLAN.md to state the selected pattern (nested or
top-level), add a short rationale referencing trade-offs (ownership enforcement
vs cross-profile simplicity), and list the chosen aliases mapping (e.g.,
PUT/DELETE as aliases) so implementers know which handlers/routing (samples
resource vs profile-samples resource) to implement.
- Around line 347-352: The three consecutive bullets in the plan all start with
"Add" which hurts readability; update the verbs to vary phrasing by keeping the
first bullet as-is for "route equivalence tests for old and new paths" and
reword the others to use different verbs (for example: "Capture schema snapshots
for OpenAPI generation" instead of "Add schema snapshot tests", "Validate
response shapes for common mutations and async workflows" instead of "Add
response-shape tests", and "Cover `202 Accepted` flows with contract tests"
instead of "Add contract tests for `202 Accepted` flows") so the list reads more
smoothly while preserving the same intents.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ac5bb17e-34f7-41c7-8c4b-08b461caaca1
⛔ Files ignored due to path filters (1)
tauri/src-tauri/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (43)
app/src/App.tsxapp/src/components/Generation/EngineModelSelector.tsxapp/src/components/Generation/FloatingGenerateBox.tsxapp/src/components/Generation/GenerationForm.tsxapp/src/components/History/HistoryTable.tsxapp/src/components/ServerSettings/ModelManagement.tsxapp/src/components/VoiceProfiles/ProfileCard.tsxapp/src/components/VoiceProfiles/ProfileForm.tsxapp/src/components/VoiceProfiles/ProfileList.tsxapp/src/lib/api/client.tsapp/src/lib/api/types.tsapp/src/lib/constants/languages.tsapp/src/lib/hooks/useGenerationForm.tsapp/src/lib/queryClient.tsapp/src/main.tsxapp/src/stores/serverStore.tsapp/src/stores/uiStore.tsbackend/backends/__init__.pybackend/backends/kokoro_backend.pybackend/backends/qwen_custom_voice_backend.pybackend/build_binary.pybackend/config.pybackend/database/migrations.pybackend/database/models.pybackend/database/seed.pybackend/models.pybackend/requirements.txtbackend/routes/audio.pybackend/routes/effects.pybackend/routes/generations.pybackend/routes/history.pybackend/routes/profiles.pybackend/services/export_import.pybackend/services/generation.pybackend/services/history.pybackend/services/profiles.pybackend/services/stories.pybackend/services/versions.pybackend/voicebox-server.specdocs/content/docs/developer/tts-engines.mdxdocs/notes/PROJECT_STATUS.mddocs/plans/API_REFACTOR_PLAN.mdtauri/src-tauri/src/main.rs
✅ Files skipped from review due to trivial changes (1)
- docs/content/docs/developer/tts-engines.mdx
🚧 Files skipped from review as they are similar to previous changes (2)
- app/src/stores/serverStore.ts
- app/src/main.tsx
| // Auto-switch engine if profile has a default | ||
| if (selectedProfile?.default_engine) { | ||
| form.setValue( | ||
| 'engine', | ||
| selectedProfile.default_engine as | ||
| | 'qwen' | ||
| | 'luxtts' | ||
| | 'chatterbox' | ||
| | 'chatterbox_turbo' | ||
| | 'tada' | ||
| | 'kokoro', | ||
| ); |
There was a problem hiding this comment.
Missing qwen_custom_voice in the engine type cast.
The type assertion on lines 136-142 omits qwen_custom_voice, which is a valid default_engine value per the backend model. While this won't cause runtime errors, it creates a type inconsistency.
🐛 Add the missing engine type
form.setValue(
'engine',
selectedProfile.default_engine as
| 'qwen'
+ | 'qwen_custom_voice'
| 'luxtts'
| 'chatterbox'
| 'chatterbox_turbo'
| 'tada'
| 'kokoro',
);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src/components/Generation/FloatingGenerateBox.tsx` around lines 132 -
143, The type assertion for selectedProfile?.default_engine used when
auto-switching the engine (the form.setValue('engine',
selectedProfile.default_engine as ... ) in FloatingGenerateBox) omits
'qwen_custom_voice'; update that union type to include 'qwen_custom_voice' so
the cast matches backend values (i.e., add 'qwen_custom_voice' to the union of
engine literal types used in the form.setValue call).
| useEffect(() => { | ||
| if (!selectedProfile) { | ||
| return; | ||
| } | ||
|
|
||
| if (selectedProfile.language) { | ||
| form.setValue('language', selectedProfile.language as LanguageCode); | ||
| } | ||
|
|
||
| const preferredEngine = selectedProfile.default_engine || selectedProfile.preset_engine; | ||
| if (preferredEngine) { | ||
| applyEngineSelection(form, getEngineSelectValue(preferredEngine)); | ||
| } | ||
| }, [form, selectedProfile]); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's locate and read the target file
find . -type f -name "GenerationForm.tsx" | head -5Repository: jamiepine/voicebox
Length of output: 112
🏁 Script executed:
# Get the file size to determine how to read it
wc -l app/src/components/Generation/GenerationForm.tsx 2>/dev/null || echo "File not found in expected location"Repository: jamiepine/voicebox
Length of output: 114
🏁 Script executed:
# Search for the file more broadly if needed
fd -type f -name "*GenerationForm*"Repository: jamiepine/voicebox
Length of output: 232
🏁 Script executed:
# Read the entire file to understand the full context
cat -n app/src/components/Generation/GenerationForm.tsxRepository: jamiepine/voicebox
Length of output: 9662
🏁 Script executed:
# Find and examine the useProfile hook
find . -type f -name "*useProfiles*" -o -name "*useProfile*" | grep -E "\.(ts|tsx)$"Repository: jamiepine/voicebox
Length of output: 96
🏁 Script executed:
# Also search for the hook import location
rg "export.*useProfile" --type ts --type tsxRepository: jamiepine/voicebox
Length of output: 89
🏁 Script executed:
# Read the useProfiles hook
cat -n app/src/lib/hooks/useProfiles.tsRepository: jamiepine/voicebox
Length of output: 6294
🏁 Script executed:
# Search for query invalidation patterns related to profiles
rg "invalidate.*profile|profile.*invalidate" -i --no-headingRepository: jamiepine/voicebox
Length of output: 1555
🏁 Script executed:
# Look for context where server operations might trigger invalidations
# Search for any "server switches" or similar operations mentioned in comments
rg "server.*switch|switch.*server" -i -B 2 -A 2Repository: jamiepine/voicebox
Length of output: 822
🏁 Script executed:
# Check if there are any form state interactions that might be affected
# Search for form.setValue calls to understand form manipulation patterns
rg "form\.setValue|applyEngineSelection" -B 1 -A 1Repository: jamiepine/voicebox
Length of output: 8047
Change effect dependency to selectedProfile?.id to prevent resetting form on query refetches.
The effect depends on the full selectedProfile object, which becomes a new reference whenever the query refetches after invalidation (e.g., during server operations). This causes the effect to rerun unnecessarily, calling form.setValue and applyEngineSelection and overwriting the user's in-progress changes.
🛠 Proposed fix
-import { useEffect } from 'react';
+import { useEffect, useRef } from 'react';
…
export function GenerationForm() {
const selectedProfileId = useUIStore((state) => state.selectedProfileId);
const { data: selectedProfile } = useProfile(selectedProfileId || '');
+ const syncedProfileIdRef = useRef<string | null>(null);
const { form, handleSubmit, isPending } = useGenerationForm();
useEffect(() => {
if (!selectedProfile) {
+ syncedProfileIdRef.current = null;
return;
}
+ if (syncedProfileIdRef.current === selectedProfile.id) {
+ return;
+ }
+ syncedProfileIdRef.current = selectedProfile.id;
if (selectedProfile.language) {
form.setValue('language', selectedProfile.language as LanguageCode);
@@
if (preferredEngine) {
applyEngineSelection(form, getEngineSelectValue(preferredEngine));
}
- }, [form, selectedProfile]);
+ }, [form, selectedProfile?.id]);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src/components/Generation/GenerationForm.tsx` around lines 43 - 56, The
effect currently depends on the whole selectedProfile object which causes reruns
when the query refetches; update the dependency array to depend only on
selectedProfile?.id (and form) so the effect only runs when the selected profile
identity changes; keep the same logic inside the effect (calls to form.setValue
and applyEngineSelection/getEngineSelectValue) and reference selectedProfile.id
when reading preferred_engine/default_engine and language to avoid using the
full object as a dependency.
| {profile.voice_type === 'preset' && ( | ||
| <Badge variant="secondary" className="text-xs h-5 px-1.5"> | ||
| {ENGINE_DISPLAY_NAMES[profile.preset_engine ?? ''] ?? profile.preset_engine} | ||
| </Badge> | ||
| )} |
There was a problem hiding this comment.
Guard the preset badge on preset_engine.
This branch renders a secondary badge for every preset profile, even when preset_engine is missing. That leaves an empty badge for legacy or partially migrated preset records.
Suggested change
- {profile.voice_type === 'preset' && (
+ {profile.voice_type === 'preset' && profile.preset_engine && (
<Badge variant="secondary" className="text-xs h-5 px-1.5">
- {ENGINE_DISPLAY_NAMES[profile.preset_engine ?? ''] ?? profile.preset_engine}
+ {ENGINE_DISPLAY_NAMES[profile.preset_engine] ?? profile.preset_engine}
</Badge>
)}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src/components/VoiceProfiles/ProfileCard.tsx` around lines 106 - 110, The
preset badge is rendered for all profiles with profile.voice_type === 'preset'
even when profile.preset_engine is missing, producing an empty badge; in
ProfileCard update the conditional to also require a non-empty
profile.preset_engine (e.g. profile.voice_type === 'preset' &&
profile.preset_engine) before rendering the Badge and use
ENGINE_DISPLAY_NAMES[profile.preset_engine] ?? profile.preset_engine as the
label so legacy/partially migrated records without preset_engine do not show an
empty badge.
| onClick={() => { | ||
| setSelectedPresetVoiceId(voice.voice_id); | ||
| // Auto-set language from voice | ||
| if (voice.language) { | ||
| form.setValue('language', voice.language as LanguageCode); | ||
| } | ||
| }} |
There was a problem hiding this comment.
Type cast voice.language as LanguageCode may fail validation.
If voice.language from the API doesn't match the LanguageCode type definition exactly, this cast could set an invalid value. Consider adding validation or using a lookup.
🛡️ Suggested safer approach
onClick={() => {
setSelectedPresetVoiceId(voice.voice_id);
// Auto-set language from voice
- if (voice.language) {
- form.setValue('language', voice.language as LanguageCode);
+ if (voice.language && LANGUAGE_CODES.includes(voice.language as LanguageCode)) {
+ form.setValue('language', voice.language as LanguageCode);
}
}}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src/components/VoiceProfiles/ProfileForm.tsx` around lines 875 - 881, The
code blindly casts voice.language to LanguageCode in the onClick handler
(setSelectedPresetVoiceId / form.setValue), which can inject invalid values;
instead validate voice.language against your canonical LanguageCode set (e.g.,
an allowedLanguages array or an isValidLanguageCode helper) and only call
form.setValue('language', validatedValue) when it matches, otherwise fall back
to a safe default or omit setting the field; update the onClick flow to perform
this lookup/validation before setting the form value to avoid invalid casting.
| speaker = voice_prompt.get("preset_voice_id") or QWEN_CV_DEFAULT_SPEAKER | ||
|
|
||
| def _generate_sync(): | ||
| if seed is not None: | ||
| torch.manual_seed(seed) | ||
| if torch.cuda.is_available(): | ||
| torch.cuda.manual_seed(seed) | ||
|
|
||
| lang_name = LANGUAGE_CODE_TO_NAME.get(language, "auto") | ||
|
|
||
| kwargs = { | ||
| "text": text, | ||
| "language": lang_name.capitalize() if lang_name != "auto" else "Auto", | ||
| "speaker": speaker, | ||
| } | ||
|
|
||
| # Only pass instruct if non-empty | ||
| if instruct: | ||
| kwargs["instruct"] = instruct | ||
|
|
||
| wavs, sample_rate = self.model.generate_custom_voice(**kwargs) | ||
| return wavs[0], sample_rate |
There was a problem hiding this comment.
Validate preset speaker and guard empty generation output.
At Line 186, unvalidated preset_voice_id can propagate invalid speaker values into generation; at Line 207, wavs[0] assumes non-empty output and can fail with an unhelpful exception.
🔧 Proposed fix
+VALID_QWEN_CV_SPEAKERS = {speaker_id for speaker_id, *_ in QWEN_CUSTOM_VOICES}
+
async def generate(
@@
- speaker = voice_prompt.get("preset_voice_id") or QWEN_CV_DEFAULT_SPEAKER
+ speaker = voice_prompt.get("preset_voice_id") or QWEN_CV_DEFAULT_SPEAKER
+ if speaker not in VALID_QWEN_CV_SPEAKERS:
+ logger.warning(
+ "Unknown preset speaker '%s'; falling back to '%s'",
+ speaker,
+ QWEN_CV_DEFAULT_SPEAKER,
+ )
+ speaker = QWEN_CV_DEFAULT_SPEAKER
@@
- wavs, sample_rate = self.model.generate_custom_voice(**kwargs)
- return wavs[0], sample_rate
+ wavs, sample_rate = self.model.generate_custom_voice(**kwargs)
+ if not wavs:
+ raise RuntimeError("Qwen CustomVoice returned no audio")
+ return wavs[0], sample_rate🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/backends/qwen_custom_voice_backend.py` around lines 186 - 207, The
preset speaker value from voice_prompt.get("preset_voice_id") may be invalid and
the call to self.model.generate_custom_voice can return an empty wavs list; in
_generate_sync validate the speaker before use (e.g., check that speaker is in
the allowed voices list or matches expected format and if not, set speaker =
QWEN_CV_DEFAULT_SPEAKER) and after calling self.model.generate_custom_voice
check that wavs is non-empty before using wavs[0] — if empty, raise a clear
exception or return a defined fallback (and include context like the speaker and
kwargs in the error). Ensure these checks live inside _generate_sync so
generate_custom_voice is always invoked with a valid speaker and its output is
safely handled.
| def _path_relative_to_any_data_dir(path: Path) -> Path | None: | ||
| """Extract the path within a data dir from an absolute or relative path.""" | ||
| parts = path.parts | ||
| for idx, part in enumerate(parts): | ||
| if part != "data": | ||
| continue | ||
|
|
||
| tail = parts[idx + 1 :] | ||
| if tail: | ||
| return Path(*tail) | ||
| return Path() | ||
|
|
||
| return None |
There was a problem hiding this comment.
Edge case: empty Path() return may cause unexpected behavior.
When the path is exactly "data" (line 35), returning Path() (empty path) could cause issues. If this is later joined with _data_dir, the result is _data_dir itself, which may not be the intended behavior.
🛡️ Consider handling this edge case explicitly
for idx, part in enumerate(parts):
if part != "data":
continue
tail = parts[idx + 1 :]
if tail:
return Path(*tail)
- return Path()
+ return Path(".") # Explicitly represent "data dir root"
return NoneOr document this behavior if intentional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/config.py` around lines 25 - 37, The function
_path_relative_to_any_data_dir currently returns Path() when the input path is
exactly "data", which can silently produce a joined path equal to the data dir;
change the behavior so that when no tail exists (i.e., the path equals "data")
the function returns None instead of Path(), and update the docstring to state
that a None return means "path refers to the data directory itself" (also audit
callers of _path_relative_to_any_data_dir to handle None appropriately).
| if stored_path.is_absolute(): | ||
| rebased_path = _path_relative_to_any_data_dir(stored_path) | ||
| if rebased_path is not None: | ||
| candidate = (_data_dir / rebased_path).resolve() | ||
| if candidate.exists() or not stored_path.exists(): | ||
| return candidate | ||
|
|
||
| return stored_path |
There was a problem hiding this comment.
Security consideration: absolute path fallback returns original path.
When an absolute stored path cannot be rebased but the file exists at the original location (line 90), the function returns that path. This could inadvertently serve files outside the configured _data_dir if old absolute paths were persisted in the database.
Consider whether this fallback is necessary for migration or if it should raise an error/warning for paths outside the data directory.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/config.py` around lines 83 - 90, The current logic in the
absolute-path branch (using stored_path, rebased_path and
_path_relative_to_any_data_dir) falls back to returning the original absolute
stored_path when rebasing fails, which can expose files outside _data_dir;
change this so that if rebased_path is None you do NOT return stored_path even
if it exists — instead log a warning/error and raise an exception (e.g.,
ValueError or a custom SecurityError) or return None to force explicit
migration, ensuring only candidate paths under _data_dir (constructed from
_data_dir / rebased_path and checked with candidate.exists()) are accepted;
update any callers to handle the new failure mode.
| import io | ||
| import json as _json | ||
| import logging | ||
| import tempfile | ||
| from datetime import datetime | ||
| from pathlib import Path |
There was a problem hiding this comment.
Restore the datetime import.
Line 358 still calls datetime.utcnow(). With the current import block, PUT /profiles/{profile_id}/effects will raise NameError instead of saving the updated chain.
🛠 Proposed fix
+from datetime import datetime
import io
import json as _json
import logging
import tempfile📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import io | |
| import json as _json | |
| import logging | |
| import tempfile | |
| from datetime import datetime | |
| from pathlib import Path | |
| from datetime import datetime | |
| import io | |
| import json as _json | |
| import logging | |
| import tempfile | |
| from pathlib import Path |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/routes/profiles.py` around lines 3 - 7, The module is missing the
datetime import used by the PUT /profiles/{profile_id}/effects handler (it calls
datetime.utcnow()); add back a direct import like "from datetime import
datetime" into the top-level import block in profiles.py so datetime.utcnow()
resolves, keeping the existing imports (e.g., json as _json) unchanged.
Summary
Fixes #312
Problem 1: Startup loop with external server
The Rust
start_servercommand checks if a process named "voicebox" is listening on port 17493. When users start the backend externally (aspython,uvicorn, etc.), the process name check fails. On Windows this returns an immediate error; on Unix it falls through and tries to spawn a sidecar on an occupied port. Either way,startServer()rejects andserverReadyis never set totrue, leaving the user permanently stuck on the loading screen.Fix (two layers):
/healthbefore giving up. If it responds with a valid Voicebox health payload, reuse the server.startServer()fails, fall back to polling/healthevery 2s (up to 2 minutes). This covers edge cases where the Rust detection fails but the server is reachable.Problem 2: No data refresh on server switch
When changing the server URL in Settings, only the Zustand store value was updated. React Query caches (profiles, history, stories, models) were keyed without
serverUrland retained stale data from the previous server for up to 5 minutes.Fix:
setServerUrl()now callsqueryClient.invalidateQueries()when the URL actually changes, forcing all queries to refetch from the new server.Changes
tauri/src-tauri/src/main.rscheck_health()function; use it as fallback on both Unix and Windows when process name doesn't matchapp/src/App.tsxstartServer()failsapp/src/stores/serverStore.tsapp/src/main.tsxqueryClientfor store-level cache invalidationSummary by CodeRabbit
New Features
Bug Fixes
Improvements
Documentation