Fix Cloud Run deployment: bypass DDP init, use /tmp for writes #82

mirai-gpro · 2026-02-07T13:10:13Z

Root cause: defaults.py's default_setup() and default_config_parser() assume a distributed training environment with a writable filesystem. On Cloud Run, where /app is read-only, initialization fails silently.

Changes:

app.py: Skip default_setup() entirely, manually set CPU/single-process config
app.py: Redirect save_path to /tmp (only writable dir on Cloud Run)
app.py: Add GCS FUSE mount path resolution with Docker-baked fallback
cloudbuild.yaml: Add Cloud Storage FUSE volume mount for model serving
cloudbuild.yaml: Increase max-instances to 4
Include handoff docs and full LAM_Audio2Expression codebase

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P
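The path handling described above can be sketched roughly as follows. This is a minimal illustration and not the code in app.py: the function names `resolve_model_dir` and `resolve_save_path`, and the example directories `/mnt/models`, `/app/models` and `/app/exp`, are hypothetical.

```python
import os
import tempfile
from typing import Iterable, Optional


def resolve_model_dir(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate directory that exists, or None."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None


def resolve_save_path(preferred: str) -> str:
    """Use `preferred` if it is a writable directory, else the temp dir."""
    if os.path.isdir(preferred) and os.access(preferred, os.W_OK):
        return preferred
    return tempfile.gettempdir()


if __name__ == "__main__":
    # Hypothetical locations: a GCS FUSE mount first, then a Docker-baked copy.
    model_dir = resolve_model_dir(["/mnt/models", "/app/models"])
    save_path = resolve_save_path("/app/exp")
    print(model_dir, save_path)
```

Trying the mount first and the baked-in copy second means the container can still start if the volume is not attached. On Linux, `tempfile.gettempdir()` normally resolves to /tmp unless TMPDIR is set.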

The LAM model file was misidentified as .tar but is actually a PyTorch weights file. Gemini renamed it to .pth on GCS. Also load the wav2vec2 config.json from the model directory instead of from LAM configs/.

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

- Import gourmet-sp from the implementation-testing branch
- Add sendAudioToExpression() to the shop introduction TTS flow (firstShop and remainingShops now get lip sync data before playback)
- Remove legacy event hooks in concierge-controller init() (replaced with a clean linkTtsPlayer helper)
- Clean up LAMAvatar.astro: remove legacy frame playback code (startFramePlaybackFromQueue, stopFramePlayback, frameQueue, etc.)
- Simplify to a single sync mechanism: frameBuffer + ttsPlayer.currentTime
- Increase health check interval from 2s to 10s (less frequent checks)

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

Using the official LAM sample avatar as a placeholder. Will be replaced with a custom-generated avatar later.

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

- Add fade-in/fade-out smoothing (6 frames / 200ms) to prevent Gaussian Splat visual distortion at speech start/end
- Parallelize expression generation with TTS synthesis: the remaining sentences' expression is pre-fetched during first-sentence playback, eliminating wait time between segments
- Add fetchExpressionFrames() for background expression fetch with a pendingExpressionFrames buffer swap pattern
- Apply the same optimization to the shop introduction flow

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P
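The fade smoothing can be illustrated with a short sketch. The commit only gives the ramp length (6 frames, which is 200ms at 30 frames per second); the linear ramp shape and the function name `apply_fade` are assumptions made for this example, and the frontend implementation may differ.

```python
from typing import List


def apply_fade(frames: List[List[float]], fade_frames: int = 6) -> List[List[float]]:
    """Scale expression weights by a linear ramp at both ends of a clip.

    The linear shape is an assumption; only the 6-frame length comes
    from the commit message.
    """
    n = len(frames)
    faded = []
    for i, frame in enumerate(frames):
        ramp_in = min(1.0, (i + 1) / fade_frames)   # rises over the first frames
        ramp_out = min(1.0, (n - i) / fade_frames)  # falls over the last frames
        gain = min(ramp_in, ramp_out)
        faded.append([value * gain for value in frame])
    return faded
```

Frames in the middle of the clip keep a gain of 1.0, so only the first and last `fade_frames` frames are attenuated.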

sendAudioToExpression fetch could hang indefinitely (Cloud Run cold start / service down), blocking the await and preventing TTS play().

- Add AbortController timeout (8s) to all expression API fetches
- Wrap the expression await with Promise.race so TTS plays even if the expression API is slow or down (lip sync degrades gracefully)
- Applied to speakTextGCP, speakResponseInChunks, and the shop flow

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P
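The idea behind this commit, a bounded wait that lets playback continue without lip sync, is shown below in Python's asyncio rather than in the browser code the commit actually changes. `with_timeout`, `speak` and `fake_fetch` are hypothetical names; only the 8s timeout comes from the commit message.

```python
import asyncio
from typing import Awaitable, Optional, TypeVar

T = TypeVar("T")


async def with_timeout(task: Awaitable[T], timeout_s: float) -> Optional[T]:
    """Return the task's result, or None if it fails or exceeds the timeout."""
    try:
        return await asyncio.wait_for(task, timeout=timeout_s)
    except (asyncio.TimeoutError, OSError):
        return None


async def fake_fetch(delay_s: float) -> list:
    """Stand-in for the expression API call (hypothetical)."""
    await asyncio.sleep(delay_s)
    return [[0.1, 0.2]]


async def speak(fetch_expression: Awaitable[list], timeout_s: float = 8.0) -> dict:
    # Playback proceeds whether or not expression frames arrived in time.
    frames = await with_timeout(fetch_expression, timeout_s)
    return {"played": True, "lip_sync": frames is not None}
```

`asyncio.wait_for` cancels the pending call when the timeout expires, which roughly corresponds to combining the AbortController timeout and the Promise.race wrapper described above.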

Root cause: sendAudioToExpression fetch hung in the browser, blocking the await and preventing TTS play() from ever being called.

Fix: all expression API calls are now fire-and-forget, so TTS playback starts immediately without waiting for expression frames. Frames arrive asynchronously and getExpressionData() picks them up in real time from the frameBuffer.

- Remove await/Promise.race from all sendAudioToExpression calls
- Remove fetchExpressionFrames and pendingExpressionFrames (no longer needed; direct fire-and-forget is simpler)
- Keep the AbortController timeout (8s) inside sendAudioToExpression to prevent leaked connections

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

… calls

Architecture change: expression frames are now returned WITH the TTS audio from the backend, instead of the frontend calling audio2exp directly.

Backend (app_customer_support_modified.py):
- Replace fire-and-forget send_to_audio2exp with get_expression_frames, which returns {names, frames, frame_rate}
- Send MP3 directly to audio2exp (no separate PCM generation needed)
- TTS response: {success, audio, expression: {...}}
- Server-to-server communication: no CORS, stable, fast

Frontend (concierge-controller.ts):
- New queueExpressionFromTtsResponse() reads expression from the TTS response
- Remove sendAudioToExpression (direct browser→audio2exp REST calls)
- Remove audio2expApiUrl, audio2expWsUrl, connectLAMAvatarWebSocket
- Remove EXPRESSION_API_TIMEOUT_MS, AbortController timeout
- The existing 1st-sentence-ahead pattern now automatically includes expression data (no separate API call needed)

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P
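A rough sketch of the combined response shape follows. The field names `success`, `audio`, `expression`, `names`, `frames` and `frame_rate` come from the commit message. The helper name `build_tts_response`, the base64 audio encoding, and the choice to return `expression: None` when audio2exp fails are assumptions for illustration.

```python
import base64
from typing import Callable, Optional


def build_tts_response(mp3_bytes: bytes,
                       get_expression_frames: Callable[[bytes], Optional[dict]]) -> dict:
    """Bundle TTS audio and expression data into one response payload."""
    try:
        expression = get_expression_frames(mp3_bytes)
    except Exception:
        # Assumption: a failed audio2exp call should not fail the TTS response.
        expression = None
    return {
        "success": True,
        # Assumption: audio is sent base64-encoded; the source does not say.
        "audio": base64.b64encode(mp3_bytes).decode("ascii"),
        "expression": expression,
    }
```

Passing the audio2exp call in as a parameter keeps the sketch runnable without a network; in the real backend this would be the server-to-server request.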

…orget proxy

- Backend: TTS endpoint no longer blocks on expression generation
- Backend: New /api/audio2expression proxy (server-to-server, CORS-free)
- Frontend: All expression calls use fireAndForgetExpression() (never blocks TTS play)
- Removes the ~2s first-sentence delay caused by synchronous expression generation in TTS

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

…aining

Two bugs fixed:
1. Buffer corruption: frames from segment 1 mixed with segment 2 (ttsPlayer.currentTime resets but frameBuffer was concatenated)
   → Now clear the buffer before each new TTS segment
2. 3-second delay: expression frames arrived after TTS started playing
   → Pre-fetch the remaining segment's expression during first segment playback
   → When the second segment starts, pre-fetched frames are immediately available

New prefetchExpression() method returns a Promise with parsed frames, applied non-blocking via .then() so it never delays TTS playback.

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P
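The buffer-corruption fix is easier to see in a small model of the sync mechanism, where the frame shown is looked up from the player's current time. This is a Python sketch of the idea only: the class `FrameBuffer`, its methods, and the 30 fps default are hypothetical and do not mirror the TypeScript in concierge-controller.ts.

```python
from typing import List, Optional


class FrameBuffer:
    """Expression frames for the segment currently playing."""

    def __init__(self, frame_rate: float = 30.0) -> None:
        self.frame_rate = frame_rate
        self.frames: List[List[float]] = []

    def start_segment(self, frames: List[List[float]]) -> None:
        # Replace rather than append: the player's clock restarts at 0 for
        # each segment, so stale frames would be indexed by the new clock.
        self.frames = list(frames)

    def frame_at(self, current_time_s: float) -> Optional[List[float]]:
        """Return the frame for the given playback time, or None if absent."""
        index = int(current_time_s * self.frame_rate)
        if 0 <= index < len(self.frames):
            return self.frames[index]
        return None
```

If `start_segment` appended instead of replacing, time 0 of the second segment would still map to the first frame of the first segment, which is the mixing described in bug 1.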

Architecture change: the backend includes expression data in the TTS response (server-to-server audio2exp call, ~150ms) instead of using a separate proxy.

- Backend TTS endpoint calls audio2exp synchronously and includes the result
- Frontend applyExpressionFromTts(): instant buffer queue from TTS data
- Proxy fireAndForgetExpression kept as fallback (timeout/error cases)
- All 5 call sites (speakTextGCP, speakResponseInChunks x2, shop x2) updated
- Removes prefetch complexity (the TTS response already carries expression)

Result: lip sync starts from frame 0, with no 2-3 second gap.

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

Architecture redesign for true zero-delay TTS playback:

- Backend TTS endpoint starts audio2exp in a background thread and returns audio + expression_token immediately (no blocking)
- New /api/expression/poll endpoint: frontend polls for the result
- Frontend pollExpression(): fire-and-forget polling at 150ms intervals
- Removes the sync expression, proxy, and prefetch approaches

Timeline: TTS returns in ~500ms, audio2exp completes ~150ms later (background), the frontend's first poll arrives ~200ms after TTS → expression available ~350ms after playback starts. Previous: 2-3 second delay or TTS blocked.

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

…aster response

- Backend: revert to sync expression in the TTS response (remove async cache/polling)
- Frontend: replace pollExpression with applyExpressionFromTts (sync from TTS response)
- Frontend: fire sendMessage() immediately while the ack plays (don't await firstAckPromise)

pendingAckPromise is awaited before TTS playback to prevent a ttsPlayer conflict.

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

…nterrupt)

unlockAudioParams() does play→pause→reset on ttsPlayer for the iOS audio unlock. When called during ack playback (parallel LLM mode), it kills the ack audio. Skip it when pendingAckPromise is active (audio is already unlocked by the ack).

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

…rentAudio safety

Root cause: the ack "はい" ("yes") gets paused (not ended) by some interruption, so pendingAckPromise never resolves → speakResponseInChunks is stuck forever.

Fix 1: resolve pendingAckPromise on both 'ended' and 'pause' events.
Fix 2: call stopCurrentAudio() after pendingAckPromise resolves to ensure ttsPlayer is clean before new TTS playback.

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

- Container: max-height 650px → height calc(100dvh - 40px), max-height 960px
- Avatar stage: 140px → 300px (desktop), 100px → 200px (mobile)
- Chat area: min-height 150px guaranteed for message display

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

Post-init camera: Z 1→0.6 (closer), Y 1.8→1.75 (slightly lower), FOV 50→36 (zoom in). Eliminates wasted space above the avatar's head in the 300px avatar-stage.

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P
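How much these camera values tighten the framing can be estimated from the standard perspective-camera relation, visible height = 2 · distance · tan(FOV / 2). The sketch below assumes the FOV is the vertical field of view in degrees and that the avatar sits near z = 0, so the camera Z value doubles as the viewing distance; neither is stated in the commit.

```python
import math


def visible_height(distance: float, fov_deg: float) -> float:
    """Height of the view frustum at `distance` for a vertical FOV in degrees."""
    return 2.0 * distance * math.tan(math.radians(fov_deg) / 2.0)


before = visible_height(1.0, 50.0)  # camera Z = 1, FOV = 50
after = visible_height(0.6, 36.0)   # camera Z = 0.6, FOV = 36
print(f"before: {before:.3f}, after: {after:.3f}, ratio: {after / before:.2f}")
```

Under those assumptions the visible height at the avatar shrinks from roughly 0.93 to roughly 0.39 scene units, so the head fills a much larger share of the 300px stage.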

Previous: lookAt y=1.8 (head center) + tight zoom → mouth cut off at the bottom. Fix: lower the target to y=1.62 (nose/mouth center) and adjust the OrbitControls target to match. Camera Z=0.55, FOV=38 for balanced framing.

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

targetY 1.62→1.66 (avatar lower in frame), camera Y 1.62→1.72 (above the target, giving a slight downward angle instead of looking up from below)

https://claude.ai/code/session_01C6n4TZ9PPdx46jCevmVo7P

claude added 20 commits February 7, 2026 13:02

Add concierge.zip avatar (copy of p2-1.zip sample for testing)

14f48e5

Fix avatar path: use concierge.zip instead of p2-1.zip

c8af922

Adjust camera to fill avatar head in canvas (reduce empty space above)

eb36e5b

Camera tweak: lower avatar 10% in frame + natural slight top-down angle

8752cb8
