feat: implement TalkingHead renderer with image/audio props #189

Merged
SecurityQQ merged 1 commit into main from feature/talking-head-renderer
Mar 31, 2026

@SecurityQQ
Contributor

Summary

  • Implements the <TalkingHead> component renderer that was previously defined but never wired into the rendering pipeline
  • New props: image (accepts VargElement<"image"> or ResolvedElement<"image">) and audio (accepts VargElement<"speech"> or ResolvedElement<"speech">)
  • Pipeline: resolves image + speech in parallel → generates lipsync video via model (e.g., sync-v2-pro)
  • Supports both pre-resolved (awaited) and lazy (non-awaited) elements
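
The parallel resolution step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `ElementInput`, `LazyElement`, `resolveFile`, and `resolveInputs` are hypothetical names, and the element shapes are assumed stand-ins for the library's `VargElement` / `ResolvedElement` types.

```typescript
// Hypothetical stand-ins for the library's element types (assumed shapes).
type Kind = "image" | "speech";

interface ResolvedElement<K extends Kind> {
  kind: K;
  resolved: true;
  file: string; // already-generated asset
}

interface LazyElement<K extends Kind> {
  kind: K;
  resolved: false;
  render: () => Promise<string>; // generates the asset on demand
}

type ElementInput<K extends Kind> = ResolvedElement<K> | LazyElement<K>;

// A pre-resolved element is used as-is; a lazy one is rendered first.
async function resolveFile<K extends Kind>(input: ElementInput<K>): Promise<string> {
  return input.resolved ? input.file : input.render();
}

// Image and speech do not depend on each other, so both are resolved concurrently.
async function resolveInputs(
  image: ElementInput<"image">,
  audio: ElementInput<"speech">,
): Promise<{ imageFile: string; audioFile: string }> {
  const [imageFile, audioFile] = await Promise.all([
    resolveFile(image),
    resolveFile(audio),
  ]);
  return { imageFile, audioFile };
}
```

Under these assumptions, the lipsync model would only be invoked once both files are available, which is why the two inputs can be mixed freely between awaited and non-awaited forms.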

Changes

| File | Change |
| --- | --- |
| src/react/types.ts | Replaced character/src/voice/children props with image/audio props on TalkingHeadProps |
| src/react/elements.ts | Made TalkingHead thenable/awaitable via makeThenable |
| src/react/renderers/talking-head.ts | New renderer: image → speech → lipsync pipeline |
| src/react/renderers/clip.ts | Added case "talking-head": handler in clip switch |
| src/react/resolve.ts | Added resolveTalkingHeadElement() for standalone await TalkingHead() |
| src/react/renderers/talking-head.test.ts | 8 tests covering element creation, clip integration, error cases, pre-resolved and lazy elements |
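
For readers unfamiliar with thenable elements, here is one way an element object could be made awaitable. The actual `makeThenable` implementation is not shown in this PR; `makeThenableSketch`, `ElementLike`, and the caching behavior below are assumptions made for illustration only.

```typescript
// Hypothetical element shape; the library's real element type may differ.
interface ElementLike {
  type: string;
  props: Record<string, unknown>;
}

// Attaches a `then` method so that `await element` triggers resolution.
// The resolver runs at most once; later awaits reuse the same promise.
function makeThenableSketch<E extends ElementLike, R>(
  element: E,
  resolve: (el: E) => Promise<R>,
): E & PromiseLike<R> {
  let cached: Promise<R> | undefined;
  const thenable = {
    then<T1 = R, T2 = never>(
      onFulfilled?: ((value: R) => T1 | PromiseLike<T1>) | null,
      onRejected?: ((reason: unknown) => T2 | PromiseLike<T2>) | null,
    ): Promise<T1 | T2> {
      if (!cached) cached = resolve(element);
      return cached.then(onFulfilled, onRejected);
    },
  };
  // The element keeps its own fields and gains `then`.
  return Object.assign(element, thenable);
}
```

With a shape like this, the same object can be passed to a parent as a lazy element or awaited on its own, which matches the "pre-resolved and lazy" usage the tests are described as covering.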

Fixes

Fixes render jobs that use <TalkingHead> — previously the component was silently skipped during rendering, producing empty/black video output.
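
A silently skipped element type is the kind of bug an exhaustive switch can surface early. The sketch below is not the code from clip.ts, which this PR does not show; `ClipChild`, `VideoLayer`, and `toLayer` are hypothetical, and only the "cover" resize mode for talking-head layers comes from the review summary.

```typescript
// Hypothetical clip child kinds and layer shape (assumptions for illustration).
type ClipChild =
  | { type: "video"; file: string }
  | { type: "talking-head"; file: string };

interface VideoLayer {
  kind: "video";
  file: string;
  resizeMode: "cover" | "contain";
}

function toLayer(child: ClipChild): VideoLayer {
  switch (child.type) {
    case "video":
      // "contain" here is an arbitrary choice for the sketch.
      return { kind: "video", file: child.file, resizeMode: "contain" };
    case "talking-head":
      // The review summary says talking-head emits a video layer with cover resize.
      return { kind: "video", file: child.file, resizeMode: "cover" };
    default: {
      // If a new member is added to ClipChild without a case above,
      // this assignment stops compiling; at runtime it throws instead of skipping.
      const unhandled: never = child;
      throw new Error(`Unhandled clip child: ${JSON.stringify(unhandled)}`);
    }
  }
}
```

The design choice is to fail loudly: an unhandled element becomes a compile error or a thrown exception rather than an empty frame in the output.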

Usage

const character = Image({
  prompt: "young woman, casual outfit, warm smile",
  model: fal.imageModel("flux-pro"),
});

const voiceover = await Speech({
  voice: "rachel",
  model: elevenlabs.speechModel("eleven_multilingual_v2"),
  children: "Hey! Game changer."
});

export default (
  <Render width={1080} height={1920}>
    <Clip duration={voiceover.duration}>
      <TalkingHead image={character} audio={voiceover} model={fal.videoModel("sync-v2-pro")} />
      <Captions src={voiceover} style="tiktok" />
    </Clip>
  </Render>
);

Add a full rendering pipeline for the <TalkingHead> component:
- image prop accepts VargElement<"image"> or ResolvedElement<"image">
- audio prop accepts VargElement<"speech"> or ResolvedElement<"speech">
- Resolves image + speech in parallel, then generates lipsync video
- Wired into clip.ts switch statement as a video layer
- Made TalkingHead awaitable via makeThenable + resolveTalkingHeadElement
- Added resolution prop (480p/720p/1080p) for lipsync generation
- 8 tests covering element creation, clip integration, error cases,
  pre-resolved elements, and lazy element rendering
@coderabbitai
Contributor

coderabbitai bot commented Mar 31, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
📥 Commits

Reviewing files that changed from the base of the PR and between 8c0a9d7 and f90f639.

📒 Files selected for processing (6)
  • src/react/elements.ts
  • src/react/renderers/clip.ts
  • src/react/renderers/talking-head.test.ts
  • src/react/renderers/talking-head.ts
  • src/react/resolve.ts
  • src/react/types.ts

📝 Walkthrough

The PR introduces a new "talking-head" React element with thenable async resolution. The updated props take image/audio VargElement inputs instead of strings. Resolution coordinates the image + audio lipsync pipeline through a video model. Rendering and clip integration are added, along with test coverage.

changes

| Cohort / File(s) | Summary |
| --- | --- |
| Type definitions<br/>src/react/types.ts | TalkingHeadProps refactored: removed the character, src, voice, and children string props; added image? and audio? VargElement props, resolution? ("480p" \| "720p" \| "1080p"), and a lipsyncModel? override. |
| Element & resolution<br/>src/react/elements.ts, src/react/resolve.ts | The TalkingHead element now returns a thenable via makeThenable. The new resolveTalkingHeadElement validates required props, resolves image + audio concurrently, generates the lipsync video, and returns a ResolvedElement with file and duration. |
| Rendering pipeline<br/>src/react/renderers/talking-head.ts, src/react/renderers/clip.ts | The new renderTalkingHead orchestrates image + audio resolution and delegates to the video renderer. renderClipLayers is extended with a "talking-head" case that emits a video layer with cover resize mode. |
| Tests<br/>src/react/renderers/talking-head.test.ts | Test suite (362 lines) validates element structure, thenable behavior, clip integration, property validation, and rendering with both pre-resolved and lazy image/audio elements. |
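
The prop validation mentioned in the summary could look something like the sketch below. The real resolveTalkingHeadElement is not shown here, so `TalkingHeadPropsSketch`, `isResolution`, `validateTalkingHeadProps`, and the error messages are all hypothetical; only the three resolution values and the image/audio prop names come from the PR.

```typescript
// Resolution values listed in the PR description.
const RESOLUTIONS = ["480p", "720p", "1080p"] as const;
type Resolution = (typeof RESOLUTIONS)[number];

// Hypothetical, loosely typed view of the incoming props.
interface TalkingHeadPropsSketch {
  image?: unknown;
  audio?: unknown;
  resolution?: string;
}

// Type guard: narrows an arbitrary string to one of the supported resolutions.
function isResolution(value: string): value is Resolution {
  return RESOLUTIONS.some((r) => r === value);
}

// Throws on missing or invalid props; returns the checked values otherwise.
function validateTalkingHeadProps(props: TalkingHeadPropsSketch): {
  image: unknown;
  audio: unknown;
  resolution: Resolution | undefined;
} {
  if (props.image == null) throw new Error("TalkingHead requires an `image` prop");
  if (props.audio == null) throw new Error("TalkingHead requires an `audio` prop");

  let resolution: Resolution | undefined;
  if (props.resolution !== undefined) {
    if (!isResolution(props.resolution)) {
      throw new Error(`Unsupported resolution: ${props.resolution}`);
    }
    resolution = props.resolution;
  }
  return { image: props.image, audio: props.audio, resolution };
}
```

Validating before any model call means a missing prop fails immediately instead of after an image or speech generation has already been paid for.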

sequence diagram

sequenceDiagram
    participant App as App
    participant Elem as TalkingHead Element
    participant Resolve as resolveTalkingHeadElement
    participant ImageRes as resolveImageProp
    participant AudioRes as resolveAudioProp
    participant VideoRender as renderVideo
    participant Render as renderTalkingHead
    participant Clip as renderClipLayers

    App->>Elem: create talking-head with image/audio
    Elem->>Elem: return thenable element
    App->>Resolve: await element resolution
    Resolve->>ImageRes: resolve image prop
    ImageRes-->>Resolve: file (pre-resolved or rendered)
    Resolve->>AudioRes: resolve audio prop (concurrent)
    AudioRes-->>Resolve: file (pre-resolved or rendered)
    Resolve->>VideoRender: create lipsync video element<br/>(image+audio → prompt)
    VideoRender-->>Resolve: video file
    Resolve-->>App: ResolvedElement with video file
    App->>Clip: render clip with talking-head
    Clip->>Render: renderTalkingHead(element, ctx)
    Render->>VideoRender: delegate to video renderer
    VideoRender-->>Render: file path
    Render-->>Clip: resolved file
    Clip-->>App: VideoLayer (cover resize, mixVolume: 1)

estimated code review effort

🎯 3 (moderate) | ⏱️ ~20 minutes

poem

🎬 two inputs meet with audio grace,
image and sound find sync in space,
lipsync magic makes the head speak true,
async await brings dreams in view 🎭✨

@SecurityQQ SecurityQQ merged commit f7dd017 into main Mar 31, 2026
1 of 2 checks passed