feat: ElevenLabs v3 support (eleven_v3 model + audio tags) #36

@OwenMcGirr

Summary

ElevenLabs released eleven_v3, their most expressive model. It uses inline audio tags ([excited], [whispers], [laughs]) instead of SSML. The wrapper already accepts modelId: "eleven_v3" but has no awareness of the v3 markup format.

Changes Needed

1. SSML → audio tag translation in prepareText()

When the model is eleven_v3 and the input is SSML, translate the key tags to v3 audio tags instead of stripping them. This is a best-effort mapping:

| SSML | v3 audio tag |
| --- | --- |
| <emphasis level="strong"> | [excited] |
| <emphasis level="reduced"> | [whispers] |
| <break time="Xs"/> | [pause] |
| <prosody rate="..."> | strip (no direct mapping) |
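A minimal sketch of how this mapping could look, for discussion only. The helper name translateSsmlForV3 is hypothetical and not part of the wrapper today, and the sketch assumes SSML input can be recognised by a <speak> root element. It uses regular expressions rather than a real XML parser, so nested or unusual markup may need more care.

```typescript
// Hypothetical helper, not part of the wrapper's current API.
const V3_MODEL_ID = "eleven_v3";

function translateSsmlForV3(text: string, modelId: string): string {
  // Assumption: SSML input is wrapped in a <speak> element.
  // Plain text (including text that already contains audio tags)
  // and non-v3 models are returned unchanged.
  if (modelId !== V3_MODEL_ID || !/<speak[\s>]/i.test(text)) {
    return text;
  }
  return text
    // Opening emphasis tags become the closest v3 audio tag.
    .replace(/<emphasis\s+level="strong"\s*>/gi, "[excited] ")
    .replace(/<emphasis\s+level="reduced"\s*>/gi, "[whispers] ")
    // Every <break> becomes a single [pause]; the duration is dropped.
    .replace(/<break\b[^>]*\/?>/gi, " [pause] ")
    // Everything else (<speak>, <prosody>, closing tags) is stripped.
    .replace(/<[^>]+>/g, "")
    // Collapse the whitespace left behind by removed tags.
    .replace(/\s+/g, " ")
    .trim();
}
```

Returning non-SSML input untouched keeps the "plain text + audio tags pass through unmodified" case trivial, and leaves the existing stripping path for other models alone.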

2. Expose new v3 request parameters

Add to ElevenLabsTTSOptions: seed, languageCode, previousText, nextText, applyTextNormalization
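One possible shape for these options, as a sketch rather than a final design. The interface name ElevenLabsV3Options and the helper buildV3RequestFields are hypothetical. The snake_case request field names and the "auto" | "on" | "off" values are assumptions that should be checked against the ElevenLabs API reference before merging.

```typescript
// Assumed set of accepted values; verify against the ElevenLabs docs.
type TextNormalization = "auto" | "on" | "off";

// Hypothetical interface holding only the new fields named in this issue.
interface ElevenLabsV3Options {
  seed?: number;
  languageCode?: string;
  previousText?: string;
  nextText?: string;
  applyTextNormalization?: TextNormalization;
}

// Maps camelCase options to the (assumed) snake_case request body fields,
// omitting anything the caller did not set.
function buildV3RequestFields(
  options: ElevenLabsV3Options
): Record<string, string | number> {
  const body: Record<string, string | number> = {};
  if (options.seed !== undefined) body.seed = options.seed;
  if (options.languageCode !== undefined) body.language_code = options.languageCode;
  if (options.previousText !== undefined) body.previous_text = options.previousText;
  if (options.nextText !== undefined) body.next_text = options.nextText;
  if (options.applyTextNormalization !== undefined) {
    body.apply_text_normalization = options.applyTextNormalization;
  }
  return body;
}
```

Keeping every field optional means existing callers are unaffected, and unset options never reach the request body.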

3. Tests

  • Synthesise with modelId: "eleven_v3" (real API call)
  • SSML input to v3 client — verify translation + no crash
  • Plain text + audio tags pass through unmodified

Out of Scope

  • Other models (eleven_flash_v2_5, eleven_turbo_v2_5) — already work
  • Azure or other engine changes
