Add Respeecher TTS to the docs #1
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,132 @@ | ||
| --- | ||
| title: "Respeecher" | ||
| description: "Text-to-speech service using the Respeecher Space WebSocket API" | ||
| --- | ||
|
|
||
| ## Overview | ||
|
|
||
| The Respeecher Space API provides high-quality streaming text-to-speech synthesis with low latency. | ||
|
|
||
| <CardGroup cols={3}> | ||
| <Card | ||
| title="API Reference" | ||
| icon="code" | ||
| href="https://reference-server.pipecat.ai/en/latest/api/pipecat.services.respeecher.tts.html" | ||
| > | ||
| Complete API documentation and method details | ||
| </Card> | ||
| <Card | ||
| title="Respeecher Docs" | ||
| icon="book" | ||
| href="https://space.respeecher.com/docs" | ||
| > | ||
| Official Respeecher Space documentation | ||
| </Card> | ||
| <Card | ||
| title="Example Code" | ||
| icon="play" | ||
| href="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/07ad-interruptible-respeecher.py" | ||
| > | ||
| Working example with interruption handling | ||
| </Card> | ||
| </CardGroup> | ||
|
|
||
| ## Installation | ||
|
|
||
| To use Respeecher services, install the required dependencies: | ||
|
|
||
| ```bash | ||
| pip install "pipecat-ai[respeecher]" | ||
| ``` | ||
|
|
||
| You'll also need to set up your Respeecher API key as an environment variable: `RESPEECHER_API_KEY`. | ||
|
|
||
| <Tip> | ||
| Get your API key by signing up at | ||
| [Respeecher Space](https://space.respeecher.com/). | ||
| </Tip> | ||
|
|
||
| ## Frames | ||
|
|
||
| ### Input | ||
|
|
||
| - `TextFrame` - Text to synthesize into speech, subject to optional aggregation | ||
| - `TTSSpeakFrame` - Text that should be spoken immediately | ||
| - `TTSUpdateSettingsFrame` - Runtime configuration updates (e.g., voice, sampling params) | ||
| - `LLMFullResponseStartFrame` / `LLMFullResponseEndFrame` - LLM response boundaries | ||
|
|
||
| ### Output | ||
|
|
||
| - `TTSStartedFrame` - Signals start of synthesis | ||
| - `TTSAudioRawFrame` - Generated audio data chunks | ||
| - `TTSStoppedFrame` - Signals completion of synthesis | ||
| - `ErrorFrame` - Connection or processing errors | ||
|
|
||
| ## Language Support | ||
|
|
||
| Refer to the Respeecher Space [Docs](https://space.respeecher.com/docs) | ||
| for language support in different models. | ||
|
|
||
| ## Supported Sample Rates | ||
|
|
||
| All common sample rates are supported. | ||
|
|
||
|
Member
ElevenLabs has a section on sample rates / output formats before the usage example. Should we have one?
Member
Author
I think their section highlights that the integration can infer the right output format enum string for the ElevenLabs API from the integer sample rate set by the pipeline. Since our API accepts the integer sample rate directly, maybe we don't need a similar section.
Member
I guess, but someone could still read our docs carelessly and wonder which sample rates we support or whether you can choose one. So it may be better to make this explicit, so people can easily see that it is not a disadvantage of ours compared with ElevenLabs.
|
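To make the distinction in this thread concrete, here is a minimal sketch. Both helper functions, the set of accepted rates, and the `sample_rate` field name are hypothetical and exist only for illustration; none of them is taken from Pipecat, ElevenLabs, or Respeecher. The first helper mimics an integration that has to translate the pipeline's integer sample rate into an enum-style format string, and the second passes the integer through unchanged.

```python
# Illustrative only: these helpers are assumptions, not real library APIs.

# Assumed set of rates an enum-style API might accept (hypothetical).
ENUM_STYLE_RATES = {8000, 16000, 22050, 24000, 44100}


def enum_style_output_format(sample_rate: int) -> str:
    """Translate an integer sample rate into an enum-like format string."""
    if sample_rate not in ENUM_STYLE_RATES:
        raise ValueError(f"no output format enum for {sample_rate} Hz")
    return f"pcm_{sample_rate}"


def integer_style_output_format(sample_rate: int) -> dict:
    """Forward the integer as-is, as an API that accepts it directly would."""
    return {"sample_rate": sample_rate}


if __name__ == "__main__":
    print(enum_style_output_format(24000))
    print(integer_style_output_format(24000))
```

If the docs gain a sample-rate section, the second shape is the one it would describe: the value set on the pipeline is what the service receives, with no lookup step in between.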
||
| ## Usage Example | ||
|
|
||
| Initialize the WebSocket service with your API key and desired voice: | ||
|
|
||
| ```python | ||
| from pipecat.services.respeecher.tts import RespeecherTTSService | ||
| import os | ||
|
|
||
| # Configure WebSocket service | ||
| tts = RespeecherTTSService( | ||
| api_key=os.getenv("RESPEECHER_API_KEY"), | ||
| voice_id="samantha", | ||
| params=RespeecherTTSService.InputParams( | ||
| sampling_params={ | ||
| # Optional sampling params overrides | ||
| # See https://space.respeecher.com/docs/api/tts/sampling-params-guide | ||
| # "temperature": 0.5 | ||
| }, | ||
| ), | ||
| ) | ||
|
|
||
| # Use in pipeline | ||
| pipeline = Pipeline([ | ||
| transport.input(), | ||
| stt, | ||
| llm, | ||
| tts, | ||
| transport.output() | ||
| ]) | ||
| ``` | ||
|
Member
ElevenLabs puts aggregators in their pipeline. We don't need them?
Member
Author
Cartesia doesn't. It's up to us; it's just an example.
Member
I don't understand if users should put in aggregators or not.

If we want to properly read a sequence of numbers written as digits, e.g. "1, 2, 3, 4, ...", we currently need an aggregator (otherwise our API treats "," as a thousands delimiter). But in most cases the conversion is fine even without an aggregator.
|
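A minimal sketch of what aggregation buys in the digit-sequence case, assuming text reaches the TTS service in small chunks. `SentenceBuffer` is a hypothetical stand-in and not Pipecat's aggregator: it holds chunks until the buffered text ends with sentence punctuation, so a sequence such as "1, 2, 3, 4." is forwarded as one string rather than as fragments that each end in a bare comma.

```python
import re
from typing import Optional


class SentenceBuffer:
    """Hypothetical helper: buffer text chunks until a sentence is complete."""

    # A sentence is considered complete when the buffer ends in . ! or ?
    _SENTENCE_END = re.compile(r"[.!?]\s*$")

    def __init__(self) -> None:
        self._buffer = ""

    def push(self, chunk: str) -> Optional[str]:
        """Add a chunk; return the full sentence once it is complete, else None."""
        self._buffer += chunk
        if self._SENTENCE_END.search(self._buffer):
            sentence, self._buffer = self._buffer.strip(), ""
            return sentence
        return None


if __name__ == "__main__":
    buf = SentenceBuffer()
    for chunk in ["1,", " 2,", " 3,", " 4."]:
        sentence = buf.push(chunk)
        if sentence is not None:
            print(sentence)
```

Whether the example pipeline in the docs should include aggregators is still the open question in this thread; the sketch only shows why the comma-separated case behaves differently with and without one.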
||
|
|
||
| ### Dynamic Configuration | ||
|
|
||
| Make settings updates by pushing a `TTSUpdateSettingsFrame` for the `RespeecherTTSService`: | ||
|
|
||
| ```python | ||
| from pipecat.frames.frames import TTSUpdateSettingsFrame | ||
|
|
||
| await task.queue_frame( | ||
| TTSUpdateSettingsFrame(settings={"voice": "your-new-voice-id", "sampling_params": {"temperature": 0.5}}) | ||
| ) | ||
Kharacternyk marked this conversation as resolved.
|
||
| ``` | ||
|
|
||
| ## Metrics | ||
|
|
||
| This service provides: | ||
|
|
||
| - **Time to First Byte (TTFB)** - Latency from text input to first audio | ||
|
Member
It would be neat to measure TTFB for Respeecher and various competitors.
|
||
| - **Processing Duration** - Total synthesis time | ||
| - **Usage Metrics** - Character count and synthesis statistics | ||
|
|
||
| <Info> | ||
| [Learn how to enable Metrics](/guides/fundamentals/metrics) in your Pipeline. | ||
| </Info> | ||
|
|
||
| ## Additional Notes | ||
|
|
||
| - **Connection Management**: WebSocket lifecycle is handled automatically with reconnection support | ||
| - **Sample Rate**: Set globally in `PipelineParams` rather than per-service for consistency | ||
These cards are not formatting properly when you view the page on GitHub, and the links are not working.

It renders fine in the docs preview (`pnpx mint dev`). The first and third links are broken because the PRs haven't been merged upstream yet.

Hm, but why would the rendering be broken on GitHub? It's not broken for other services.

I don't know the reason, but my gut feeling is that it's nothing important. If we do have syntax errors, maybe the upstream reviewers will point them out.