Implement OpenAI Text-to-Speech Conversion Model #126
Open
caseymanos wants to merge 4 commits into WordPress:trunk from
Move voice parameter handling to follow codebase philosophy: SDK validates structure/format, API validates business rules. The abstract class no longer sets a default voice, allowing the OpenAI API to return a clear error if voice is not configured. This makes the TTS implementation consistent with text generation and image generation models.
First OSS contribution! Tried to follow all the contribution guidelines and code checks I could find.
Resolves #90
This implements text-to-speech conversion support for OpenAI, enabling the
SDK to convert text input to audio output using OpenAI's TTS API.
Changes
OpenAI-compatible TTS providers
Supported Features
shimmer, verse
Implementation Notes
The abstract base class follows the same pattern as
AbstractOpenAiCompatibleImageGenerationModel, making it reusable for other
providers that implement OpenAI-compatible TTS endpoints.
Required API parameters such as
voice must be explicitly configured, and the API returns clear validation
errors if omitted. This keeps the abstract class clean for other
OpenAI-compatible providers that may have different voice options.
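The split between SDK-side and API-side validation described above could be sketched roughly as follows. This is a language-agnostic illustration in Python, not the code from this PR: `build_speech_params` is a hypothetical helper, and the only thing assumed about the API is the `model`, `input`, `voice`, and `response_format` request fields of OpenAI's speech endpoint.

```python
from typing import Any, Dict, Optional


def build_speech_params(
    model: str,
    text: str,
    voice: Optional[str] = None,
    response_format: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the request body for a TTS call.

    Only structure and types are checked here. Whether a voice is required,
    or which voices exist, is left to the remote API to decide.
    """
    if not isinstance(text, str) or not text:
        raise ValueError("input text must be a non-empty string")
    if voice is not None and not isinstance(voice, str):
        raise TypeError("voice must be a string when provided")

    params: Dict[str, Any] = {"model": model, "input": text}
    # No default voice is injected: if the caller did not configure one,
    # the key is simply omitted and the API reports the validation error.
    if voice is not None:
        params["voice"] = voice
    if response_format is not None:
        params["response_format"] = response_format
    return params


without_voice = build_speech_params("tts-1", "Hello")
with_voice = build_speech_params("tts-1", "Hello", voice="shimmer")
```

Leaving `voice` out of the body when it is unset means a provider with a different voice catalogue can reuse the same builder unchanged.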
The TTS API returns binary audio
data directly. The implementation handles this by base64-encoding the
response and wrapping it in a File object within the standard
GenerativeAiResult structure. I would particularly welcome review of this
approach.
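To make the binary-handling question easier to review, here is a minimal sketch of the idea in Python. It is not the PR's implementation: `AudioFile` and `wrap_audio_response` are hypothetical stand-ins for the SDK's File object and response handling, and the `audio/mpeg` default MIME type is an assumption.

```python
import base64
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioFile:
    """Hypothetical stand-in for the SDK's File object."""

    mime_type: str
    base64_data: str

    def to_data_uri(self) -> str:
        # Inline representation that can be embedded in a JSON result.
        return f"data:{self.mime_type};base64,{self.base64_data}"


def wrap_audio_response(body: bytes, mime_type: str = "audio/mpeg") -> AudioFile:
    """Base64-encode raw audio bytes so they fit a text-based result structure."""
    if not body:
        raise ValueError("TTS response body is empty")
    encoded = base64.b64encode(body).decode("ascii")
    return AudioFile(mime_type=mime_type, base64_data=encoded)


# Fake bytes standing in for an HTTP response body; no network call is made.
fake_audio = b"ID3\x03\x00fake-mp3-bytes"
audio_file = wrap_audio_response(fake_audio)
```

One trade-off worth weighing in review: base64 inflates the payload by roughly a third, so long audio clips held in memory this way cost more than the raw bytes would.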