feat: support custom audio track input #29
Pull request overview
This PR adds support for custom audio track input to the Anam SDK, enabling developers to provide their own audio source for speech-to-text processing instead of relying on the default microphone input.
Changes:
- Added `audio_input_track` parameter to `ClientOptions` for configuring custom audio input
- Updated the WebRTC connection setup to use the custom audio track when provided
- Exported `AudioStreamTrack` and `AudioFrame` classes from the main package for easier access
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/anam/types.py | Added audio_input_track field to ClientOptions with documentation |
| src/anam/client.py | Passed audio_input_track option to the streaming client |
| src/anam/_streaming.py | Implemented custom audio track handling in WebRTC setup and cleanup |
| src/anam/__init__.py | Exported AudioStreamTrack and AudioFrame for public API access |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
… buffer on recv, no flush on connect
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/anam/_streaming.py">
<violation number="1" location="src/anam/_streaming.py:720">
P2: The upper-bound check uses 4800 instead of 48000, causing the warning to fire for all valid sample rates above 4.8 kHz. This makes the warning misleading and noisy for normal inputs.</violation>
</file>
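For reference, the kind of range check the violation describes can be sketched as follows. This is a minimal illustration, not the code in `src/anam/_streaming.py`: the function name, the logger name, and the 8000 Hz lower bound are assumptions made for the example; only the 48000 Hz upper bound comes from the review comment.

```python
import logging

logger = logging.getLogger("anam.example")  # hypothetical logger name

# Assumed lower bound, for illustration only; the review mentions just the upper bound.
MIN_SAMPLE_RATE_HZ = 8_000
# Upper bound named in the review (48000, not 4800).
MAX_SAMPLE_RATE_HZ = 48_000


def is_sample_rate_in_range(sample_rate: int) -> bool:
    """Return True if the rate is within the expected range, warning otherwise."""
    in_range = MIN_SAMPLE_RATE_HZ <= sample_rate <= MAX_SAMPLE_RATE_HZ
    if not in_range:
        logger.warning(
            "Unusual sample rate %d Hz (expected %d-%d Hz)",
            sample_rate,
            MIN_SAMPLE_RATE_HZ,
            MAX_SAMPLE_RATE_HZ,
        )
    return in_range
```

With the bound set to 48000, common rates such as 16000 and 48000 pass silently, while a typo like 4800 would have flagged both.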
…python-sdk into feat/support-audio-track-input
robbie-anam left a comment
This looks good to me, but on the overall purpose: just clarifying that this is "user audio", e.g. from a mic?
Just asking because if it's TTS audio, it will need to go over the WebSocket instead.
Summary by cubic
Adds support for sending user audio via a new send_user_audio(...) API that takes raw 16-bit PCM. Default behavior is unchanged when no audio is sent.
New Features
Refactors
Written for commit a63666a. Summary will update on new commits.
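To make "raw 16-bit PCM" concrete, here is a rough sketch of producing such a buffer with the Python standard library. The helper name `sine_pcm16`, the mono little-endian layout, and the 16000 Hz default are assumptions for illustration. The exact signature of `send_user_audio(...)` is not shown in this thread, so the call appears only as a comment.

```python
import math
import struct


def sine_pcm16(
    frequency_hz: float,
    duration_ms: int,
    sample_rate: int = 16_000,
    amplitude: float = 0.5,
) -> bytes:
    """Generate a mono sine tone as signed 16-bit little-endian PCM bytes."""
    num_samples = sample_rate * duration_ms // 1000
    samples = [
        int(amplitude * 32767 * math.sin(2 * math.pi * frequency_hz * i / sample_rate))
        for i in range(num_samples)
    ]
    # "<h" is a little-endian signed 16-bit integer; pack all samples at once.
    return struct.pack(f"<{num_samples}h", *samples)


# 20 ms of a 440 Hz tone at 16 kHz: 320 samples, 2 bytes each.
chunk = sine_pcm16(440.0, duration_ms=20)

# Hypothetical usage; the real parameters may differ:
# session.send_user_audio(chunk, ...)
```

Each sample occupies two bytes, so the buffer length is always twice the sample count for mono audio.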