Feature/voice agent by Co-vengers · Pull Request #386 · GetBindu/Bindu

Co-vengers · 2026-03-22T09:32:06Z

Voice Agent Extension — Progress & Documentation PR

Overview

This PR introduces the initial implementation of the Voice Agent Extension for Bindu, enabling real-time voice conversations between users and agents. The extension integrates backend, frontend, and testing components, following the architecture and plan outlined in docs/VOICE_AGENT_PLAN.md.

What’s Included

Backend

New voice extension module: bindu/extensions/voice/ with:
- __init__.py, voice_agent_extension.py, service_factory.py, pipeline_builder.py, session_manager.py, agent_bridge.py, audio_config.py
Endpoints: bindu/server/endpoints/voice_endpoints.py (REST + WebSocket)
Settings: bindu/settings.py updated with VoiceSettings
App integration: bindu/server/applications.py updated for conditional voice route registration and session manager
Capabilities: bindu/utils/capabilities.py updated for voice extension helpers
Penguin integration: bindu/penguin/bindufy.py updated to accept voice config and add the extension
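As a rough illustration of the capabilities helper and the conditional route registration listed above, the sketch below looks up a voice extension entry in an agent's capabilities. The dict layout (an `extensions` list whose entries carry a `uri`) follows the A2A agent-card shape. The `VOICE_EXTENSION_URI` value and the function names `get_voice_extension` and `voice_enabled` are hypothetical and are not taken from bindu/utils/capabilities.py.

```python
from typing import Any, Dict, Optional

# Hypothetical identifier; the real URI used by the extension is not shown in this PR text.
VOICE_EXTENSION_URI = "https://example.com/extensions/voice/v1"


def get_voice_extension(capabilities: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the voice extension entry from a capabilities dict, or None."""
    if not capabilities:
        return None
    # Tolerate a missing or null "extensions" field.
    for ext in capabilities.get("extensions") or []:
        if isinstance(ext, dict) and ext.get("uri") == VOICE_EXTENSION_URI:
            return ext
    return None


def voice_enabled(capabilities: Optional[Dict[str, Any]]) -> bool:
    """True when the agent advertises the (assumed) voice extension."""
    return get_voice_extension(capabilities) is not None
```

Under this assumption, the application would register the voice routes only when `voice_enabled(...)` returns True for the agent being served.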

Frontend

Voice UI and client:
- frontend/src/lib/services/voice-client.ts: WebSocket client, audio capture/playback
- frontend/src/lib/stores/voice.ts: Svelte stores for voice state and transcripts
- frontend/src/lib/components/voice/VoiceCallPanel.svelte, VoiceCallButton.svelte, LiveTranscript.svelte: UI components for voice session
Integration: Existing chat and agent message handler files updated for voice support

Tests

Unit tests for all major backend components:
- tests/unit/extensions/voice/test_voice_extension.py
- tests/unit/extensions/voice/test_session_manager.py
- tests/unit/extensions/voice/test_service_factory.py
- tests/unit/extensions/voice/test_agent_bridge.py
- tests/unit/extensions/voice/test_voice_endpoints.py

Examples & Docs

Example agent: examples/voice-agent/main.py, .env.example, and README.md
Plan: docs/VOICE_AGENT_PLAN.md (implementation plan)

Current Progress

All major backend, frontend, and test files are present and staged.
Integration into the main app and settings is in progress.
Endpoints and frontend integration are actively being refined.
Unit tests for the extension and its components are included.
Example agent and configuration are provided.
Documentation plan is present; full user-facing docs (docs/VOICE.md) are planned.

How to Test

Install dependencies:
- Ensure pipecat-ai[deepgram,elevenlabs,silero] and websockets are installed (see pyproject.toml voice group).
Set environment variables:
- VOICE__STT_API_KEY, VOICE__TTS_API_KEY (see .env.example)
Run backend tests:
- uv run pytest tests/unit/extensions/voice/ -v
Run frontend:
- Start the Svelte frontend and verify the voice call UI appears for voice-enabled agents.
Manual E2E:
- Start a voice session from the UI, speak, and verify agent responses and transcripts.
Check task persistence:
- After a session, verify conversation history via GET /tasks/get.
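The variable names in the environment step (`VOICE__STT_API_KEY`, `VOICE__TTS_API_KEY`) suggest a `VOICE` prefix with a double-underscore nested delimiter, a convention that pydantic-settings supports. The stdlib-only sketch below shows that mapping under this assumption. `VoiceEnv` and `load_voice_env` are hypothetical names, not the `VoiceSettings` class in bindu/settings.py.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VoiceEnv:
    """Hypothetical container for the two keys named in the testing steps."""

    stt_api_key: str
    tts_api_key: str


def load_voice_env(environ: Optional[Mapping[str, str]] = None) -> VoiceEnv:
    """Read VOICE__* variables and fail early if a key is missing or blank."""
    env = os.environ if environ is None else environ
    values = {}
    for field in ("stt_api_key", "tts_api_key"):
        name = f"VOICE__{field.upper()}"
        value = env.get(name, "").strip()
        if not value:
            raise RuntimeError(f"{name} is not set; see .env.example")
        values[field] = value
    return VoiceEnv(**values)
```

Calling `load_voice_env()` before starting a manual E2E session is one way to surface a missing key up front rather than mid-call.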

Next Steps & Improvements

Complete and verify all items in the implementation plan checklist (see docs/VOICE_AGENT_PLAN.md)
Finalize and publish user documentation (docs/VOICE.md)
Polish frontend UI/UX and error handling
Expand test coverage (integration, E2E, edge cases)
Lint and format: uv run pre-commit run --all-files
Optimize session cleanup and resource management
Add more example agents and configuration scenarios
Prepare for future extensions (telephony, WebRTC, multi-language, etc.)

References

Contributors:

@Co-vengers

For questions or feedback, please comment on this PR.

…Bindu#353) Worker accessed task_operation["_current_span"] but scheduler now sends primitive trace_id/span_id strings. Add _reconstruct_span() helper to rebuild a NonRecordingSpan from hex-encoded IDs with graceful fallback.
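A stdlib-only sketch of the validation step such a helper needs, assuming the IDs use the W3C Trace Context widths (32 hex characters for a trace ID, 16 for a span ID, with all-zero values treated as invalid). The function name `parse_trace_ids` is hypothetical. The `_reconstruct_span()` helper described above additionally wraps the result in a `NonRecordingSpan`; that step is omitted here so the example does not depend on opentelemetry.

```python
import string
from typing import Optional, Tuple

_HEX_DIGITS = set(string.hexdigits)


def _is_hex(value: object, width: int) -> bool:
    """True if value is a string of exactly `width` hexadecimal characters."""
    return (
        isinstance(value, str)
        and len(value) == width
        and all(ch in _HEX_DIGITS for ch in value)
    )


def parse_trace_ids(trace_id_hex: object, span_id_hex: object) -> Optional[Tuple[int, int]]:
    """Convert hex-encoded IDs to integers, or return None if either is unusable.

    Returning None is the graceful fallback: the caller can continue with an
    invalid (no-op) span context instead of raising inside the worker.
    """
    if not _is_hex(trace_id_hex, 32) or not _is_hex(span_id_hex, 16):
        return None
    trace_id = int(trace_id_hex, 16)
    span_id = int(span_id_hex, 16)
    # All-zero IDs are defined as invalid by the W3C Trace Context format.
    if trace_id == 0 or span_id == 0:
        return None
    return trace_id, span_id
```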

chandan-1427 · 2026-03-23T13:11:45Z

Hey, thanks for working on adding voice support — really appreciate the effort here.

I went through the implementation and there are a few areas we’ll need to address before merging:

Multi-worker compatibility: The current session handling relies on a local store, which won’t work reliably with Uvicorn’s multi-worker setup. We’ll need to move this to a centralized solution (e.g., Redis) to avoid state inconsistencies.
Transport & latency: The current flow is based on HTTP requests. For voice interactions, we should aim for a real-time streaming approach (like WebSockets or SSE) to reduce latency and improve responsiveness.
Base branch alignment: It looks like this was built on an older version of Bindu. There are conflicts with recent changes, so rebasing onto the latest main would help before proceeding.

Looking forward to the update!
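One way to prepare for the centralized store suggested in the multi-worker point is to put session state behind a small interface, so that an in-process dict can later be swapped for a shared backend such as Redis. The sketch below is illustrative only: `SessionStore`, `InMemorySessionStore`, and the TTL behaviour are assumptions, not the contents of session_manager.py in this PR.

```python
import time
from typing import Callable, Dict, Optional, Protocol, Tuple


class SessionStore(Protocol):
    """Minimal interface that a shared (e.g. Redis-backed) store would also implement."""

    def put(self, session_id: str, data: dict, ttl_seconds: float) -> None: ...

    def get(self, session_id: str) -> Optional[dict]: ...

    def delete(self, session_id: str) -> None: ...


class InMemorySessionStore:
    """Per-process store: fine for one worker, invisible to other workers."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, dict]] = {}

    def put(self, session_id: str, data: dict, ttl_seconds: float) -> None:
        # Store a copy so later mutation by the caller does not leak in.
        self._items[session_id] = (self._clock() + ttl_seconds, dict(data))

    def get(self, session_id: str) -> Optional[dict]:
        entry = self._items.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            # Expired sessions are dropped lazily on read.
            del self._items[session_id]
            return None
        return dict(data)

    def delete(self, session_id: str) -> None:
        self._items.pop(session_id, None)
```

Two `InMemorySessionStore` instances behave like two Uvicorn workers: a session written to one is not visible in the other, which is the inconsistency the review describes. A shared backend implementing the same three methods would remove that gap without changing callers.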

Co-vengers added 30 commits March 20, 2026 21:03

fix(scheduler): replace unbounded stream buffer with bounded limit

8cfc452

Replace math.inf buffer size with a constant of 100 to prevent unbounded memory growth while still allowing task enqueue before the worker loop is ready.
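The effect of a bounded buffer can be illustrated with the standard library's `asyncio.Queue`. This is a sketch of the idea only: the scheduler's actual stream type is not shown in this PR text, and `BUFFER_SIZE` simply mirrors the constant of 100 mentioned in the commit message.

```python
import asyncio
from typing import Tuple

BUFFER_SIZE = 100  # mirrors the bounded limit described in the commit message


async def demo() -> Tuple[int, bool]:
    """Enqueue before any consumer exists, then show that the bound is enforced."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=BUFFER_SIZE)

    # Producers can enqueue up to BUFFER_SIZE items before a worker starts reading.
    for task_id in range(BUFFER_SIZE):
        queue.put_nowait(task_id)

    # One more item would exceed the bound; put_nowait refuses instead of growing memory.
    try:
        queue.put_nowait(BUFFER_SIZE)
        overflowed = False
    except asyncio.QueueFull:
        overflowed = True

    return queue.qsize(), overflowed


result = asyncio.run(demo())
```

With `await queue.put(...)` instead of `put_nowait`, a producer would wait for free space rather than raise, which is the backpressure behaviour a bounded buffer is meant to provide.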

test: add opentelemetry.trace.span stubs for NonRecordingSpan imports

95cd021

Add SpanContext, TraceFlags, NonRecordingSpan, and INVALID_SPAN_CONTEXT mocks. Register opentelemetry.trace.span submodule so worker imports resolve in the test environment.

chore(voice): register voice extension in extensions module

5fa4b9e

feat(voice): add backend voice extension module (init, agent bridge, audio config, pipeline builder, service factory, session manager, extension class)

7473a6f

feat(bindufy): support voice extension config in agent creation

407661f

feat(server): add conditional voice endpoint registration and session manager

9a894ef

feat(voice-endpoints): add REST and WebSocket endpoints for voice sessions with tests

8c5031b

feat(settings): add VoiceSettings and extension config with tests

0ca9a7e

feat(capabilities): add helper for extracting voice extension from agent capabilities

2acf984

feat(service-factory): add service factory for STT/TTS with tests

b41b8f4

feat(session-manager): add session manager for voice sessions with tests

f0a69e0

feat(agent-bridge): add agent bridge processor for STT↔A2A↔TTS with tests

ba77468

feat(frontend): update ChatInput.svelte for voice session integration

2be8f1b

feat(frontend): update ChatWindow.svelte for voice overlay and transcript integration

65183c6

feat(frontend): add voice MIME type support in constants

6d23566

feat(frontend): update chat store for voice session state integration

2cf230c

feat(frontend): update agent message handler for voice transcript and state handling

79cfd7d

feat(frontend): add LiveTranscript.svelte for real-time voice transcript display

1c56128

feat(frontend): add VoiceCallButton.svelte for starting voice sessions

50ef596

feat(frontend): add VoiceCallPanel.svelte for voice call overlay UI

971b682

feat(frontend): add voice-client.ts for WebSocket audio transport and mic capture

a24ad7c

feat(frontend): add voice.ts store for voice session state and transcript management

fd231b5

chore(utils): update __init__.py for voice extension support

8dc61b2

docs(examples): update README with voice agent example info

d8effbd

chore(deps): add voice extension dependencies to pyproject.toml

467dfc1

chore(deps): update uv.lock for voice extension dependencies

07d75f6

docs(voice): add VOICE_AGENT_PLAN.md implementation plan

170b840

feat(examples): add example voice agent and config files

f5f2c76

test(extensions): add __init__.py for extensions unit tests

1b2c532

test(voice): add __init__.py for voice extension unit tests

6b8acb4

Co-vengers wants to merge 31 commits into GetBindu:main from Co-vengers:feature/voice-agent
