Refactor Audio Stack and Offline Voice Agent, Implement Comprehensive Audio Testing and Diagnostics Overview #6
Signed-off-by: chcavignx <ccubi73@gmail.com>
LiveReview Pre-Commit Check: ran (iter:7, coverage:100%)
LiveReview Pre-Commit Check: ran (iter:8, coverage:100%)
This commit introduces a comprehensive suite of integration tests for the audio library. The tests cover hardware detection, stream control, playback, and recording, aiming to validate audio functionality in deployment environments. A new README file (`AUDIO_TESTS_README.md`) provides detailed descriptions of each test, expected outputs, troubleshooting guidance, and integration strategies for CI/CD pipelines. Additionally, a quick start script (`QUICK_START_AUDIO_TESTS.sh`) and a runner script (`run_all_audio_tests.py`) are included to facilitate easy execution and management of these tests. This enhances the robustness and testability of the audio components.
Signed-off-by: chcavignx <ccubi73@gmail.com>
LiveReview Pre-Commit Check: vouched (iter:1, coverage:0%)
LiveReview Pre-Commit Check: ran (iter:1, coverage:0%)
LiveReview Pre-Commit Check: ran (iter:4, coverage:98%)
CI Feedback 🧐
A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
Review Summary by Qodo
Implement Offline Voice Agent with Comprehensive Audio Stack, Testing, and CI/CD Restructuring
Walkthroughs
Description
• Implements comprehensive offline voice agent with wake word detection, automatic speech recognition (ASR), and text-to-speech (TTS) synthesis
• Introduces new audio processing stack with ASREngine, TTSEngine, WakeWordDetector, and VADEngine classes supporting multiple backends (Faster-Whisper, OpenAI Whisper, Piper TTS, openWakeWord, Silero VAD)
• Adds 100+ unit tests covering audio engines, configuration system, utilities, and VAD/wake word detection with comprehensive mocking
• Implements 6+ end-to-end integration tests for microphone recording, audio playback, stream lifecycle, TTS-to-ASR pipeline, and sample transcription
• Refactors configuration system with new ASRConfig, AudioConfig, WakeConfig, VADConfig, and PlatformConfig classes supporting centralized audio settings
• Adds shared audio utilities module (audio_utils.py) with playback, validation, conversion, and error suppression helpers
• Restructures CI/CD pipelines to separate ARM testing from general testing with dedicated test-raspberry-pi job and coverage reporting
• Updates dependency management with uv and ruff, adds strict type-checking via basedpyright, and consolidates optional dependencies
• Expands documentation with audio implementation guides, integration testing documentation, and offline voice agent architecture
• Refactors example scripts and model download scripts with improved type safety, logging cleanup, and path handling
• Updates pre-commit hooks with basedpyright type-checking and restructured ruff stages
Diagram
flowchart LR
A["Audio Input<br/>Microphone"] -->|"16kHz PCM"| B["WakeWordDetector<br/>openWakeWord"]
B -->|"Wake Event"| C["ASREngine<br/>Faster-Whisper/Whisper"]
C -->|"Transcribed Text"| D["Intent Processing<br/>Response Generation"]
D -->|"Response Text"| E["TTSEngine<br/>Piper TTS"]
E -->|"Audio Output"| F["Speaker<br/>Playback"]
B -->|"VAD Segmentation"| G["VADEngine<br/>Silero VAD"]
G -->|"Speech Segments"| C
H["Config System<br/>ASRConfig/AudioConfig"] -.->|"Settings"| B
H -.->|"Settings"| C
H -.->|"Settings"| E
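
The flow in the diagram above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PR's implementation: the `Protocol` classes, their method names (`detected`, `is_speech`, `transcribe`, `synthesize`), the `run_pipeline` function, and the `Fake*` stubs are all hypothetical and do not reflect the real `WakeWordDetector`, `VADEngine`, `ASREngine`, or `TTSEngine` interfaces, which are not shown here.

```python
from typing import Callable, Iterable, List, Protocol

Frame = bytes  # one chunk of audio (16 kHz PCM in the diagram)


# Hypothetical interfaces; the real engine classes may look different.
class WakeWord(Protocol):
    def detected(self, frame: Frame) -> bool: ...


class VAD(Protocol):
    def is_speech(self, frame: Frame) -> bool: ...


class ASR(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...


def run_pipeline(
    frames: Iterable[Frame],
    wake: WakeWord,
    vad: VAD,
    asr: ASR,
    tts: TTS,
    respond: Callable[[str], str],
) -> List[bytes]:
    """Return one synthesized reply per utterance that follows a wake event."""
    replies: List[bytes] = []
    segment: List[Frame] = []
    listening = False

    def flush() -> None:
        # Speech segment -> text -> response text -> audio output.
        text = asr.transcribe(b"".join(segment))
        replies.append(tts.synthesize(respond(text)))
        segment.clear()

    for frame in frames:
        if not listening:
            # Idle: only the wake word detector looks at the audio.
            listening = wake.detected(frame)
        elif vad.is_speech(frame):
            segment.append(frame)
        elif segment:
            # Silence after speech closes the segment.
            flush()
            listening = False
    if segment:
        flush()  # stream ended in the middle of an utterance
    return replies


# Toy stand-ins so the sketch runs without any audio hardware or models.
class FakeWake:
    def detected(self, frame: Frame) -> bool:
        return frame == b"WAKE"


class FakeVAD:
    def is_speech(self, frame: Frame) -> bool:
        return frame != b""  # an empty frame stands in for silence


class FakeASR:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode()


class FakeTTS:
    def synthesize(self, text: str) -> bytes:
        return text.upper().encode()


frames = [b"", b"WAKE", b"turn ", b"on", b"", b"ignored"]
replies = run_pipeline(
    frames, FakeWake(), FakeVAD(), FakeASR(), FakeTTS(), lambda t: f"ok: {t}"
)
```

With the toy stubs, frames before the wake event and after the closing silence are dropped, so only the utterance between them reaches the ASR and TTS steps.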
File Changes
1. tests/audio/test_audio_engine_units.py
Code Review by Qodo
1. ASR import-time dep crash
fix will be done in next commit
fix in next commit
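
The details of the "ASR import-time dep crash" finding are not reproduced in this thread. Assuming it refers to an optional ASR backend being imported at module load time, one common way to avoid that class of crash is to defer the import until the backend is first used. The sketch below is a generic pattern, not the fix promised above; `load_backend` and `LazyASREngine` are hypothetical names and are not part of the PR.

```python
import importlib
from types import ModuleType
from typing import Optional


def load_backend(module_name: str) -> ModuleType:
    """Import an optional backend on demand and fail with a clear message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"ASR backend '{module_name}' is not installed; "
            "install the matching optional dependency to use it."
        ) from exc


class LazyASREngine:
    """Hypothetical stand-in for an engine class; not the PR's ASREngine."""

    def __init__(self, module_name: str) -> None:
        self._module_name = module_name
        self._backend: Optional[ModuleType] = None  # nothing imported yet

    @property
    def backend(self) -> ModuleType:
        # The import happens on first access, not when the module is loaded
        # or the engine is constructed.
        if self._backend is None:
            self._backend = load_backend(self._module_name)
        return self._backend


# Constructing the engine never touches the optional dependency.
engine = LazyASREngine("a_backend_that_is_not_installed")
try:
    engine.backend
    import_error_message = ""
except RuntimeError as err:
    import_error_message = str(err)
```

With this shape, importing the package and building the configuration still work on machines without the backend installed, and the error only surfaces when transcription is actually requested.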
Implement Offline Voice Agent and Refactor Audio Stack
Overview
This update introduces a new offline voice agent that integrates wake word detection, speech recognition, and text-to-speech. It significantly refactors the underlying audio processing stack with new dependencies and improved engine logic. The CI/CD pipelines were also changed to support dedicated ARM testing while relaxing overall type-checking rigor.
It also introduces a suite of integration and unit tests for the audio components, adds new diagnostic tools, and updates the pre-commit hooks. The focus is on verifying audio hardware interaction and engine functionality.
Finally, it adopts uv and ruff, refactoring the CI/CD pipelines and pyproject.toml to support them.