feat(engine): implement dynamic VAD thresholds for playback echo supp… by jjleng · Pull Request #52 · ferosai/feros

jjleng · 2026-04-16T18:40:04Z

…ression

jjleng · 2026-04-16T18:40:53Z

claude · 2026-04-16T18:41:10Z

Claude finished @jjleng's task in 1m 37s —— View job

Code Review

Gather context
Read changed files
Analyze and post review

Overall this is a clean, well-motivated change. The design decisions are sound and the code is clear. A few observations below, ordered by importance.

What's well done

Named constants over magic numbers — VAD_THRESHOLD_IDLE/PLAYBACK/PLAYBACK_RAW with doc-comments explaining why each value is chosen is much better than the previous hardcoded 0.7. Future tuning will be easy.
Packet-level threshold decision — setting the threshold once before process_frames() rather than once per frame is correct and efficient.
Dedup guard in VadStage::set_threshold — the if self.inner.threshold() != threshold check avoids unnecessary writes on every packet. Good micro-optimization.
Path-specific thresholds — distinguishing denoised (0.85) vs raw (0.90) paths is architecturally correct; the doc-comments make the reasoning explicit.

Observations

1. Mid-utterance threshold elevation

[reactor/audio.rs:36-40] If the bot starts playing audio while the user is already speaking, the threshold jumps from 0.70 → 0.85 mid-utterance. Frames where the VAD probability falls in the [0.70, 0.85) range will now register as silence, incrementing silence_frame_count. After 6 frames (~192ms) the VAD would emit a premature SpeechEnd, silently dropping the barge-in.

Whether this is acceptable depends on how often a bot starts playback while the user is mid-speech. If that scenario occurs (e.g., re-engagement prompts) it could cause silent barge-in failures that are hard to debug. A conservative guard:

// Only elevate threshold between utterances — mid-speech, keep the gate open.
if !self.vad.inner.is_speaking() {
    self.vad.set_threshold(if is_playing { ... } else { ... });
}

…though that would need a public is_speaking() on VadStage. Alternatively, documenting the known behaviour is sufficient if this scenario is considered unlikely. Fix this →

2. Denoiser-off edge case

[reactor/audio.rs:37] When denoise_enabled == false, audio reaches VAD unfiltered — the same situation as the native multimodal path — yet the Reactor path still uses VAD_THRESHOLD_PLAYBACK = 0.85 instead of VAD_THRESHOLD_PLAYBACK_RAW = 0.90. If echo suppression in raw audio was the motivation for the higher raw threshold, this path should arguably use VAD_THRESHOLD_PLAYBACK_RAW when the denoiser is disabled. Worth considering, even if it's an uncommon config.

3. set_min_volume() is added but not wired up

Both SileroVad and VadStage gain set_min_volume() accessors, but nothing in this PR calls them during playback switching. Raising min_volume during playback would provide a second layer of echo gating (amplitude-level, before the neural network). If this is deferred intentionally, a small comment like // reserved for future playback gating would prevent confusion.

4. No tests for threshold switching

SileroVad::process_with_prob was clearly designed for unit testing without the ONNX model. A simple test asserting that VadStage::set_threshold changes detection behavior (e.g., prob=0.80 → SpeechStart at idle threshold, but silent at playback threshold) would meaningfully increase confidence here. Low effort given the existing test scaffolding.

Minor nits

[reactor/audio.rs:27-31] The new is_playing comment block is inserted mid-way through the existing SAFETY NOTE comment, breaking its flow. Suggest moving the is_playing block and its comment after the safety note ends (after the closing paren on line 26 in the original).
[reactor/proc/vad.rs:22] float != float is correct here since values come from named constants, but worth noting if this comparison is ever extended to computed thresholds.

jjleng · 2026-04-16T18:41:23Z

#51

…ression

jjleng · 2026-04-16T19:09:23Z

@claude

claude · 2026-04-16T19:09:40Z

Claude finished @jjleng's task in 1m 50s —— View job

Unable to access issue #51

Gather context
Read changed files
Read issue Dynamic VAD Thresholding to Prevent False Barge-ins During Playback #51 — blocked (see below)

I've reviewed all five changed files in this PR, but I'm unable to access issue #51 — both gh issue view and web fetch are blocked by the current tool permissions.

To unblock me, you can either:

Grant gh CLI read permissions — add gh issue view * to the project's .claude/settings.json allowlist (or run /less-permission-prompts to auto-generate an allowlist), or
Paste the issue body here — copy the issue title + description into a reply so I can act on it directly.

Once I can read the issue I'll know exactly what to implement or review.

jjleng linked an issue Apr 16, 2026 that may be closed by this pull request

Dynamic VAD Thresholding to Prevent False Barge-ins During Playback #51

Closed

feat(engine): implement dynamic VAD thresholds for playback echo supp…

2263b2c

…ression

jjleng force-pushed the dynamic-vad branch from 9a33a97 to 2263b2c Compare April 16, 2026 19:08

jjleng merged commit 039fde0 into main Apr 16, 2026
1 check passed

jjleng deleted the dynamic-vad branch April 16, 2026 20:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(engine): implement dynamic VAD thresholds for playback echo supp…#52

feat(engine): implement dynamic VAD thresholds for playback echo supp…#52
jjleng merged 1 commit intomainfrom
dynamic-vad

jjleng commented Apr 16, 2026

Uh oh!

jjleng commented Apr 16, 2026

Uh oh!

claude bot commented Apr 16, 2026 •

edited

Loading

Uh oh!

jjleng commented Apr 16, 2026

Uh oh!

jjleng commented Apr 16, 2026

Uh oh!

claude bot commented Apr 16, 2026 •

edited by jjleng

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jjleng commented Apr 16, 2026

Uh oh!

jjleng commented Apr 16, 2026

Uh oh!

claude bot commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review

What's well done

Observations

Minor nits

Uh oh!

jjleng commented Apr 16, 2026

Uh oh!

jjleng commented Apr 16, 2026

Uh oh!

claude bot commented Apr 16, 2026 • edited by jjleng Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Unable to access issue #51

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude bot commented Apr 16, 2026 •

edited

Loading

claude bot commented Apr 16, 2026 •

edited by jjleng

Loading