Merged
10 changes: 9 additions & 1 deletion README.md
@@ -198,14 +198,22 @@ let cleaned = preprocessor.process(&raw_audio);

### ONNX Model Downloads

-Silero, TEN-VAD, and FireRedVAD models are downloaded automatically at build time. For offline or CI builds, point to a local model file:
+Silero, TEN-VAD, and FireRedVAD models are downloaded automatically at build time. The Silero backend is pinned to **v6.2.1** by default.
+
+For offline or CI builds, point to a local model file:

```sh
SILERO_MODEL_PATH=/path/to/silero_vad.onnx cargo build --features silero
TEN_VAD_MODEL_PATH=/path/to/ten-vad.onnx cargo build --features ten-vad
FIRERED_MODEL_PATH=/path/to/fireredvad.onnx FIRERED_CMVN_PATH=/path/to/cmvn.ark cargo build --features firered
```

+To use a different Silero model version, override the download URL:
+
+```sh
+SILERO_MODEL_URL=https://github.com/snakers4/silero-vad/raw/v6.0/src/silero_vad/data/silero_vad.onnx cargo build --features silero
+```
+
## Error Handling

All backends return `Result<f32, VadError>`. The error type covers:
13 changes: 9 additions & 4 deletions crates/wavekat-vad/build.rs
@@ -63,15 +63,18 @@ fn main() {
#[cfg(feature = "silero")]
fn setup_silero_model() {
const DEFAULT_MODEL_URL: &str =
-"https://github.com/snakers4/silero-vad/raw/master/src/silero_vad/data/silero_vad.onnx";
+"https://github.com/snakers4/silero-vad/raw/v6.2.1/src/silero_vad/data/silero_vad.onnx";
const SILERO_MODEL_NAME: &str = "silero_vad.onnx";
+// Bump this when updating the default model URL to invalidate cached downloads.
+const SILERO_MODEL_VERSION: &str = "v6.2.1";

// Re-run if these env vars change
println!("cargo:rerun-if-env-changed=SILERO_MODEL_PATH");
println!("cargo:rerun-if-env-changed=SILERO_MODEL_URL");

let out_dir = env::var("OUT_DIR").expect("OUT_DIR not set");
let model_path = Path::new(&out_dir).join(SILERO_MODEL_NAME);
+let version_path = Path::new(&out_dir).join("silero_vad.version");

// Option 1: Use local file if SILERO_MODEL_PATH is set
if let Ok(local_path) = env::var("SILERO_MODEL_PATH") {
@@ -91,8 +94,9 @@ fn setup_silero_model() {
return;
}

-// Skip download if already exists
-if model_path.exists() {
+// Skip download if model exists and version matches
+let cached_version = fs::read_to_string(&version_path).unwrap_or_default();
+if model_path.exists() && cached_version.trim() == SILERO_MODEL_VERSION {
return;
}

@@ -111,9 +115,10 @@ fn setup_silero_model() {
.expect("failed to read model bytes");

fs::write(&model_path, &bytes).expect("failed to write model file");
+fs::write(&version_path, SILERO_MODEL_VERSION).expect("failed to write version marker");

println!(
-"cargo:warning=Silero VAD model downloaded to {}",
+"cargo:warning=Silero VAD model ({SILERO_MODEL_VERSION}) downloaded to {}",
model_path.display()
);
}
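
The decision flow that `setup_silero_model` appears to follow after this change can be sketched in isolation. This is an illustration under stated assumptions, not the crate's code: `ModelSource`, `resolve_model_source`, and `cache_is_fresh` are hypothetical names, the environment values are passed in as arguments so the logic can be exercised outside a build script, and the use of `SILERO_MODEL_URL` ahead of the default URL is inferred from the README text rather than from lines visible in this hunk.

```rust
use std::fs;
use std::path::{Path, PathBuf};

// Default URL and version string as they appear in the diff above.
const DEFAULT_MODEL_URL: &str =
    "https://github.com/snakers4/silero-vad/raw/v6.2.1/src/silero_vad/data/silero_vad.onnx";
const SILERO_MODEL_VERSION: &str = "v6.2.1";

/// Hypothetical type: where the build script would take the model from.
#[derive(Debug, PartialEq)]
enum ModelSource {
    /// Copy a file the user pointed to (SILERO_MODEL_PATH).
    LocalFile(PathBuf),
    /// Fetch from a URL (SILERO_MODEL_URL, or the pinned default).
    Download(String),
}

/// Hypothetical helper: a local path wins over any URL; an explicit URL
/// (assumed behaviour) wins over the pinned default.
fn resolve_model_source(local_path: Option<String>, url_override: Option<String>) -> ModelSource {
    match (local_path, url_override) {
        (Some(path), _) => ModelSource::LocalFile(PathBuf::from(path)),
        (None, Some(url)) => ModelSource::Download(url),
        (None, None) => ModelSource::Download(DEFAULT_MODEL_URL.to_string()),
    }
}

/// Hypothetical helper mirroring the new skip condition: the cached model is
/// reused only if it exists and the marker file records the current version.
fn cache_is_fresh(model_path: &Path, version_path: &Path) -> bool {
    // A missing or unreadable marker becomes an empty string, so caches
    // written before the marker existed are treated as stale.
    let cached_version = fs::read_to_string(version_path).unwrap_or_default();
    model_path.exists() && cached_version.trim() == SILERO_MODEL_VERSION
}
```

In a build script the two arguments would come from `env::var("SILERO_MODEL_PATH").ok()` and `env::var("SILERO_MODEL_URL").ok()`. The marker in this sketch records only the version constant, so changing the URL alone does not make an existing cache look stale.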
8 changes: 4 additions & 4 deletions crates/wavekat-vad/src/backends/silero.rs
@@ -1,7 +1,7 @@
//! Silero VAD backend using ONNX Runtime.
//!
//! This backend wraps the [Silero VAD](https://github.com/snakers4/silero-vad)
-//! v5 model, a pre-trained LSTM neural network for voice activity detection.
+//! v6 model, a pre-trained LSTM neural network for voice activity detection.
//! It runs inference via ONNX Runtime (through the [`ort`](https://crates.io/crates/ort)
//! crate) and returns continuous speech probability scores between 0.0 and 1.0.
//!
@@ -24,7 +24,7 @@
//!
//! # Model Loading
//!
-//! The default ONNX model (Silero VAD v5) is embedded in the binary at
+//! The default ONNX model (Silero VAD v6) is embedded in the binary at
//! compile time — no external files are needed at runtime. For custom
//! models, use [`SileroVad::from_file`] or [`SileroVad::from_memory`].
//!
@@ -47,7 +47,7 @@ use ndarray::{Array1, Array2, Array3};
use ort::{inputs, session::Session, value::Tensor};
use std::time::{Duration, Instant};

-/// Embedded Silero VAD ONNX model (v5).
+/// Embedded Silero VAD ONNX model (v6).
/// Downloaded automatically at build time by build.rs.
const MODEL_BYTES: &[u8] = include_bytes!(concat!(env!("OUT_DIR"), "/silero_vad.onnx"));

@@ -57,7 +57,7 @@ const CONTEXT_SIZE: usize = 64;
/// LSTM hidden state shape: [2, 1, 128] (h and c states).
const STATE_DIM: usize = 128;

-/// Voice activity detector backed by the Silero VAD v5 ONNX model.
+/// Voice activity detector backed by the Silero VAD v6 ONNX model.
///
/// Uses an LSTM neural network to produce continuous speech probability
/// scores (0.0–1.0). Internal hidden state and a context buffer persist
13 changes: 11 additions & 2 deletions crates/wavekat-vad/src/lib.rs
@@ -110,15 +110,24 @@
//! ## ONNX model downloads
//!
//! The Silero, TEN-VAD, and FireRedVAD backends download their ONNX models
-//! automatically at build time. For offline or CI builds, set environment
-//! variables to point to local model files:
+//! automatically at build time. The Silero backend is pinned to **v6.2.1** by
+//! default.
+//!
+//! For offline or CI builds, set environment variables to point to local model
+//! files:
//!
//! ```sh
//! SILERO_MODEL_PATH=/path/to/silero_vad.onnx cargo build --features silero
//! TEN_VAD_MODEL_PATH=/path/to/ten-vad.onnx cargo build --features ten-vad
//! FIRERED_MODEL_PATH=/path/to/model.onnx FIRERED_CMVN_PATH=/path/to/cmvn.ark cargo build --features firered
//! ```
//!
+//! To use a different Silero model version, override the download URL:
+//!
+//! ```sh
+//! SILERO_MODEL_URL=https://github.com/snakers4/silero-vad/raw/v6.0/src/silero_vad/data/silero_vad.onnx cargo build --features silero
+//! ```
+//!
//! # Error handling
//!
//! All backends return [`Result<f32, VadError>`]. Check a backend's