fix: reduce diarization speaker label noise #5

Merged: loookashow merged 3 commits into main from fix/speaker-postprocessing on May 4, 2026
@loookashow
Contributor

Summary

This PR reduces speaker-label noise in three ways:

  • refines Spectral Clustering labels with spherical centroid reassignment
  • expands silhouette refinement around the BIC estimate instead of only trying k, k+1, k+2
  • converts overlapping embedding windows into a non-overlapping final timeline with local majority smoothing
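
To illustrate the first bullet, here is a minimal sketch of spherical centroid reassignment. It is not the code from this PR: the function name `reassign_to_spherical_centroids`, the `n_iter` parameter, and the pure-Python list representation are assumptions made for illustration. The idea shown is that each cluster centroid is the normalized mean of the unit-length embeddings currently assigned to it, and every embedding is then moved to the centroid with the highest cosine similarity.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def reassign_to_spherical_centroids(embeddings, labels, n_iter=2):
    """Hypothetical helper: refine initial cluster labels by cosine similarity.

    embeddings: list of equal-length float lists
    labels: initial cluster label per embedding (e.g. from spectral clustering)
    """
    unit = [_normalize(e) for e in embeddings]
    labels = list(labels)
    for _ in range(n_iter):
        # Sum the unit vectors of each cluster, then project back to the sphere.
        sums = {}
        for vec, lab in zip(unit, labels):
            acc = sums.setdefault(lab, [0.0] * len(vec))
            for i, x in enumerate(vec):
                acc[i] += x
        centroids = {lab: _normalize(acc) for lab, acc in sums.items()}

        # Move each embedding to the most cosine-similar centroid.
        new_labels = [
            max(
                sorted(centroids),
                key=lambda lab: sum(a * b for a, b in zip(vec, centroids[lab])),
            )
            for vec in unit
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Toy data: the last point lies close to cluster 0 but starts in cluster 1.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.95, 0.05]]
toy_labels = [0, 0, 1, 1, 1]
refined = reassign_to_spherical_centroids(toy_embeddings, toy_labels)
```

On this toy input the mislabelled last point moves to cluster 0 while the other assignments stay as they were.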

Benchmark

Evaluated on VoxConverse locally:

Before:

  • Weighted DER: ~8.46%
  • Mean DER: ~8.45%
  • Median DER: ~2.72%
  • Exact speaker-count match: 111/216

After:

  • Weighted DER: 5.16%
  • Mean DER: 6.65%
  • Median DER: 2.37%
  • Exact speaker-count match: 117/216
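
The three DER aggregates above can diverge considerably, so a small sketch of how such summaries are commonly computed may help when reading them. This assumes "weighted" means pooling error time over total reference speech time across files, which the PR does not state explicitly; the function name `summarize_der` and the toy numbers are purely illustrative and are not benchmark data.

```python
from statistics import mean, median


def summarize_der(files):
    """Hypothetical helper.

    files: list of (error_seconds, reference_speech_seconds) per recording.
    """
    per_file = [err / ref for err, ref in files]
    total_err = sum(err for err, _ in files)
    total_ref = sum(ref for _, ref in files)
    return {
        # Assumed definition: total error time over total reference time.
        "weighted": total_err / total_ref,
        # Every file counts equally, regardless of its length.
        "mean": mean(per_file),
        # Robust to a few very bad files.
        "median": median(per_file),
    }


# Made-up values: two easy long files and one hard short file.
toy_summary = summarize_der([(1.0, 100.0), (2.0, 100.0), (30.0, 50.0)])
```

In this toy case a single short, difficult file pulls the mean well above both the weighted value and the median, which is one reason to report all three.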

Notes

This improves the benchmark substantially, but it does not fully solve noisy speaker switching on real meeting audio. There are still cases where speaker labels flip too often inside a single speaker turn.
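
For context on the remaining flips, local majority smoothing over a per-frame label sequence can be sketched as follows. This is an assumption-laden illustration rather than the PR's implementation: the name `majority_smooth`, the odd `window` parameter, and the tie rule (keep the current label) are all hypothetical.

```python
from collections import Counter


def majority_smooth(labels, window=5):
    """Hypothetical helper: replace each label by the majority in a centered window."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i, current in enumerate(labels):
        # The window is truncated at the start and end of the sequence.
        neighborhood = labels[max(0, i - half): i + half + 1]
        counts = Counter(neighborhood)
        top = max(counts.values())
        if counts[current] == top:
            # Keep the current label when it is (tied for) the majority.
            smoothed.append(current)
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed


# A one-frame flip is removed with window=3 ...
single_flip = majority_smooth(["A", "A", "B", "A", "A", "B", "B", "B", "B"], window=3)
# ... but a two-frame flip survives the same window.
double_flip = majority_smooth(["A", "A", "B", "B", "A", "A"], window=3)
```

A filter like this only removes runs shorter than about half the window, so longer spurious switches inside a turn pass through unchanged unless the window is widened, at the cost of blurring genuine short turns.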

Validation

  • python -m ruff check src/diarize/clustering.py src/diarize/__init__.py tests/test_diarize.py
  • python -m pytest tests/test_diarize.py -q

codecov-commenter commented May 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@loookashow merged commit cf41454 into main on May 4, 2026
9 checks passed
@loookashow deleted the fix/speaker-postprocessing branch on May 4, 2026, 12:17