Fix fractional_beat clustering with sparse pulse data (Issue #28) #30

Merged
jon-myers merged 1 commit into main from fix-fractional-beat-clustering
Sep 9, 2025

@jon-myers

Summary

Resolves Issue #28 by adding a proportional-timing fallback for meters with sparse pulse data, as found in real transcription meters, so that fractional_beat values are distributed across the 0.0-1.0 range within each beat.

Root Cause Analysis

The clustering was caused not by the position-truncation logic but by a difference in the underlying pulse data:

Real Transcription Meters (Broken)

  • Theoretical: _pulses_per_cycle=32, _pulse_dur=0.128s
  • Actual: all_pulses count=2 ❌ (only 6% of expected pulses!)
  • Result: Pulse-based calculations fail → fractional_beat clustering 0.000-0.026

Synthetic Meters (Working)

  • Theoretical: _pulses_per_cycle=32, _pulse_dur=0.128s
  • Actual: all_pulses count=32 ✅ (100% complete pulse grid)
  • Result: Pulse-based calculations succeed → fractional_beat range 0.0-1.0
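The figures above can be checked with a few lines of arithmetic. This is only a sketch: `pulse_coverage` is a hypothetical helper, not part of the library, and the inputs are the counts quoted in the two lists.

```python
def pulse_coverage(actual_pulses: int, expected_pulses: int) -> float:
    """Fraction of the theoretical pulse grid that is actually present."""
    if expected_pulses <= 0:
        raise ValueError("expected_pulses must be positive")
    return actual_pulses / expected_pulses


# Theoretical grid quoted above: 32 pulses per cycle, 0.128 s per pulse.
cycle_dur = 32 * 0.128  # about 4.096 s per cycle

real_coverage = pulse_coverage(2, 32)        # 0.0625, i.e. about 6%
synthetic_coverage = pulse_coverage(32, 32)  # 1.0, a complete grid
```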

Solution Implementation

1. Sparse Pulse Detection

if len(self.all_pulses) < expected_pulses * 0.5:  # fewer than 50% of the expected pulses
    # Automatic fallback to proportional timing
    ...
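As a standalone illustration of the same check, the sketch below wraps the comparison in a function. The name `has_sparse_pulses`, the `threshold` parameter, and the representation of pulses as a plain list of times are assumptions for the example; only the 50% cutoff comes from the snippet above.

```python
from typing import Sequence

SPARSE_THRESHOLD = 0.5  # the 50% cutoff used in the snippet above


def has_sparse_pulses(
    all_pulses: Sequence[float],
    expected_pulses: int,
    threshold: float = SPARSE_THRESHOLD,
) -> bool:
    """Return True when fewer than `threshold` of the expected pulses exist."""
    return len(all_pulses) < expected_pulses * threshold


# Real transcription meter from the description: 2 of 32 pulses annotated.
real_is_sparse = has_sparse_pulses([0.0, 4.096], expected_pulses=32)

# Synthetic meter: a complete grid of 32 pulses spaced 0.128 s apart.
synthetic_grid = [i * 0.128 for i in range(32)]
synthetic_is_sparse = has_sparse_pulses(synthetic_grid, expected_pulses=32)
```

With a strict `<` comparison, a meter holding exactly half of its expected pulses is still treated as complete.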

2. Proportional Timing Methods

  • _calculate_proportional_level_start_time(): Uses cycle division instead of pulse indexing
  • _calculate_proportional_level_duration(): Uses hierarchical ratios instead of pulse differences
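A minimal sketch of what "cycle division" and "hierarchical ratios" could look like is given below. It is not the PR's implementation: the function names, the signature, the `hierarchy` list, and the assumption that every level splits its parent evenly are all illustrative.

```python
from math import prod
from typing import Sequence


def proportional_level_duration(
    cycle_dur: float, hierarchy: Sequence[int], level: int
) -> float:
    """Duration of one unit at `level`, where level 0 is the top-level beat."""
    # Each level splits its parent evenly, so divide the cycle by the product
    # of all subdivision counts down to and including `level`.
    return cycle_dur / prod(hierarchy[: level + 1])


def proportional_level_start_time(
    cycle_start: float,
    cycle_dur: float,
    hierarchy: Sequence[int],
    position: Sequence[int],
) -> float:
    """Start time of the unit addressed by `position`, e.g. (beat, subdivision)."""
    start = cycle_start
    for level, index in enumerate(position):
        start += index * proportional_level_duration(cycle_dur, hierarchy, level)
    return start


# Hypothetical meter: 4 beats x 4 subdivisions x 2 pulses = 32 pulses per cycle.
hierarchy = [4, 4, 2]
cycle_dur = 32 * 0.128  # theoretical cycle length from the figures quoted earlier

beat_dur = proportional_level_duration(cycle_dur, hierarchy, 0)
second_beat_start = proportional_level_start_time(0.0, cycle_dur, hierarchy, (1,))
third_subdivision_start = proportional_level_start_time(
    0.0, cycle_dur, hierarchy, (1, 2)
)
```

Neither function reads the pulse list, which is why this style of calculation is unaffected by how many pulses were annotated.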

3. Intelligent Fallback Strategy

  • Complete pulse data: Use precise pulse-based timing (synthetic meters)
  • Sparse pulse data: Use proportional timing (real transcription meters)
  • Bounds errors: Graceful fallback to proportional timing
  • Overflow cases: Proportional timing for edge cases
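The strategy in this list might be sketched as a single dispatch function. Everything here is hypothetical: `level_start_time`, the callable `proportional_start`, and the use of a plain list of pulse times stand in for whatever the library does internally.

```python
from typing import Callable, Sequence


def level_start_time(
    all_pulses: Sequence[float],
    expected_pulses: int,
    pulse_index: int,
    proportional_start: Callable[[], float],
) -> float:
    """Use pulse-based timing when possible, proportional timing otherwise."""
    # Sparse pulse data: fewer than 50% of the expected pulses are present.
    if len(all_pulses) < expected_pulses * 0.5:
        return proportional_start()
    # Bounds or overflow case: the requested pulse does not exist.
    if not 0 <= pulse_index < len(all_pulses):
        return proportional_start()
    # Complete pulse data: read the start time directly from the pulse grid.
    return all_pulses[pulse_index]


complete_grid = [i * 0.128 for i in range(32)]
sparse_grid = [0.0, 4.096]


def fallback() -> float:
    return 1.024  # e.g. a beat start derived from cycle division


from_pulses = level_start_time(complete_grid, 32, 8, fallback)
from_sparse = level_start_time(sparse_grid, 32, 8, fallback)
from_overflow = level_start_time(complete_grid, 32, 40, fallback)
```

An explicit bounds check is used instead of catching `IndexError` so that a negative index cannot silently wrap around to the end of the list.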

Results Comparison

Before Fix (Broken Clustering)

Real transcription fractional_beat: [0.000, 0.007, 0.013, 0.020, 0.026]
Range: 0.000 → 0.026 (clustering near 0!)

After Fix (Proper Distribution)

Real transcription fractional_beat: [0.104, 0.298, 0.492, 0.687, 0.881]  
Range: 0.104 → 0.881 (values now spread across the beat instead of clustering near 0)
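To make the before/after contrast concrete, the sketch below computes a fractional position as elapsed time divided by beat duration. The formula and the numbers are assumptions chosen for illustration; they are not taken from the library and are not meant to reproduce the values listed above. The point is only that an overestimated beat duration compresses every result toward 0.

```python
def fractional_beat(time: float, beat_start: float, beat_dur: float) -> float:
    """Position of `time` within a beat, clamped to the range 0.0-1.0."""
    if beat_dur <= 0:
        raise ValueError("beat_dur must be positive")
    fraction = (time - beat_start) / beat_dur
    return min(max(fraction, 0.0), 1.0)


# With a sensible beat duration the event lands mid-beat.
well_scaled = fractional_beat(0.5, beat_start=0.0, beat_dur=1.0)

# If the duration is overestimated 32-fold, the same event is squeezed
# toward the start of the beat.
compressed = fractional_beat(0.5, beat_start=0.0, beat_dur=32.0)
```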

Consistency Verification

Time 4.2s: Real=0.104 vs Synthetic=0.104 ✅ (nearly identical!)
Time 4.6s: Real=0.492 vs Synthetic=0.492 ✅
Time 4.8s: Real=0.687 vs Synthetic=0.687 ✅
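A comparison like the one above can be automated with a tolerance check. The sketch below uses the three reported value pairs. The tolerance of 0.001 and the helper name `is_consistent` are assumptions, not something the PR specifies.

```python
import math
from typing import Dict, Tuple

# (real, synthetic) fractional_beat values reported above, keyed by time in seconds.
reported: Dict[float, Tuple[float, float]] = {
    4.2: (0.104, 0.104),
    4.6: (0.492, 0.492),
    4.8: (0.687, 0.687),
}


def is_consistent(pairs: Dict[float, Tuple[float, float]], tol: float = 1e-3) -> bool:
    """True when every real/synthetic pair agrees within `tol`."""
    return all(math.isclose(real, synth, abs_tol=tol) for real, synth in pairs.values())


max_difference = max(abs(real - synth) for real, synth in reported.values())
all_consistent = is_consistent(reported)
```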

Testing Results

  • All 29 musical time tests pass (no regressions)
  • Real transcription data works correctly
  • Synthetic meter functionality preserved
  • Consistent behavior between both data types
  • Automatic detection and fallback

Impact

  • Visualization Fixed: Musical events now distribute properly across beat durations
  • API Preserved: No breaking changes to existing functionality
  • Robustness: Handles both complete and sparse pulse data seamlessly
  • Performance: Minimal overhead - only activates when needed

Backward Compatibility

This fix is fully backward compatible:

  • Existing synthetic meter usage unchanged
  • Real transcription data now works correctly
  • No API modifications required
  • Automatic detection and handling

🤖 Generated with Claude Code

…sue #28)

**Root Cause Analysis:**
The fractional_beat clustering issue was caused by sparse pulse data in real
transcription meters. While synthetic meters have complete pulse arrays (32/32
pulses), real transcription data contains only manually annotated pulses (2/32
pulses), breaking pulse-based timing calculations.

**Key Discovery:**
- Real meters: _pulses_per_cycle=32, all_pulses count=2 (6% of expected!)
- Synthetic meters: _pulses_per_cycle=32, all_pulses count=32 (100% complete)
- Previous fix using full positions was ineffective - didn't address pulse sparsity

**Solution:**
1. **Detection**: Check if pulse count < 50% of expected pulses
2. **Fallback**: Use proportional cycle timing instead of broken pulse indexing
3. **Preservation**: Maintain existing pulse-based functionality for complete data

**New Methods:**
- `_calculate_proportional_level_start_time()`: Beat boundaries via cycle division
- `_calculate_proportional_level_duration()`: Unit durations via hierarchical ratios

**Results Validation:**
- Real data fractional_beat: 0.000-0.026 → 0.104-0.881 ✅
- Synthetic data: unchanged (0.104-0.881) ✅
- Near-identical values between real and synthetic meters ✅
- All 29 tests passing with no regressions ✅

This fix ensures musical visualizations properly distribute events across beat
durations instead of clustering them at beat boundaries, resolving the core
issue described in GitHub Issue #28.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

github-actions bot commented Sep 9, 2025

📦 Test Package Built Successfully!

This PR has been automatically built and uploaded to TestPyPI for testing.

🔗 TestPyPI Link: https://test.pypi.org/project/idtap/

To test this version:

pip install --index-url https://test.pypi.org/simple/ idtap

✅ All tests passed and package builds successfully.

jon-myers merged commit e8282ca into main on Sep 9, 2025
1 check passed