
Conversation


@codeflash-ai codeflash-ai bot commented Dec 4, 2025

📄 6% (0.06x) speedup for KalmanFilterXYWH.update in ultralytics/trackers/utils/kalman_filter.py

⏱️ Runtime: 12.8 milliseconds → 12.0 milliseconds (best of 163 runs)

📝 Explanation and details

The optimization improves performance by 6% through three key linear algebra optimizations in the Kalman filter update step:

What optimizations were applied:

  1. Precomputed intermediate matrix multiplication: Extracted np.dot(covariance, self._update_mat.T) into a reusable cov_update variable instead of computing it inline within the cho_solve call
  2. Replaced multi_dot with direct matmul calls: Split the three-matrix multiplication np.linalg.multi_dot((kalman_gain, projected_cov, kalman_gain.T)) into two sequential np.matmul operations
  3. Added memory optimization flag: Used overwrite_b=True in scipy.linalg.cho_solve to allow in-place operations on the input array
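
The PR diff itself is not reproduced in this conversation, so here is a minimal sketch of how the update body might look with these three changes applied, shown as a standalone method body and assuming the usual Deep SORT-style structure of KalmanFilterXYWH.update (project(), _update_mat, and every name other than cov_update come from that assumed structure, not from the PR):

```python
import numpy as np
import scipy.linalg


def update(self, mean, covariance, measurement):
    """Kalman correction step with the three micro-optimizations applied (illustrative sketch)."""
    projected_mean, projected_cov = self.project(mean, covariance)

    chol_factor, lower = scipy.linalg.cho_factor(projected_cov, lower=True, check_finite=False)

    # (1) Compute covariance @ H^T once and reuse it rather than building it inline.
    cov_update = np.dot(covariance, self._update_mat.T)

    # (3) overwrite_b=True lets cho_solve work in place on the right-hand side.
    kalman_gain = scipy.linalg.cho_solve(
        (chol_factor, lower), cov_update.T, check_finite=False, overwrite_b=True
    ).T

    innovation = measurement - projected_mean
    new_mean = mean + np.dot(innovation, kalman_gain.T)

    # (2) Two explicit matmuls replace np.linalg.multi_dot for the fixed three-matrix product.
    new_covariance = covariance - np.matmul(np.matmul(kalman_gain, projected_cov), kalman_gain.T)
    return new_mean, new_covariance
```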

Why these optimizations provide speedup:

  • Reduced redundant computation: The precomputed cov_update eliminates duplicate matrix multiplication that was happening inside cho_solve
  • Optimized matrix operations: For exactly three matrices, two sequential np.matmul calls are faster than np.linalg.multi_dot, which has overhead for handling variable numbers of matrices and additional checks
  • Memory efficiency: The overwrite_b=True parameter reduces memory allocations by allowing SciPy to modify the input array in-place during the solve operation
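
As a rough way to isolate the multi_dot-vs-matmul difference, here is a small timeit comparison on the same shapes the filter uses (an 8x4 gain and a 4x4 projected covariance). This is purely illustrative; the absolute numbers depend on the NumPy/BLAS build and the machine:

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
K = rng.standard_normal((8, 4))      # stands in for kalman_gain
S = rng.standard_normal((4, 4))
S = S @ S.T + 4.0 * np.eye(4)        # symmetric positive definite, like projected_cov

t_multi = timeit.timeit(lambda: np.linalg.multi_dot((K, S, K.T)), number=50_000)
t_matmul = timeit.timeit(lambda: np.matmul(np.matmul(K, S), K.T), number=50_000)
print(f"multi_dot:   {t_multi:.3f} s")
print(f"two matmuls: {t_matmul:.3f} s")
```

For matrices this small, multi_dot's per-call planning overhead dominates the actual arithmetic, which is why hard-coding the multiplication order pays off here.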

Performance characteristics from test results:
The optimization shows consistent 6-7% speedups on large-scale test cases (batch processing, repeated updates, randomized inputs) while maintaining identical numerical results. Single-operation tests show more variable performance due to measurement noise, but the optimization particularly benefits workloads that perform many Kalman filter updates, which is typical in object tracking scenarios where this filter would be applied frame-by-frame to multiple tracked objects.
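
For context on where these per-call savings accumulate, here is a minimal frame-by-frame usage sketch. It relies only on the public initiate/predict/update methods of KalmanFilterXYWH; the detection values are made up for illustration:

```python
import numpy as np

from ultralytics.trackers.utils.kalman_filter import KalmanFilterXYWH

kf = KalmanFilterXYWH()

# Start a track from the first detection: (center x, center y, width, height).
mean, covariance = kf.initiate(np.array([50.0, 60.0, 20.0, 40.0]))

# Each new frame: propagate the state, then correct it with the matched detection.
for detection in (np.array([51.0, 61.0, 20.5, 40.5]), np.array([52.5, 62.0, 21.0, 41.0])):
    mean, covariance = kf.predict(mean, covariance)
    mean, covariance = kf.update(mean, covariance, detection)  # the optimized step

print(mean[:4])  # current (x, y, w, h) estimate for the tracked box
```

In a tracker such as ByteTrack or BoT-SORT this update runs once per matched track per frame, so even a few microseconds saved per call adds up across many tracks and long videos.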

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 794 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Generated Regression Tests and Runtime
import numpy as np

# imports
import pytest

# function to test
from ultralytics.trackers.utils.kalman_filter import KalmanFilterXYWH

# unit tests

# ---------------------- BASIC TEST CASES ----------------------


def test_update_basic_identity_covariance():
    """Test update with identity covariance and simple mean/measurement."""
    kf = KalmanFilterXYWH()
    mean = np.array([0, 0, 1, 1, 0, 0, 0, 0], dtype=float)
    covariance = np.eye(8)
    measurement = np.array([1, 2, 3, 4], dtype=float)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 44.1μs -> 45.8μs (3.64% slower)


def test_update_basic_zero_velocity():
    """Test update when velocities are zero and measurement matches mean position."""
    kf = KalmanFilterXYWH()
    mean = np.array([5, 5, 2, 2, 0, 0, 0, 0], dtype=float)
    covariance = np.eye(8)
    measurement = np.array([5, 5, 2, 2], dtype=float)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 39.4μs -> 39.5μs (0.324% slower)


def test_update_basic_nontrivial_covariance():
    """Test update with nontrivial covariance matrix."""
    kf = KalmanFilterXYWH()
    mean = np.array([10, 20, 5, 5, 0, 0, 0, 0], dtype=float)
    covariance = np.diag([1, 2, 3, 4, 5, 6, 7, 8])
    measurement = np.array([12, 18, 6, 4], dtype=float)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 41.6μs -> 40.5μs (2.90% faster)


# ---------------------- EDGE TEST CASES ----------------------


def test_update_edge_large_values():
    """Test update with very large values."""
    kf = KalmanFilterXYWH()
    mean = np.array([1e6, -1e6, 1e5, 1e5, 0, 0, 0, 0], dtype=float)
    covariance = np.eye(8) * 1e6
    measurement = np.array([1e6 + 1, -1e6 - 1, 1e5 + 2, 1e5 - 2], dtype=float)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 36.0μs -> 37.2μs (3.31% slower)


def test_update_edge_small_values():
    """Test update with very small values."""
    kf = KalmanFilterXYWH()
    mean = np.array([1e-6, -1e-6, 1e-5, 1e-5, 0, 0, 0, 0], dtype=float)
    covariance = np.eye(8) * 1e-6
    measurement = np.array([2e-6, -2e-6, 2e-5, 2e-5], dtype=float)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 35.6μs -> 36.2μs (1.66% slower)


def test_update_edge_zero_covariance():
    """Test update with zero covariance (should not crash, but may not update)."""
    kf = KalmanFilterXYWH()
    mean = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)
    covariance = np.zeros((8, 8))
    measurement = np.array([2, 2, 2, 2], dtype=float)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 36.8μs -> 38.0μs (3.07% slower)


def test_update_edge_negative_covariance():
    """Test update with negative values in covariance (should raise or produce non-PSD matrix)."""
    kf = KalmanFilterXYWH()
    mean = np.array([0, 0, 1, 1, 0, 0, 0, 0], dtype=float)
    covariance = -np.eye(8)  # Not positive semi-definite
    measurement = np.array([1, 1, 1, 1], dtype=float)
    with pytest.raises(np.linalg.LinAlgError):
        # Should fail in cho_factor due to non-PSD matrix
        kf.update(mean, covariance, measurement)  # 23.1μs -> 26.1μs (11.7% slower)


def test_update_edge_invalid_shapes():
    """Test update with invalid shapes for mean, covariance, or measurement."""
    kf = KalmanFilterXYWH()
    mean = np.array([0, 0, 1, 1, 0, 0, 0, 0], dtype=float)
    covariance = np.eye(8)
    # Measurement too short
    measurement = np.array([1, 2, 3], dtype=float)
    with pytest.raises(ValueError):
        kf.update(mean, covariance, measurement)  # 37.4μs -> 37.2μs (0.678% faster)
    # Mean wrong shape
    mean_bad = np.array([0, 0, 1, 1], dtype=float)
    with pytest.raises(ValueError):
        kf.update(mean_bad, covariance, np.array([1, 2, 3, 4], dtype=float))  # 8.57μs -> 8.46μs (1.28% faster)
    # Covariance wrong shape
    covariance_bad = np.eye(4)
    with pytest.raises(ValueError):
        kf.update(
            np.array([0, 0, 1, 1, 0, 0, 0, 0], dtype=float), covariance_bad, np.array([1, 2, 3, 4], dtype=float)
        )  # 10.4μs -> 9.94μs (4.29% faster)


def test_update_edge_nan_inf_input():
    """Test update with NaN or Inf in inputs."""
    kf = KalmanFilterXYWH()
    mean = np.array([0, 0, np.nan, 1, 0, 0, 0, 0], dtype=float)
    covariance = np.eye(8)
    measurement = np.array([1, 1, 1, 1], dtype=float)
    with pytest.raises(ValueError):
        kf.update(mean, covariance, measurement)
    mean = np.array([0, 0, 1, 1, 0, 0, 0, 0], dtype=float)
    measurement = np.array([np.inf, 1, 1, 1], dtype=float)
    with pytest.raises(ValueError):
        kf.update(mean, covariance, measurement)


# ---------------------- LARGE SCALE TEST CASES ----------------------


def test_update_large_batch():
    """Test update with a batch of 100 means/covariances/measurements."""
    kf = KalmanFilterXYWH()
    n = 100
    means = np.tile(np.array([10, 20, 30, 40, 0, 0, 0, 0], dtype=float), (n, 1))
    covariances = np.tile(np.eye(8)[None, :, :], (n, 1, 1)) * 2.0
    measurements = np.tile(np.array([11, 19, 31, 39], dtype=float), (n, 1))
    # Run update for each batch element
    for i in range(n):
        new_mean, new_cov = kf.update(means[i], covariances[i], measurements[i])  # 1.51ms -> 1.42ms (6.23% faster)


def test_update_large_extreme_values():
    """Test update with large batch and extreme values."""
    kf = KalmanFilterXYWH()
    n = 50
    means = np.linspace(-1e3, 1e3, n * 8).reshape(n, 8)
    covariances = np.tile(np.eye(8)[None, :, :], (n, 1, 1)) * 1e3
    measurements = np.linspace(-2e3, 2e3, n * 4).reshape(n, 4)
    for i in range(n):
        new_mean, new_cov = kf.update(means[i], covariances[i], measurements[i])  # 757μs -> 711μs (6.49% faster)


def test_update_large_randomized():
    """Test update with randomized inputs and check determinism."""
    kf = KalmanFilterXYWH()
    np.random.seed(42)
    n = 100
    means = np.random.randn(n, 8) * 10
    covariances = np.tile(np.eye(8)[None, :, :], (n, 1, 1)) * np.random.uniform(1, 5, n)[:, None, None]
    measurements = np.random.randn(n, 4) * 10
    results = []
    for i in range(n):
        new_mean, new_cov = kf.update(means[i], covariances[i], measurements[i])  # 1.51ms -> 1.42ms (6.73% faster)
        results.append((new_mean, new_cov))
    # Run again with same seed and check for exact match
    np.random.seed(42)
    means2 = np.random.randn(n, 8) * 10
    covariances2 = np.tile(np.eye(8)[None, :, :], (n, 1, 1)) * np.random.uniform(1, 5, n)[:, None, None]
    measurements2 = np.random.randn(n, 4) * 10
    for i in range(n):
        new_mean2, new_cov2 = kf.update(means2[i], covariances2[i], measurements2[i])  # 1.45ms -> 1.36ms (6.52% faster)
        assert np.allclose(new_mean2, results[i][0])
        assert np.allclose(new_cov2, results[i][1])


# ---------------------- ADDITIONAL EDGE CASES ----------------------


def test_update_edge_extreme_aspect_ratio_width_height():
    """Test update with extreme width/height/aspect ratio values."""
    kf = KalmanFilterXYWH()
    mean = np.array([0, 0, 1e-9, 1e9, 0, 0, 0, 0], dtype=float)
    covariance = np.eye(8)
    measurement = np.array([0, 0, 1e-9, 1e9], dtype=float)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 40.0μs -> 41.5μs (3.77% slower)


def test_update_edge_zero_width_height():
    """Test update with zero width and height in measurement."""
    kf = KalmanFilterXYWH()
    mean = np.array([10, 10, 5, 5, 0, 0, 0, 0], dtype=float)
    covariance = np.eye(8)
    measurement = np.array([10, 10, 0, 0], dtype=float)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 35.3μs -> 35.5μs (0.782% slower)


def test_update_edge_negative_width_height():
    """Test update with negative width and height in measurement (should not crash but may be physically invalid)."""
    kf = KalmanFilterXYWH()
    mean = np.array([10, 10, 5, 5, 0, 0, 0, 0], dtype=float)
    covariance = np.eye(8)
    measurement = np.array([10, 10, -5, -5], dtype=float)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 34.8μs -> 35.2μs (0.992% slower)


# ---------------------- FUNCTIONALITY AND INVARIANT TESTS ----------------------


def test_update_invariant_covariance_symmetric():
    """Covariance matrix after update should remain symmetric."""
    kf = KalmanFilterXYWH()
    mean = np.random.randn(8)
    covariance = np.eye(8)
    measurement = np.random.randn(4)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 36.4μs -> 36.6μs (0.574% slower)
    assert np.allclose(new_cov, new_cov.T)


def test_update_invariant_mean_dimension():
    """Mean vector after update should remain 8-dimensional."""
    kf = KalmanFilterXYWH()
    mean = np.random.randn(8)
    covariance = np.eye(8)
    measurement = np.random.randn(4)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 35.2μs -> 37.6μs (6.46% slower)
    assert new_mean.shape == (8,)


def test_update_invariant_covariance_dimension():
    """Covariance matrix after update should remain 8x8."""
    kf = KalmanFilterXYWH()
    mean = np.random.randn(8)
    covariance = np.eye(8)
    measurement = np.random.randn(4)
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 36.6μs -> 38.2μs (4.29% slower)
    assert new_cov.shape == (8, 8)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import numpy as np

# imports
import pytest
from ultralytics.trackers.utils.kalman_filter import KalmanFilterXYWH

# unit tests


@pytest.fixture
def kf():
    # Fixture for a fresh KalmanFilterXYWH instance
    return KalmanFilterXYWH()


# 1. Basic Test Cases


def test_update_basic_identity_covariance(kf):
    # Basic: mean and measurement are close, identity covariance
    mean = np.array([1.0, 2.0, 3.0, 4.0, 0, 0, 0, 0])
    covariance = np.eye(8)
    measurement = np.array([1.1, 2.1, 3.1, 4.1])
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 37.4μs -> 38.9μs (3.82% slower)
    # The updated mean should be closer to measurement than the prior mean
    for i in range(4):
        assert abs(new_mean[i] - measurement[i]) <= abs(mean[i] - measurement[i])
    # The updated covariance should remain (numerically) positive semi-definite
    eigvals = np.linalg.eigvalsh(new_cov)
    assert np.all(eigvals >= -1e-8)


def test_update_basic_zero_velocity(kf):
    # Basic: zero velocity, measurement matches mean
    mean = np.array([5, 6, 7, 8, 0, 0, 0, 0])
    covariance = np.eye(8)
    measurement = np.array([5, 6, 7, 8])
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 49.8μs -> 49.6μs (0.399% faster)


def test_update_basic_nontrivial_velocity(kf):
    # Basic: nonzero velocity, measurement is offset
    mean = np.array([10, 20, 30, 40, 1, -1, 2, -2])
    covariance = np.eye(8) * 2
    measurement = np.array([11, 19, 32, 38])
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 44.9μs -> 45.6μs (1.53% slower)
    # The mean should move toward the measurement
    for i in range(4):
        assert abs(new_mean[i] - measurement[i]) <= abs(mean[i] - measurement[i])


def test_update_basic_diagonal_covariance(kf):
    # Basic: diagonal covariance, measurement far from mean
    mean = np.array([0, 0, 0, 1, 0, 0, 0, 0])
    covariance = np.eye(8) * 5
    measurement = np.array([10, -10, 5, 2])
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 44.0μs -> 44.8μs (1.77% slower)
    # Updated mean should move toward measurement
    for i in range(4):
        assert abs(new_mean[i] - measurement[i]) <= abs(mean[i] - measurement[i])


# 2. Edge Test Cases


def test_update_edge_singular_covariance(kf):
    # Edge: covariance with zero in one direction (singular)
    mean = np.array([1, 2, 3, 4, 0, 0, 0, 0])
    covariance = np.eye(8)
    covariance[0, 0] = 0.0  # x direction has no uncertainty
    measurement = np.array([1, 2.5, 3.5, 4.5])
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 44.5μs -> 45.4μs (1.82% slower)
    # Other directions should move toward measurement
    for i in range(1, 4):
        assert abs(new_mean[i] - measurement[i]) <= abs(mean[i] - measurement[i])


def test_update_edge_large_covariance(kf):
    # Edge: very large covariance (high uncertainty)
    mean = np.array([100, 200, 300, 400, 0, 0, 0, 0])
    covariance = np.eye(8) * 1e6
    measurement = np.array([110, 210, 290, 410])
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 43.7μs -> 45.3μs (3.63% slower)
    # Mean should move close to measurement (high uncertainty)
    for i in range(4):
        assert abs(new_mean[i] - measurement[i]) <= abs(mean[i] - measurement[i])


def test_update_edge_small_covariance(kf):
    # Edge: very small covariance (low uncertainty)
    mean = np.array([10, 20, 30, 40, 0, 0, 0, 0])
    covariance = np.eye(8) * 1e-6
    measurement = np.array([100, 200, 300, 400])
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 42.6μs -> 45.7μs (6.63% slower)


def test_update_edge_extreme_measurement_values(kf):
    # Edge: measurement has extreme values
    mean = np.array([0, 0, 0, 1, 0, 0, 0, 0])
    covariance = np.eye(8)
    measurement = np.array([1e9, -1e9, 1e-9, 1e9])
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 45.0μs -> 46.0μs (2.03% slower)
    # Updated mean should move toward measurement, but not exactly to it
    for i in range(4):
        assert abs(new_mean[i] - measurement[i]) <= abs(mean[i] - measurement[i])


def test_update_edge_negative_width_height(kf):
    # Edge: measurement with negative width/height (physically invalid, but should not crash)
    mean = np.array([0, 0, 1, 1, 0, 0, 0, 0])
    covariance = np.eye(8)
    measurement = np.array([0, 0, -1, -1])
    new_mean, new_cov = kf.update(mean, covariance, measurement)  # 44.8μs -> 45.7μs (2.01% slower)


def test_update_edge_non_square_covariance(kf):
    # Edge: covariance is not 8x8
    mean = np.array([1, 2, 3, 4, 0, 0, 0, 0])
    covariance = np.eye(7)
    measurement = np.array([1, 2, 3, 4])
    with pytest.raises(ValueError):
        kf.update(mean, covariance, measurement)  # 36.0μs -> 35.9μs (0.058% faster)


def test_update_edge_non_vector_mean(kf):
    # Edge: mean is not a vector
    mean = np.eye(8)
    covariance = np.eye(8)
    measurement = np.array([1, 2, 3, 4])
    with pytest.raises(ValueError):
        kf.update(mean, covariance, measurement)  # 62.1μs -> 60.0μs (3.38% faster)


def test_update_large_scale_many_updates(kf):
    # Large scale: apply update repeatedly, simulating a track
    mean = np.array([0, 0, 10, 10, 1, 1, 0.5, 0.5])
    covariance = np.eye(8) * 5
    for i in range(100):
        measurement = np.array([i, i, 10 + i / 10, 10 + i / 10])
        mean, covariance = kf.update(mean, covariance, measurement)  # 1.73ms -> 1.61ms (7.23% faster)
        eigvals = np.linalg.eigvalsh(covariance)
        assert np.all(eigvals >= -1e-6)


def test_update_large_scale_batch(kf):
    # Large scale: update on a batch of means/covariances/measurements
    means = np.tile(np.array([10, 20, 30, 40, 0, 0, 0, 0]), (100, 1))
    covariances = np.tile(np.eye(8), (100, 1, 1))
    measurements = np.tile(np.array([12, 22, 32, 42]), (100, 1))
    # Run update for each
    for i in range(100):
        new_mean, new_cov = kf.update(means[i], covariances[i], measurements[i])  # 2.15ms -> 2.02ms (6.35% faster)
        # Updated mean should move toward measurement
        for j in range(4):
            assert abs(new_mean[j] - measurements[i][j]) <= abs(means[i][j] - measurements[i][j])
        eigvals = np.linalg.eigvalsh(new_cov)
        assert np.all(eigvals >= -1e-8)


def test_update_large_scale_extreme_values_batch(kf):
    # Large scale: batch with extreme values
    means = np.tile(np.array([1e6, -1e6, 1e-6, -1e-6, 0, 0, 0, 0]), (100, 1))
    covariances = np.tile(np.eye(8) * 1e3, (100, 1, 1))
    measurements = np.tile(np.array([1e6 + 10, -1e6 - 10, 1e-6 + 1e-7, -1e-6 - 1e-7]), (100, 1))
    for i in range(100):
        new_mean, new_cov = kf.update(means[i], covariances[i], measurements[i])  # 1.70ms -> 1.58ms (7.52% faster)
        # Mean should move toward measurement
        for j in range(4):
            assert abs(new_mean[j] - measurements[i][j]) <= abs(means[i][j] - measurements[i][j])
        eigvals = np.linalg.eigvalsh(new_cov)
        assert np.all(eigvals >= -1e-6)


def test_update_large_scale_randomized(kf):
    # Large scale: randomized means, covariances, measurements
    rng = np.random.default_rng(42)
    for _ in range(50):
        mean = rng.normal(0, 100, size=8)
        cov = np.eye(8) * rng.uniform(1, 100)
        measurement = rng.normal(0, 100, size=4)
        new_mean, new_cov = kf.update(mean, cov, measurement)  # 938μs -> 876μs (7.13% faster)
        # Mean should move toward measurement
        for i in range(4):
            assert abs(new_mean[i] - measurement[i]) <= abs(mean[i] - measurement[i])
        eigvals = np.linalg.eigvalsh(new_cov)
        assert np.all(eigvals >= -1e-6)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-KalmanFilterXYWH.update-mir9nm9g` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 December 4, 2025 10:01
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels on Dec 4, 2025