⚡️ Speed up method KalmanFilterXYWH.project by 29%
#41
📄 29% (0.29x) speedup for `KalmanFilterXYWH.project` in `ultralytics/trackers/utils/kalman_filter.py`
⏱️ Runtime: 13.3 milliseconds → 10.3 milliseconds (best of 230 runs)
📝 Explanation and details
The optimization achieves a 29% speedup by eliminating expensive NumPy function calls and using more efficient matrix operations.

Key optimizations:

1. Avoided `np.diag(np.square())` overhead: The original code used `np.diag(np.square(std))`, which creates an intermediate squared array and then a diagonal matrix. The optimized version constructs the 4x4 diagonal matrix directly with `np.zeros()` and assigns the diagonal elements with `innovation_cov.flat[::5] = std_sq`, eliminating the overhead of two function calls.
2. Replaced `np.linalg.multi_dot()` with the `@` operator: For the three-matrix product `_update_mat @ covariance @ _update_mat.T`, the `@` operator is more efficient than `np.linalg.multi_dot()`, which spends additional time analyzing the optimal multiplication order. That analysis is unnecessary for this simple case.
3. Cached repeated array accesses: Instead of accessing `mean[2]` and `mean[3]` multiple times (4 times each in the original), the optimized version stores them as `w` and `h`, reducing array indexing overhead.
4. Used vectorized array creation: The `std` calculation is now done with a single `np.array()` call instead of creating a Python list first, reducing conversion overhead.

Performance impact: The line profiler shows that the most expensive operations were reduced.

Test case benefits: The optimization shows consistent 20-45% speedups across all test scenarios, with particularly strong gains for edge cases with small values (48.9% faster) and error cases (up to 269% faster). This makes it robust across the diverse tracking scenarios in which this Kalman filter projects state distributions to measurement space.
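The optimizations described above can be sketched side by side as follows. This is a minimal, standalone illustration and not the actual diff from this PR: the module-level names `STD_WEIGHT_POSITION` and `UPDATE_MAT` are hypothetical stand-ins for the filter's `_std_weight_position` and `_update_mat` attributes, and their values are assumptions made so the example can run on its own.

```python
import numpy as np

# Hypothetical stand-ins for the filter's attributes (assumed values).
STD_WEIGHT_POSITION = 1.0 / 20
UPDATE_MAT = np.eye(4, 8)  # maps an 8-D state to a 4-D measurement


def project_original(mean, covariance):
    """Sketch of the pre-optimization approach described in the text."""
    std = [
        STD_WEIGHT_POSITION * mean[2],
        STD_WEIGHT_POSITION * mean[3],
        STD_WEIGHT_POSITION * mean[2],
        STD_WEIGHT_POSITION * mean[3],
    ]
    innovation_cov = np.diag(np.square(std))
    projected_mean = np.dot(UPDATE_MAT, mean)
    projected_cov = np.linalg.multi_dot((UPDATE_MAT, covariance, UPDATE_MAT.T))
    return projected_mean, projected_cov + innovation_cov


def project_optimized(mean, covariance):
    """Sketch of the optimized approach described in the text."""
    # Cache the width and height entries instead of indexing repeatedly.
    w, h = mean[2], mean[3]
    std = np.array([w, h, w, h]) * STD_WEIGHT_POSITION
    std_sq = std * std

    # Fill the diagonal of a 4x4 zero matrix in place. In the flattened
    # view, diagonal entries sit at indices 0, 5, 10, 15 (stride n + 1).
    innovation_cov = np.zeros((4, 4))
    innovation_cov.flat[::5] = std_sq

    projected_mean = UPDATE_MAT @ mean
    # Plain left-to-right product; no multiplication-order analysis.
    projected_cov = UPDATE_MAT @ covariance @ UPDATE_MAT.T
    return projected_mean, projected_cov + innovation_cov


# Build a random state and a symmetric positive semi-definite covariance.
rng = np.random.default_rng(0)
mean = rng.normal(size=8)
a = rng.normal(size=(8, 8))
covariance = a @ a.T

m_ref, c_ref = project_original(mean, covariance)
m_opt, c_opt = project_optimized(mean, covariance)
print(np.allclose(m_ref, m_opt), np.allclose(c_ref, c_opt))
```

Both functions should agree to floating-point tolerance, which is the property the generated regression tests are meant to check. Any speedup observed with this sketch depends on the machine and NumPy version and is not expected to reproduce the figures reported above.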
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-KalmanFilterXYWH.project-mir9hyae` and push.