⚡️ Speed up function create_array by 11%
#780
Open
📄 11% (0.11x) speedup for create_array in inference/core/workflows/execution_engine/v1/executor/output_constructor.py
⏱️ Runtime: 24.3 milliseconds → 21.9 milliseconds (best of 45 runs)
📝 Explanation and details
The optimization improves performance by replacing expensive repeated boolean indexing with efficient grouping. The key changes are:
What was optimized:
- Uses np.unique() and builds an idx_map dictionary to group rows by their first column index upfront, avoiding repeated indices[:, 0] == idx boolean operations inside the loop.
- Replaces boolean-mask indexing (indices[idx_selector]) with direct integer indexing (indices[idx_indices]), which is faster for NumPy arrays.
Why it's faster:
The original code performs boolean indexing (indices[:, 0] == idx) for every possible index in the range, which costs O(n×m) operations, where n is the array size and m is the max index. The optimization reduces this to O(n) by grouping once upfront and then using direct integer indexing.
Performance characteristics:
- np.unique() and dictionary creation costs
Impact on workloads:
Based on the function references, create_array is called during output construction in the workflow execution engine, a critical path for processing batch outputs. The optimization particularly benefits workflows with sparse index patterns or large batch sizes, which are common in computer vision pipelines where not all detection slots are filled.
✅ Correctness verification report:
⚙️ Existing Unit Tests and Runtime
- workflows/unit_tests/execution_engine/executor/test_output_constructor.py::test_create_array_for_dimension_one
- workflows/unit_tests/execution_engine/executor/test_output_constructor.py::test_create_array_for_dimension_three
- workflows/unit_tests/execution_engine/executor/test_output_constructor.py::test_create_array_for_dimension_two
⏪ Replay Tests and Runtime
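For readers who want to see the idea from "What was optimized" in isolation, here is a minimal sketch. It is not the code in this PR: the helper names group_rows_by_mask and group_rows_once are hypothetical, and indices is assumed to be a non-empty 2-D integer array whose first column holds the group index. The first function builds one boolean mask per candidate index; the second groups row positions once with np.unique() and then uses integer indexing.

```python
import numpy as np


def group_rows_by_mask(indices: np.ndarray) -> dict:
    """Baseline (hypothetical): one boolean mask per candidate index."""
    groups = {}
    max_idx = int(indices[:, 0].max())  # assumes a non-empty array
    for idx in range(max_idx + 1):
        idx_selector = indices[:, 0] == idx  # scans every row each time
        groups[idx] = indices[idx_selector]
    return groups


def group_rows_once(indices: np.ndarray) -> dict:
    """Grouped (hypothetical): find each group's row positions once."""
    first = indices[:, 0]
    unique_vals, inverse = np.unique(first, return_inverse=True)
    # A stable sort keeps rows of one group in their original order.
    order = np.argsort(inverse, kind="stable")
    boundaries = np.cumsum(np.bincount(inverse))[:-1]
    pieces = np.split(order, boundaries)
    idx_map = {int(v): rows for v, rows in zip(unique_vals, pieces)}

    empty = np.empty(0, dtype=np.intp)  # selects zero rows for unused indices
    max_idx = int(first.max())
    return {
        idx: indices[idx_map.get(idx, empty)]  # direct integer indexing
        for idx in range(max_idx + 1)
    }


if __name__ == "__main__":
    sample = np.array([[0, 0], [2, 1], [0, 1], [2, 0], [5, 3]])
    by_mask = group_rows_by_mask(sample)
    grouped = group_rows_once(sample)
    same = all(np.array_equal(by_mask[k], grouped[k]) for k in by_mask)
    print("results match:", same)
```

Both helpers return the same mapping, including empty arrays for indices that never occur, so the grouped version can stand in for the masked one in this toy setting. Whether it is faster depends on array size and sparsity, since the grouping step has its own fixed cost.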
To edit these changes, git checkout codeflash/optimize-create_array-miqiwooe and push.