⚡️ Speed up function create_tiles by 7%
#785
Open
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
📄 7% (0.07x) speedup for
create_tilesininference/core/utils/drawing.py⏱️ Runtime :
138 milliseconds→129 milliseconds(best of14runs)📝 Explanation and details
The optimized version improves the
_generate_tilesfunction by eliminating inefficient list operations and generator overhead.Key optimization: The original code used
create_batches()generator plus nested while loops with repeatedappend()operations to pad missing images. The optimized version:total_slots = rows * columns) and padding requirements in one stepimages + [pad_img] * (total_slots - n_images)instead of iterative appends[images[i * columns:(i + 1) * columns] for i in range(rows)]instead of thecreate_batches()generatorPerformance impact: The line profiler shows the optimization reduces
_generate_tilesexecution time from 138ms to 112ms (19% faster). This eliminates the generator overhead and reduces list mutation operations from O(missing_images) individual appends to a single O(1) list concatenation.Workload benefits: Based on the function references,
create_tilesis called in real-time video streaming workflows for displaying prediction visualizations. The 7% overall speedup becomes significant when processing continuous video frames, where this function is called repeatedly in the display pipeline. The optimization is particularly effective for test cases with many images (like the 999-image test showing 1.08% improvement) where padding operations are more frequent.Test case performance: The optimization shows consistent small improvements across most test cases (1-11% faster), with the largest gains in scenarios requiring significant grid padding or large numbers of images.
✅ Correctness verification report:
⚙️ Existing Unit Tests and Runtime
inference/unit_tests/core/utils/test_drawing.py::test_create_tiles_with_all_imagesinference/unit_tests/core/utils/test_drawing.py::test_create_tiles_with_all_images_and_custom_colorsinference/unit_tests/core/utils/test_drawing.py::test_create_tiles_with_all_images_and_custom_gridinference/unit_tests/core/utils/test_drawing.py::test_create_tiles_with_all_images_and_custom_grid_to_small_to_fit_imagesinference/unit_tests/core/utils/test_drawing.py::test_create_tiles_with_four_imagesinference/unit_tests/core/utils/test_drawing.py::test_create_tiles_with_one_imageinference/unit_tests/core/utils/test_drawing.py::test_create_tiles_with_one_image_and_enforced_gridinference/unit_tests/core/utils/test_drawing.py::test_create_tiles_with_three_imagesinference/unit_tests/core/utils/test_drawing.py::test_create_tiles_with_two_images🌀 Generated Regression Tests and Runtime
To edit these changes
git checkout codeflash/optimize-create_tiles-miqli9q8and push.