⚡️ Speed up function compute_intermediate_size by 13%
#873
+1
−1
📄 13% (0.13x) speedup for `compute_intermediate_size` in `src/transformers/models/olmo2/convert_olmo2_weights_to_hf.py`
⏱️ Runtime: 646 microseconds → 570 microseconds (best of 171 runs)
📝 Explanation and details
The optimization replaces `int(8 * n / 3)` with `(8 * n // 3)` in the computation, achieving a 13% speedup by eliminating unnecessary floating-point operations.

Key optimization:
- `int(ffn_dim_multiplier * int(8 * n / 3))` performs floating-point division (`/`), then converts the result to int
- `int(ffn_dim_multiplier * (8 * n // 3))` uses integer floor division (`//`) directly

Why this is faster:
- Floor division (`//`) operates entirely in integer arithmetic, avoiding the overhead of converting to float and back to int
- The `/` operator in Python creates a float intermediate result that must be cast back with `int()`, adding unnecessary computation
- `//` is the more direct and efficient operation

Performance characteristics:
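A rough way to reproduce the comparison locally is a `timeit` micro-benchmark of the two inner expressions. This is only a sketch: the helper names `float_version` and `floor_version` and the sample value of `n` are assumptions made for illustration, not code from the PR, and absolute timings will vary by machine and Python version.

```python
import timeit

# Hypothetical helpers that isolate the two inner expressions being compared.
def float_version(n: int) -> int:
    # True division yields a float, which int() then truncates.
    return int(8 * n / 3)

def floor_version(n: int) -> int:
    # Floor division stays in integer arithmetic throughout.
    return 8 * n // 3

n = 4096  # assumed sample hidden size, chosen only for illustration
runs = 200_000

t_float = timeit.timeit(lambda: float_version(n), number=runs)
t_floor = timeit.timeit(lambda: floor_version(n), number=runs)

print(f"int(8 * n / 3): {t_float:.4f} s for {runs} calls")
print(f"8 * n // 3:     {t_floor:.4f} s for {runs} calls")
```

Because the call overhead of the lambda and the helper dominates such a small expression, the measured gap may be smaller or larger than the 13% reported for the full function.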
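Mathematical equivalence:
Both expressions produce identical results for the inputs this function is meant to handle: `int(8 * n / 3)` and `(8 * n // 3)` yield the same integer quotient for non-negative integer `n` small enough that the float division stays exact enough (any realistic model dimension). They can differ for negative `n`, where `int()` truncates toward zero while `//` rounds toward negative infinity, and for very large `n`, where the float intermediate loses precision. Within the intended range, functional behavior is preserved while performance improves through more efficient arithmetic.

✅ Correctness verification report: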
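The equivalence claim can be checked directly. The sketch below compares the two inner expressions over a range of non-negative integers and then shows the two edge cases where they diverge. The helper names and the tested range are assumptions chosen for illustration, not part of the PR.

```python
# Hypothetical helpers that isolate the two inner expressions.
def float_version(n: int) -> int:
    return int(8 * n / 3)

def floor_version(n: int) -> int:
    return 8 * n // 3

# Non-negative inputs in a range far beyond typical hidden sizes: no mismatches.
mismatches = [n for n in range(100_000) if float_version(n) != floor_version(n)]
assert mismatches == []

# Negative input: int() truncates toward zero, // rounds toward negative infinity.
assert float_version(-1) == -2
assert floor_version(-1) == -3

# Very large input: the float intermediate cannot represent the quotient exactly.
assert float_version(10**17) != floor_version(10**17)
```

Neither edge case applies to a model dimension, which is a small positive integer, so the swap is safe for this function's intended use.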
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-compute_intermediate_size-misg3p9t` and push.