⚡️ Speed up function get_writer by 26%
#377
Open
📄 26% (0.26x) speedup for `get_writer` in `pandas/io/excel/_util.py`
⏱️ Runtime: 17.2 microseconds → 13.6 microseconds (best of 44 runs)
📝 Explanation and details
The optimization replaces a `try`/`except` block with a conditional check. The measured 26% speedup comes from no longer raising and catching `KeyError` when a lookup fails.
Key changes:
Replaced `try: return _writers[engine_name]` / `except KeyError:` with `if engine_name in _writers: return _writers[engine_name]`
Why this optimization works:
In Python, raising and catching an exception is significantly more expensive than a conditional check. The original code used exceptions for control flow, detecting a missing key by catching `KeyError`. The optimized version uses dictionary membership testing (the `in` operator), which is much cheaper than handling a raised exception. When the engine exists (the common case), neither version raises, so the saving comes from lookups that miss; the function returns the same writer either way.
Performance characteristics by test case:
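The hit and miss cases can be compared directly with a small `timeit` sketch. The one-entry dictionary `d`, the key names, and the iteration count are assumptions chosen for illustration; absolute numbers depend on the Python version and machine.

```python
import timeit

# Hypothetical one-entry registry standing in for `_writers`.
SETUP = "d = {'openpyxl': object()}"

# Lookup via exception handling: a miss raises and catches KeyError.
TRY_STMT = """
try:
    d[key]
except KeyError:
    pass
"""

# Lookup via membership test: a miss skips the indexing entirely.
IN_STMT = """
if key in d:
    d[key]
"""


def measure(stmt, key, number=200_000):
    """Return total seconds for `number` runs of `stmt` with the given key."""
    return timeit.timeit(stmt, setup=f"{SETUP}; key = {key!r}", number=number)


# Time every combination of lookup style and hit/miss case.
results = {
    (style, case): measure(stmt, key)
    for style, stmt in (("try/except", TRY_STMT), ("in check", IN_STMT))
    for case, key in (("hit", "openpyxl"), ("miss", "unknown"))
}

for (style, case), seconds in results.items():
    print(f"{style:>10} {case:>4}: {seconds:.4f} s")
```

Comparing the two "miss" rows shows the cost of raising and catching `KeyError`, while the two "hit" rows show what the extra membership test costs when the key is present.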
`in` check, but this is negligible in absolute terms (nanoseconds)
Impact on workloads:
Based on the function reference, `get_writer()` is called from `ExcelWriter.__new__()` during Excel file creation. This is likely a hot path when processing multiple Excel files or sheets. The optimization particularly benefits error-heavy workloads (invalid engine names) while having minimal impact on successful lookups, making it a net performance win.
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-get_writer-mihdjd73` and push.