⚡️ Speed up method BasicTokenizer._tokenize_chinese_chars by 12%
#879
📄 **12% (0.12x) speedup** for `BasicTokenizer._tokenize_chinese_chars` in `src/transformers/models/prophetnet/tokenization_prophetnet.py`

⏱️ **Runtime:** 1.60 milliseconds → 1.43 milliseconds (best of 209 runs)

📝 **Explanation and details**
The optimized code achieves an ~11% speedup through two key optimizations in the `_tokenize_chinese_chars` method:

1. **Method localization.** Frequently accessed methods are bound to local variables (`is_chinese_char = self._is_chinese_char` and `append = output.append`). This eliminates the overhead of attribute lookups inside the tight loop over each character: in Python, local variable access is faster than attribute access because it bypasses the attribute resolution machinery.

2. **Precomputed range storage.** The Chinese character ranges are now stored as tuples in `self._chinese_char_ranges` during initialization rather than being hardcoded in the `_is_chinese_char` method. The range-checking logic itself is unchanged, but this centralizes the Unicode block definitions and slightly improves cache locality.

A sketch of both changes together follows this list.

**Performance impact:** The test results show the optimization is most effective for longer texts, where the character loop runs many times. For edge cases such as empty strings or ASCII-only text it adds minimal overhead, with some cases showing a negligible slowdown due to the additional local variable assignments.
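As a rough illustration, here is a minimal sketch of what the two optimizations could look like together. The `BasicTokenizer` below is a stripped-down stand-in assuming the standard BERT-style CJK Unicode block ranges, not the PR's exact code:

```python
# Sketch only: a stand-in BasicTokenizer illustrating both optimizations.
class BasicTokenizer:
    def __init__(self):
        # CJK Unicode block ranges, precomputed once at init time instead of
        # being hardcoded inside _is_chinese_char.
        self._chinese_char_ranges = (
            (0x4E00, 0x9FFF),
            (0x3400, 0x4DBF),
            (0x20000, 0x2A6DF),
            (0x2A700, 0x2B73F),
            (0x2B740, 0x2B81F),
            (0x2B820, 0x2CEAF),
            (0xF900, 0xFAFF),
            (0x2F800, 0x2FA1F),
        )

    def _is_chinese_char(self, cp):
        """Checks whether `cp` is the code point of a CJK character."""
        for start, end in self._chinese_char_ranges:
            if start <= cp <= end:
                return True
        return False

    def _tokenize_chinese_chars(self, text):
        """Adds whitespace around any CJK character."""
        # Localize hot lookups: bind the method and list.append to locals
        # once, so the per-character loop avoids repeated attribute resolution.
        is_chinese_char = self._is_chinese_char
        output = []
        append = output.append
        for char in text:
            cp = ord(char)
            if is_chinese_char(cp):
                append(" ")
                append(char)
                append(" ")
            else:
                append(char)
        return "".join(output)
```

For example, `BasicTokenizer()._tokenize_chinese_chars("ab中c")` returns `"ab 中 c"`, matching the original behavior since only the lookup strategy changed.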
**Why It Works:** The speedup comes primarily from reducing Python's attribute lookup overhead during character iteration. Since `_tokenize_chinese_chars` processes text character by character and calls `_is_chinese_char` and `output.append` for each character, eliminating the repeated attribute lookups provides measurable performance gains, especially for longer texts where the loop executes hundreds or thousands of times. The micro-benchmark below isolates this effect.
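Here is a self-contained micro-benchmark (hypothetical, not part of the PR) that isolates the attribute-lookup effect; the `Checker` class and the input string are made up for illustration, and absolute timings will vary by machine:

```python
import timeit

class Checker:
    def is_cjk(self, cp):
        # Simplified check against a single CJK block, for demo purposes only.
        return 0x4E00 <= cp <= 0x9FFF

checker = Checker()
text = "中文 and ASCII" * 2000

def with_attribute_lookups():
    out = []
    for ch in text:
        if checker.is_cjk(ord(ch)):  # attribute resolved on every iteration
            out.append(" ")          # and again here
        out.append(ch)

def with_local_bindings():
    is_cjk = checker.is_cjk  # resolved once, outside the loop
    out = []
    append = out.append      # resolved once, outside the loop
    for ch in text:
        if is_cjk(ord(ch)):
            append(" ")
        append(ch)

print("attribute lookups:", timeit.timeit(with_attribute_lookups, number=100))
print("local bindings:   ", timeit.timeit(with_local_bindings, number=100))
```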
✅ Correctness verification report

🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-BasicTokenizer._tokenize_chinese_chars-misilokt` and push.