-
-
Notifications
You must be signed in to change notification settings - Fork 34.2k
util: optimize toASCIILower function using V8's native toLowerCase #61107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
util: optimize toASCIILower function using V8's native toLowerCase #61107
Conversation
bf24628 to
b391b73
Compare
b391b73 to
6b08d1a
Compare
6b08d1a to
57b817f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not identical (and there is a reason why the function checks character ranges)
This function is supposed to do ASCII lowercasing, not Unicode lowercasing
TextDecoder has the same issue for label normalization btw, as it incorrectly uses .toLowerCase at
Line 332 in b1c01fc
| return encodings.get(trimAsciiWhitespace(label.toLowerCase())); |
See spec: https://tc39.es/ecma262/#sec-string.prototype.tolowercase and notes there
As an example, '\u212A'.toLowerCase() === 'k'
toLowerCase can convert non-ascii characters into ascii characters (which makes e.g. current TextDecoder label handling incorrect, among all its other bugs)
text/html charset=gb\u212A should not turn into text/html charset=gbk
|
What would be helpful is to move |
Thanks I will try this idea |
Co-authored-by: Nikita Skovoroda <chalkerx@gmail.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #61107 +/- ##
==========================================
- Coverage 88.53% 88.52% -0.01%
==========================================
Files 703 703
Lines 208546 208548 +2
Branches 40217 40224 +7
==========================================
- Hits 184634 184620 -14
- Misses 15926 15958 +32
+ Partials 7986 7970 -16
🚀 New features to boost your workflow:
|
RafaelGSS
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. With green CI and confirmed by benchmark CI.
benchmark results (mimetype-instantiation.js):
application/ecmascript +54%
text/html charset=gbk +24%
text/html long...=x charset=gbk +21%