gh-74902: Add Unicode Grapheme Cluster Break algorithm #143076
base: main
Changes from all commits
4c1bd42
b0585f9
37fa38f
c46e9bd
22cacf6
a95c3cb
ad50831
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -184,6 +184,28 @@ following functions: | |
| '0041 0303' | ||
|
|
||
|
|
||
| .. function:: grapheme_cluster_break(chr, /) | ||
|
|
||
| Returns the Grapheme_Cluster_Break property assigned to the character. | ||
|
|
||
| .. versionadded:: next | ||
|
|
||
|
|
||
| .. function:: indic_conjunct_break(chr, /) | ||
|
|
||
| Returns the Indic_Conjunct_Break property assigned to the character. | ||
|
|
||
| .. versionadded:: next | ||
|
|
||
|
|
||
| .. function:: extended_pictographic(chr, /) | ||
|
|
||
| Returns ``True`` if the character has the Extended_Pictographic property, | ||
| ``False`` otherwise. | ||
|
|
||
| .. versionadded:: next | ||
|
|
||
|
|
||
| .. function:: normalize(form, unistr, /) | ||
|
|
||
| Return the normal form *form* for the Unicode string *unistr*. Valid values for | ||
|
|
@@ -225,6 +247,24 @@ following functions: | |
| .. versionadded:: 3.8 | ||
|
|
||
|
|
||
| .. function:: iter_graphemes(unistr, start=0, end=sys.maxsize, /) | ||
|
|
||
| Returns an iterator to iterate over grapheme clusters. | ||
| With optional *start*, iteration begins at that position. | ||
| With optional *end*, iteration stops at that position. | ||
|
|
||
| Converting an emitted item to string returns a substring corresponding to | ||
| the grapheme cluster. | ||
| Its ``start`` and ``end`` attributes denote the start and end of | ||
| the grapheme cluster. | ||
|
|
||
| It uses extended grapheme cluster rules defined by Unicode | ||
| Standard Annex #29, `"Unicode Text Segmentation" | ||
| <https://www.unicode.org/reports/tr29/>`_. | ||
|
|
||
| .. versionadded:: next | ||
|
|
||
|
|
||
| In addition, the module exposes the following constant: | ||
|
|
||
| .. data:: unidata_version | ||
|
|
@@ -234,7 +274,7 @@ In addition, the module exposes the following constant: | |
|
|
||
| .. data:: ucd_3_2_0 | ||
|
|
||
| This is an object that has the same methods as the entire module, but uses the | ||
| This is an object that has most of the methods of the entire module, but uses the | ||
|
Member
This sentence is not fully right, but I can’t find the right suggestion with both «most of» and «same as». |
||
| Unicode database version 3.2 instead, for applications that require this | ||
| specific version of the Unicode database (such as IDNA). | ||
|
|
||
|
|
||
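
For readers without a build of this branch, the documented `start`/`end` handling and the shape of the emitted items can be approximated in pure Python. This is only a sketch under stated assumptions, not the PR's implementation: `Segment` and `iter_simple_clusters` are hypothetical names, the index clamping via `slice.indices` is inferred from the tests in this PR, and the only rule applied is "do not break before a character with a non-zero canonical combining class", which is far weaker than the extended grapheme cluster rules of UAX #29.

```python
import sys
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for the items emitted by iter_graphemes()."""
    text: str
    start: int
    end: int

    def __str__(self):
        # Converting a segment to str yields the corresponding substring.
        return self.text[self.start:self.end]


def iter_simple_clusters(unistr, start=0, end=sys.maxsize):
    """Yield Segment objects: a base character plus trailing combining marks.

    Assumption: start/end are clamped like slice indices.
    This is NOT the UAX #29 algorithm, only a simplified illustration.
    """
    start, end, _ = slice(start, end).indices(len(unistr))
    pos = start
    while pos < end:
        nxt = pos + 1
        # Absorb following characters with a non-zero combining class.
        while nxt < end and unicodedata.combining(unistr[nxt]):
            nxt += 1
        yield Segment(unistr, pos, nxt)
        pos = nxt
```

With this sketch, `list(iter_simple_clusters('spa\u0300m'))` has four items, and the third one spans positions 2 to 4 and converts to `'a\u0300'`, which matches what `test_segment_object` expects from the real function for this input. Inputs such as emoji ZWJ sequences, Hangul jamo, or regional indicators would not be segmented correctly by it.
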
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -616,6 +616,221 @@ def test_isxidcontinue(self): | |||||
| self.assertRaises(TypeError, self.db.isxidcontinue) | ||||||
| self.assertRaises(TypeError, self.db.isxidcontinue, 'xx') | ||||||
|
|
||||||
| def test_grapheme_cluster_break(self): | ||||||
| gcb = self.db.grapheme_cluster_break | ||||||
| self.assertEqual(gcb(' '), 'Other') | ||||||
| self.assertEqual(gcb('x'), 'Other') | ||||||
| self.assertEqual(gcb('\U0010FFFF'), 'Other') | ||||||
| self.assertEqual(gcb('\r'), 'CR') | ||||||
| self.assertEqual(gcb('\n'), 'LF') | ||||||
| self.assertEqual(gcb('\0'), 'Control') | ||||||
| self.assertEqual(gcb('\t'), 'Control') | ||||||
| self.assertEqual(gcb('\x1F'), 'Control') | ||||||
| self.assertEqual(gcb('\x7F'), 'Control') | ||||||
| self.assertEqual(gcb('\x9F'), 'Control') | ||||||
| self.assertEqual(gcb('\U000E0001'), 'Control') | ||||||
| self.assertEqual(gcb('\u0300'), 'Extend') | ||||||
| self.assertEqual(gcb('\u200C'), 'Extend') | ||||||
| self.assertEqual(gcb('\U000E01EF'), 'Extend') | ||||||
| self.assertEqual(gcb('\u1159'), 'L') | ||||||
| self.assertEqual(gcb('\u11F9'), 'T') | ||||||
| self.assertEqual(gcb('\uD788'), 'LV') | ||||||
| self.assertEqual(gcb('\uD7A3'), 'LVT') | ||||||
| # New in 5.0.0 | ||||||
| self.assertEqual(gcb('\u05BA'), 'Extend') | ||||||
| self.assertEqual(gcb('\u20EF'), 'Extend') | ||||||
| # New in 5.1.0 | ||||||
| self.assertEqual(gcb('\u2064'), 'Control') | ||||||
| self.assertEqual(gcb('\uAA4D'), 'SpacingMark') | ||||||
| # New in 5.2.0 | ||||||
| self.assertEqual(gcb('\u0816'), 'Extend') | ||||||
| self.assertEqual(gcb('\uA97C'), 'L') | ||||||
| self.assertEqual(gcb('\uD7C6'), 'V') | ||||||
| self.assertEqual(gcb('\uD7FB'), 'T') | ||||||
| # New in 6.0.0 | ||||||
| self.assertEqual(gcb('\u093A'), 'Extend') | ||||||
| self.assertEqual(gcb('\U00011002'), 'SpacingMark') | ||||||
| # New in 6.1.0 | ||||||
| self.assertEqual(gcb('\U000E0FFF'), 'Control') | ||||||
| self.assertEqual(gcb('\U00016F7E'), 'SpacingMark') | ||||||
| # New in 6.2.0 | ||||||
| self.assertEqual(gcb('\U0001F1E6'), 'Regional_Indicator') | ||||||
| self.assertEqual(gcb('\U0001F1FF'), 'Regional_Indicator') | ||||||
| # New in 6.3.0 | ||||||
| self.assertEqual(gcb('\u180E'), 'Control') | ||||||
| self.assertEqual(gcb('\u1A1B'), 'Extend') | ||||||
| # New in 7.0.0 | ||||||
| self.assertEqual(gcb('\u0E33'), 'SpacingMark') | ||||||
| self.assertEqual(gcb('\u0EB3'), 'SpacingMark') | ||||||
| self.assertEqual(gcb('\U0001BCA3'), 'Control') | ||||||
| self.assertEqual(gcb('\U0001E8D6'), 'Extend') | ||||||
| self.assertEqual(gcb('\U0001163E'), 'SpacingMark') | ||||||
| # New in 8.0.0 | ||||||
| self.assertEqual(gcb('\u08E3'), 'Extend') | ||||||
| self.assertEqual(gcb('\U00011726'), 'SpacingMark') | ||||||
| # New in 9.0.0 | ||||||
| self.assertEqual(gcb('\u0600'), 'Prepend') | ||||||
| self.assertEqual(gcb('\U000E007F'), 'Extend') | ||||||
| self.assertEqual(gcb('\U00011CB4'), 'SpacingMark') | ||||||
| self.assertEqual(gcb('\u200D'), 'ZWJ') | ||||||
| # New in 10.0.0 | ||||||
| self.assertEqual(gcb('\U00011D46'), 'Prepend') | ||||||
| self.assertEqual(gcb('\U00011D47'), 'Extend') | ||||||
| self.assertEqual(gcb('\U00011A97'), 'SpacingMark') | ||||||
| # New in 11.0.0 | ||||||
| self.assertEqual(gcb('\U000110CD'), 'Prepend') | ||||||
| self.assertEqual(gcb('\u07FD'), 'Extend') | ||||||
| self.assertEqual(gcb('\U00011EF6'), 'SpacingMark') | ||||||
| # New in 12.0.0 | ||||||
| self.assertEqual(gcb('\U00011A84'), 'Prepend') | ||||||
| self.assertEqual(gcb('\U00013438'), 'Control') | ||||||
| self.assertEqual(gcb('\U0001E2EF'), 'Extend') | ||||||
| self.assertEqual(gcb('\U00016F87'), 'SpacingMark') | ||||||
| # New in 13.0.0 | ||||||
| self.assertEqual(gcb('\U00011941'), 'Prepend') | ||||||
| self.assertEqual(gcb('\U00016FE4'), 'Extend') | ||||||
| self.assertEqual(gcb('\U00011942'), 'SpacingMark') | ||||||
| # New in 14.0.0 | ||||||
| self.assertEqual(gcb('\u0891'), 'Prepend') | ||||||
| self.assertEqual(gcb('\U0001E2AE'), 'Extend') | ||||||
| # New in 15.0.0 | ||||||
| self.assertEqual(gcb('\U00011F02'), 'Prepend') | ||||||
| self.assertEqual(gcb('\U0001343F'), 'Control') | ||||||
| self.assertEqual(gcb('\U0001E4EF'), 'Extend') | ||||||
| self.assertEqual(gcb('\U00011F3F'), 'SpacingMark') | ||||||
| # New in 16.0.0 | ||||||
| self.assertEqual(gcb('\U000113D1'), 'Prepend') | ||||||
| self.assertEqual(gcb('\U0001E5EF'), 'Extend') | ||||||
| self.assertEqual(gcb('\U0001612C'), 'SpacingMark') | ||||||
| self.assertEqual(gcb('\U00016D63'), 'V') | ||||||
| # New in 17.0.0 | ||||||
| self.assertEqual(gcb('\u1AEB'), 'Extend') | ||||||
| self.assertEqual(gcb('\U00011B67'), 'SpacingMark') | ||||||
|
|
||||||
| self.assertRaises(TypeError, gcb) | ||||||
| self.assertRaises(TypeError, gcb, b'x') | ||||||
| self.assertRaises(TypeError, gcb, 120) | ||||||
| self.assertRaises(TypeError, gcb, '') | ||||||
| self.assertRaises(TypeError, gcb, 'xx') | ||||||
|
|
||||||
| def test_indic_conjunct_break(self): | ||||||
| incb = self.db.indic_conjunct_break | ||||||
| self.assertEqual(incb(' '), 'None') | ||||||
| self.assertEqual(incb('x'), 'None') | ||||||
| self.assertEqual(incb('\U0010FFFF'), 'None') | ||||||
| # New in 15.1.0 | ||||||
| self.assertEqual(incb('\u094D'), 'Linker') | ||||||
| self.assertEqual(incb('\u0D4D'), 'Linker') | ||||||
| self.assertEqual(incb('\u0915'), 'Consonant') | ||||||
| self.assertEqual(incb('\u0D3A'), 'Consonant') | ||||||
| self.assertEqual(incb('\u0300'), 'Extend') | ||||||
| self.assertEqual(incb('\U0001E94A'), 'Extend') | ||||||
| # New in 16.0.0 | ||||||
| self.assertEqual(incb('\u034F'), 'Extend') | ||||||
| self.assertEqual(incb('\U000E01EF'), 'Extend') | ||||||
| # New in 17.0.0 | ||||||
| self.assertEqual(incb('\u1039'), 'Linker') | ||||||
| self.assertEqual(incb('\U00011F42'), 'Linker') | ||||||
| self.assertEqual(incb('\u1000'), 'Consonant') | ||||||
| self.assertEqual(incb('\U00011F33'), 'Consonant') | ||||||
| self.assertEqual(incb('\U0001E6F5'), 'Extend') | ||||||
|
|
||||||
| self.assertRaises(TypeError, incb) | ||||||
| self.assertRaises(TypeError, incb, b'x') | ||||||
| self.assertRaises(TypeError, incb, 120) | ||||||
| self.assertRaises(TypeError, incb, '') | ||||||
| self.assertRaises(TypeError, incb, 'xx') | ||||||
|
|
||||||
| def test_extended_pictographic(self): | ||||||
| ext_pict = self.db.extended_pictographic | ||||||
| self.assertIs(ext_pict(' '), False) | ||||||
| self.assertIs(ext_pict('x'), False) | ||||||
| self.assertIs(ext_pict('\U0010FFFF'), False) | ||||||
| # New in 13.0.0 | ||||||
| self.assertIs(ext_pict('\xA9'), True) | ||||||
| self.assertIs(ext_pict('\u203C'), True) | ||||||
| self.assertIs(ext_pict('\U0001FAD6'), True) | ||||||
| self.assertIs(ext_pict('\U0001FFFD'), True) | ||||||
| # New in 17.0.0 | ||||||
| self.assertIs(ext_pict('\u2388'), False) | ||||||
| self.assertIs(ext_pict('\U0001FA6D'), False) | ||||||
|
|
||||||
| self.assertRaises(TypeError, ext_pict) | ||||||
| self.assertRaises(TypeError, ext_pict, b'x') | ||||||
| self.assertRaises(TypeError, ext_pict, 120) | ||||||
| self.assertRaises(TypeError, ext_pict, '') | ||||||
| self.assertRaises(TypeError, ext_pict, 'xx') | ||||||
|
|
||||||
| def test_grapheme_break(self): | ||||||
| def graphemes(*args): | ||||||
| return list(map(str, self.db.iter_graphemes(*args))) | ||||||
|
|
||||||
| self.assertRaises(TypeError, self.db.iter_graphemes) | ||||||
| self.assertRaises(TypeError, self.db.iter_graphemes, b'x') | ||||||
| self.assertRaises(TypeError, self.db.iter_graphemes, 'x', 0, 0, 0) | ||||||
|
|
||||||
| self.assertEqual(graphemes(''), []) | ||||||
| self.assertEqual(graphemes('abcd'), ['a', 'b', 'c', 'd']) | ||||||
| self.assertEqual(graphemes('abcd', 1), ['b', 'c', 'd']) | ||||||
| self.assertEqual(graphemes('abcd', 1, 3), ['b', 'c']) | ||||||
| self.assertEqual(graphemes('abcd', -3), ['b', 'c', 'd']) | ||||||
| self.assertEqual(graphemes('abcd', 1, -1), ['b', 'c']) | ||||||
| self.assertEqual(graphemes('abcd', 3, 1), []) | ||||||
| self.assertEqual(graphemes('abcd', 5), []) | ||||||
| self.assertEqual(graphemes('abcd', 0, 5), ['a', 'b', 'c', 'd']) | ||||||
| self.assertEqual(graphemes('abcd', -5), ['a', 'b', 'c', 'd']) | ||||||
| self.assertEqual(graphemes('abcd', 0, -5), []) | ||||||
| # GB3 | ||||||
| self.assertEqual(graphemes('\r\n'), ['\r\n']) | ||||||
| # GB4 | ||||||
| self.assertEqual(graphemes('\r\u0308'), ['\r', '\u0308']) | ||||||
| self.assertEqual(graphemes('\n\u0308'), ['\n', '\u0308']) | ||||||
| self.assertEqual(graphemes('\0\u0308'), ['\0', '\u0308']) | ||||||
| # GB5 | ||||||
| self.assertEqual(graphemes('\u06dd\r'), ['\u06dd', '\r']) | ||||||
| self.assertEqual(graphemes('\u06dd\n'), ['\u06dd', '\n']) | ||||||
| self.assertEqual(graphemes('\u06dd\0'), ['\u06dd', '\0']) | ||||||
| # GB6 | ||||||
| self.assertEqual(graphemes('\u1100\u1160'), ['\u1100\u1160']) | ||||||
| self.assertEqual(graphemes('\u1100\uAC00'), ['\u1100\uAC00']) | ||||||
| self.assertEqual(graphemes('\u1100\uAC01'), ['\u1100\uAC01']) | ||||||
| # GB7 | ||||||
| self.assertEqual(graphemes('\uAC00\u1160'), ['\uAC00\u1160']) | ||||||
| self.assertEqual(graphemes('\uAC00\u11A8'), ['\uAC00\u11A8']) | ||||||
| self.assertEqual(graphemes('\u1160\u1160'), ['\u1160\u1160']) | ||||||
| self.assertEqual(graphemes('\u1160\u11A8'), ['\u1160\u11A8']) | ||||||
| # GB8 | ||||||
| self.assertEqual(graphemes('\uAC01\u11A8'), ['\uAC01\u11A8']) | ||||||
| self.assertEqual(graphemes('\u11A8\u11A8'), ['\u11A8\u11A8']) | ||||||
| # GB9 | ||||||
| self.assertEqual(graphemes('a\u0300'), ['a\u0300']) | ||||||
| self.assertEqual(graphemes('a\u200D'), ['a\u200D']) | ||||||
| # GB9a | ||||||
| self.assertEqual(graphemes('\u0905\u0903'), ['\u0905\u0903']) | ||||||
| # GB9b | ||||||
| self.assertEqual(graphemes('\u06dd\u0661'), ['\u06dd\u0661']) | ||||||
| # GB9c | ||||||
| self.assertEqual(graphemes('\u0915\u094d\u0924'), | ||||||
| ['\u0915\u094d\u0924']) | ||||||
| self.assertEqual(graphemes('\u0915\u094D\u094D\u0924'), | ||||||
| ['\u0915\u094D\u094D\u0924']) | ||||||
| self.assertEqual(graphemes('\u0915\u094D\u0924\u094D\u092F'), | ||||||
| ['\u0915\u094D\u0924\u094D\u092F']) | ||||||
| # GB11 | ||||||
| self.assertEqual(graphemes( | ||||||
| '\U0001F9D1\U0001F3FE\u200D\u2764\uFE0F' | ||||||
| '\u200D\U0001F48B\u200D\U0001F9D1\U0001F3FC'), | ||||||
| ['\U0001F9D1\U0001F3FE\u200D\u2764\uFE0F' | ||||||
| '\u200D\U0001F48B\u200D\U0001F9D1\U0001F3FC']) | ||||||
| # GB12 | ||||||
| self.assertEqual(graphemes( | ||||||
| '\U0001F1FA\U0001F1E6\U0001F1FA\U0001F1F3'), | ||||||
| ['\U0001F1FA\U0001F1E6', '\U0001F1FA\U0001F1F3']) | ||||||
| # GB13 | ||||||
| self.assertEqual(graphemes( | ||||||
| 'a\U0001F1FA\U0001F1E6\U0001F1FA\U0001F1F3'), | ||||||
| ['a', '\U0001F1FA\U0001F1E6', '\U0001F1FA\U0001F1F3']) | ||||||
|
|
||||||
|
|
||||||
| class Unicode_3_2_0_FunctionsTest(UnicodeFunctionsTest): | ||||||
| db = unicodedata.ucd_3_2_0 | ||||||
|
|
@@ -624,6 +839,11 @@ class Unicode_3_2_0_FunctionsTest(UnicodeFunctionsTest): | |||||
| if quicktest else | ||||||
| 'f217b8688d7bdff31db4207e078a96702f091597') | ||||||
|
|
||||||
| test_grapheme_cluster_break = None | ||||||
| test_indic_conjunct_break = None | ||||||
| test_extended_pictographic = None | ||||||
| test_grapheme_break = None | ||||||
|
|
||||||
|
|
||||||
| class UnicodeMiscTest(unittest.TestCase): | ||||||
| db = unicodedata | ||||||
|
|
@@ -726,6 +946,17 @@ def test_linebreak_7643(self): | |||||
| self.assertEqual(len(lines), 1, | ||||||
| r"%a should not be a linebreak" % c) | ||||||
|
|
||||||
| def test_segment_object(self): | ||||||
| segments = list(unicodedata.iter_graphemes('spa\u0300m')) | ||||||
| self.assertEqual(len(segments), 4, segments) | ||||||
| segment = segments[2] | ||||||
| self.assertEqual(segment.start, 2) | ||||||
| self.assertEqual(segment.end, 4) | ||||||
| self.assertEqual(str(segment), 'a\u0300') | ||||||
| self.assertEqual(repr(segment), '<Segment 2:4>') | ||||||
| self.assertRaises(TypeError, iter, segment) | ||||||
| self.assertRaises(TypeError, len, segment) | ||||||
|
|
||||||
|
|
||||||
| class NormalizationTest(unittest.TestCase): | ||||||
| @staticmethod | ||||||
|
|
@@ -848,5 +1079,61 @@ class MyStr(str): | |||||
| self.assertIs(type(normalize(form, MyStr(input_str))), str) | ||||||
|
|
||||||
|
|
||||||
| class GraphemeBreakTest(unittest.TestCase): | ||||||
| @staticmethod | ||||||
| def check_version(testfile): | ||||||
| hdr = testfile.readline() | ||||||
| return unicodedata.unidata_version in hdr | ||||||
|
|
||||||
| @requires_resource('network') | ||||||
|
Member
Should it not be
Member
Author
Maybe. The other test (for normalization) uses the |
||||||
| def test_grapheme_break(self): | ||||||
| TESTDATAFILE = "auxiliary/GraphemeBreakTest.txt" | ||||||
| TESTDATAURL = f"https://www.unicode.org/Public/{unicodedata.unidata_version}/ucd/{TESTDATAFILE}" | ||||||
|
|
||||||
| # Hit the exception early | ||||||
| try: | ||||||
| testdata = open_urlresource(TESTDATAURL, encoding="utf-8", | ||||||
| check=self.check_version) | ||||||
| except PermissionError: | ||||||
| self.skipTest(f"Permission error when downloading {TESTDATAURL} " | ||||||
| f"into the test data directory") | ||||||
| except (OSError, HTTPException) as exc: | ||||||
| self.skipTest(f"Failed to download {TESTDATAURL}: {exc}") | ||||||
|
|
||||||
| with testdata: | ||||||
| self.run_grapheme_break_tests(testdata) | ||||||
|
|
||||||
| def run_grapheme_break_tests(self, testdata): | ||||||
| for line in testdata: | ||||||
| line, _, comment = line.partition('#') | ||||||
| line = line.strip() | ||||||
| if not line: | ||||||
| continue | ||||||
| comment = comment.strip() | ||||||
|
|
||||||
| chunks = [] | ||||||
| breaks = [] | ||||||
| pos = 0 | ||||||
| for field in line.replace('×', ' ').split(): | ||||||
| if field == '÷': | ||||||
| chunks.append('') | ||||||
| breaks.append(pos) | ||||||
| else: | ||||||
| chunks[-1] += chr(int(field, 16)) | ||||||
| pos += 1 | ||||||
| self.assertEqual(chunks.pop(), '', line) | ||||||
| input = ''.join(chunks) | ||||||
| with self.subTest(line): | ||||||
| result = list(unicodedata.iter_graphemes(input)) | ||||||
|
Member
Did you mean to use the passed
Suggested change
|
||||||
| self.assertEqual(list(map(str, result)), chunks, comment) | ||||||
| self.assertEqual([x.start for x in result], breaks[:-1], comment) | ||||||
| self.assertEqual([x.end for x in result], breaks[1:], comment) | ||||||
| for i in range(1, len(breaks) - 1): | ||||||
| result = list(unicodedata.iter_graphemes(input, breaks[i])) | ||||||
|
Member
Suggested change
Continues above.
Member
Author
No, it is a module-only function. |
||||||
| self.assertEqual(list(map(str, result)), chunks[i:], comment) | ||||||
| self.assertEqual([x.start for x in result], breaks[i:-1], comment) | ||||||
| self.assertEqual([x.end for x in result], breaks[i+1:], comment) | ||||||
|
|
||||||
|
|
||||||
| if __name__ == "__main__": | ||||||
| unittest.main() | ||||||
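
The parsing loop in `run_grapheme_break_tests` can be tried in isolation. The sketch below factors it into a hypothetical helper, `parse_break_line`, and feeds it a sample line written in the `÷`/`×` notation that the loop expects. The sample line is hand-written for illustration and is not copied from `GraphemeBreakTest.txt`; the `ValueError` replaces the test's `assertEqual` check and is likewise an assumption of this sketch.

```python
def parse_break_line(line):
    """Split one test line into expected clusters and break positions.

    '÷' marks an allowed break, '×' marks a forbidden break, and the other
    fields are hexadecimal code points. Returns (chunks, breaks), or None
    for blank and comment-only lines.
    """
    data, _, _comment = line.partition('#')
    data = data.strip()
    if not data:
        return None
    chunks = []
    breaks = []
    pos = 0
    for field in data.replace('×', ' ').split():
        if field == '÷':
            # A break closes the current cluster and opens a new one.
            chunks.append('')
            breaks.append(pos)
        else:
            chunks[-1] += chr(int(field, 16))
            pos += 1
    # Every line ends with '÷', which leaves one empty trailing chunk.
    if chunks.pop() != '':
        raise ValueError(f"line does not end with a break: {line!r}")
    return chunks, breaks


sample = '÷ 0061 × 0300 ÷ 0062 ÷  # hand-written sample'
chunks, breaks = parse_break_line(sample)
print(chunks)  # two clusters: 'a' + U+0300, then 'b'
print(breaks)  # [0, 2, 3]
```

The expected `start` values are then `breaks[:-1]` and the expected `end` values are `breaks[1:]`, which is how the test compares them against the items returned by `iter_graphemes`.
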
The order of functions in this file doesn’t seem to be alphabetical or topical.
I think another ticket should be created to add a quick links table at the top.
Or we can split it into sections by type and order the functions alphabetically within each section.