Add vsc-unicode-natural sort order (VS Code / Windows Explorer style) #246

Open
siakun wants to merge 1 commit into SebastianMC:master from siakun-testing:feature/vsc-unicode-natural

siakun commented Apr 21, 2026

Summary

Adds a new sort order token vsc-unicode-natural (alias: unicode-charcode-natural) that combines the existing vsc-unicode ordering with numeric-aware digit-run comparison. This mimics the default file sort behavior of VS Code and Windows Explorer.

Motivation

Users whose system locale is non-Latin (e.g. ko-KR, ja-JP) currently face a gap between the two closest existing options:

| Option | Latin before CJK? | Natural number sort? |
|---|---|---|
| < a-z | ❌ CJK-first in many locales | ✅ |
| < vsc-unicode | ✅ | ❌ Part 10 < Part 2 |
| < vsc-unicode-natural (new) | ✅ | ✅ |

Concrete example from a Korean-locale vault with a book folder containing Part 0 through Part 8, 부록 A, 부록 B, 부록 C:

  • With < a-z → 부록 A, 부록 B, 부록 C, Part 0, ... (CJK first)
  • With < vsc-unicode → Part 0, ..., Part 8, 부록 A, ... (fine for now), but Part 10 would sort before Part 2
  • With < vsc-unicode-natural → correct Windows/VSCode order, scales to Part 10+
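
To make the vsc-unicode gap concrete, here is a minimal sketch. It assumes that vsc-unicode behaves like a plain character-code comparison (as its unicode-charcode alias suggests); compareByCodeUnit is a hypothetical stand-in, not the plugin's comparator.

```typescript
// Hypothetical stand-in for a char-code ordering: JavaScript's relational
// operators on strings compare UTF-16 code units, one position at a time.
const compareByCodeUnit = (a: string, b: string): number =>
  a < b ? -1 : a > b ? 1 : 0;

const bookFolder: string[] = ['Part 2', '부록 A', 'Part 10', 'Part 0'];
const byCodeUnit: string[] = bookFolder.slice().sort(compareByCodeUnit);

// Latin stays ahead of Hangul, but '1' (0x31) < '2' (0x32) puts
// "Part 10" ahead of "Part 2".
console.log(byCodeUnit); // [ 'Part 0', 'Part 10', 'Part 2', '부록 A' ]
```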

Implementation

Uses Intl.Collator('en', { numeric: true, sensitivity: 'base' }):

  • Fixed en locale so collation is consistent across user machines; CJK stays after Latin via UCA defaults
  • numeric: true enables natural digit-run comparison
  • sensitivity: 'base' matches the existing alphabetical comparator's case/accent-insensitivity
  • Punctuation/symbols precede letters via UCA weights, so [TODO] correctly sorts before CLAUDE (matches VS Code behavior)
  • Uses sortStringWithExt so files with identical basenames but different suffixes (e.g. "name (variant).md" vs "name.md") sort the way VS Code orders them
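
The collator settings listed above can be tried in isolation. The following is a rough stand-alone sketch, not the code from custom-sort.ts: compareNatural and compareNaturalReverse are hypothetical names, and the sortStringWithExt handling is left out.

```typescript
// Fixed 'en' locale, numeric digit runs, base sensitivity (as described above).
const collator = new Intl.Collator('en', { numeric: true, sensitivity: 'base' });

// Hypothetical comparator names, for illustration only.
const compareNatural = (a: string, b: string): number => collator.compare(a, b);
const compareNaturalReverse = (a: string, b: string): number => collator.compare(b, a);

const entries: string[] = ['부록 A', 'Part 10', 'CLAUDE', '[TODO]', 'Part 2'];
const ascending: string[] = entries.slice().sort(compareNatural);

// Punctuation first, then Latin with numeric digit runs, then Hangul.
console.log(ascending); // [ '[TODO]', 'CLAUDE', 'Part 2', 'Part 10', '부록 A' ]

// Base sensitivity treats case and accent differences as equal.
console.log(compareNatural('part', 'PART')); // 0
```

The reverse variant only swaps the arguments, so a single collator instance can serve both directions.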

Changes

  • CustomSortOrder enum: add vscUnicodeNatural / vscUnicodeNaturalReverse
  • custom-sort.ts: add CollatorCompareVscNatural and two Sorters entries
  • sorting-spec-processor.ts: register vsc-unicode-natural and unicode-charcode-natural tokens. Important: these are placed before the existing vsc-unicode / unicode-charcode entries because the parser uses a startsWith match on Object.keys(OrderLiterals) in declaration order. Otherwise vsc-unicode-natural would be partially matched as vsc-unicode, with -natural left over as trailing garbage
  • sorting-spec-processor.spec.ts: add a parser recognition test that parallels the existing vsc-unicode coverage
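
The declaration-order pitfall mentioned for sorting-spec-processor.ts can be illustrated with a simplified sketch. The tables and the matchOrderToken helper below are hypothetical; the real OrderLiterals map has more entries and different values, and the actual parser logic may differ in detail.

```typescript
// Hypothetical, reduced token tables (values are placeholder strings).
type OrderTable = Record<string, string>;

const shortNameFirst: OrderTable = {
  'vsc-unicode': 'vscUnicode',
  'vsc-unicode-natural': 'vscUnicodeNatural',
};
const longNameFirst: OrderTable = {
  'vsc-unicode-natural': 'vscUnicodeNatural',
  'vsc-unicode': 'vscUnicode',
};

interface TokenMatch {
  token: string;
  rest: string;
}

// First-match-wins prefix lookup. Object.keys returns string keys in
// insertion order, so declaration order decides which token is tried first.
function matchOrderToken(table: OrderTable, input: string): TokenMatch | undefined {
  for (const key of Object.keys(table)) {
    if (input.startsWith(key)) {
      return { token: key, rest: input.slice(key.length) };
    }
  }
  return undefined;
}

// Shorter name declared first: the longer token is never reached.
console.log(matchOrderToken(shortNameFirst, 'vsc-unicode-natural'));
// { token: 'vsc-unicode', rest: '-natural' }

// Longer name declared first: the full token matches, nothing is left over.
console.log(matchOrderToken(longNameFirst, 'vsc-unicode-natural'));
// { token: 'vsc-unicode-natural', rest: '' }
```

With the longer name first, a plain vsc-unicode input still falls through to the shorter entry, so existing specs are unaffected.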

Tests

All 831 existing tests pass, plus the new test for vsc-unicode-natural / unicode-charcode-natural token recognition.

Example usage

---
sorting-spec: |-
  target-folder: /*
  /folders
    < vsc-unicode-natural
  /:files
    < vsc-unicode-natural
---

Notes

  • No changes to manifest.json, package.json, or versions.json since those are release-time concerns
  • Backward compatible: existing sortspecs using < vsc-unicode or < a-z continue to work unchanged
  • Happy to adjust token naming (e.g. vsc-unicode-numeric), add more tests, or split into smaller PRs if preferred

Introduces a new sort order token 'vsc-unicode-natural' (alias:
'unicode-charcode-natural') which combines the existing vsc-unicode
ordering with numeric-aware digit-run comparison. This mimics the
default file sort behavior of VS Code and Windows Explorer.

Implementation uses Intl.Collator with a fixed 'en' locale, base
sensitivity, and numeric mode:

- Punctuation and symbols precede letters via UCA weights
  (e.g. "[TODO]" < "CLAUDE")
- Digit runs compared numerically, not lexically
  (e.g. "Part 2" < "Part 10")
- Latin precedes CJK scripts regardless of the user's system locale
  (e.g. "Part" < "부록"), which is the main gap vs plain 'a-z'
- Base sensitivity (case-insensitive, accent-insensitive)
- Extension-inclusive comparison via sortStringWithExt, matching
  VS Code's behavior for files with identical basenames

This addresses a limitation where users on non-Latin system locales
(e.g. ko-KR) see CJK-first ordering when using 'a-z', and where
'vsc-unicode' alone lacks natural-number sorting so "Part 10" < "Part 2".

Changes:
- CustomSortOrder enum: vscUnicodeNatural / vscUnicodeNaturalReverse
- custom-sort.ts: CollatorCompareVscNatural + Sorters entries
- sorting-spec-processor.ts: register 'vsc-unicode-natural' and
  'unicode-charcode-natural' tokens; place them before 'vsc-unicode'
  so the startsWith-based matcher does not pick the shorter name first
- tests: add parser recognition test paralleling existing vsc-unicode
  coverage