π Add canvas-based text measurement and dirty-flag dimension caching#9
π Add canvas-based text measurement and dirty-flag dimension caching#9williamchong wants to merge 3 commits intomasterfrom
Conversation
Layer 1 β Dirty-flag caching in IframeView.expand(): - Cache textWidth/textHeight results, only re-measure when content changes (RESIZE/EXPAND events or setLayout calls) - Pre-populate cache from resizeCheck() to avoid double measurement - Speculative expand() in next()/prev() becomes a no-op automatically Layer 2 β TextMeasurer utility (src/utils/text-measurer.ts): - Canvas-based text measurement via measureText() with zero DOM reflow - Intl.Segmenter for word boundaries (CJK, Thai), space/char fallback - Font-keyed width cache, per-parent getComputedStyle cache - Exotic CSS detection (letter-spacing, word-spacing) with fallback Layer 3 β Canvas-optimized Mapping: - Binary search on pre-measured cumulative widths in findTextStartRange and findTextEndRange, reducing O(N) per-word reflows to O(1) - Reuses node bounds from findStart/findEnd to avoid redundant reflow - Falls back to DOM Range loop when canvas path can't verify position No impact on Node.js entry point β TextMeasurer is only imported by browser-only modules. Requires OffscreenCanvas (Chrome 69+, Safari 16.4+) and Intl.Segmenter (Chrome 87+, Safari 15.4+) with automatic fallbacks for older browsers.
- Cap _widthCache at 32 font entries to prevent unbounded growth
- Handle getContext("2d") returning null (OffscreenCanvas fallback)
- Fix canvas fast path to span full segment instead of single char
- Rename findOffsetAtPosition β findSegmentIndex (returns index)
- Add WeakMap<Text, PreparedNode> index for O(1) node lookup
- Replace O(N) .find() in _canvasPrepare with getPreparedNode()
- Fix stale JSDoc claiming prepare() returns null
- Revert README browser versions to unversioned
- Add 6 expand() caching tests (dirty flag, horizontal/vertical)
- Add 16 TextMeasurer unit tests (segmentation, binary search, etc.)
src/utils/text-measurer.ts
Outdated
| for (let i = 0; i < text.length; i++) { | ||
| const ch = text[i]!; | ||
| if (ch === " ") { | ||
| if (current) { | ||
| result.push({ text: current, index: currentStart }); | ||
| } | ||
| result.push({ text: " ", index: i }); | ||
| current = ""; | ||
| currentStart = i + 1; | ||
| } else if (CJK_RE.test(ch)) { | ||
| if (current) { | ||
| result.push({ text: current, index: currentStart }); | ||
| current = ""; | ||
| } | ||
| result.push({ text: ch, index: i }); | ||
| currentStart = i + 1; | ||
| } else { | ||
| if (!current) currentStart = i; | ||
| current += ch; | ||
| } |
There was a problem hiding this comment.
segmentText() fallback iterates with for (let i = 0; i < text.length; i++) and text[i], which operates on UTF-16 code units. This means astral-plane CJK characters (e.g. U+20000+) wonβt be treated as a single character, and CJK_RE.test(ch) wonβt match surrogate halves, producing incorrect segmentation and offsets for those inputs. Consider iterating by code points (e.g. for...of with a separate code-unit index) so both the segmentation and index values remain correct for DOM Range offsets.
| for (let i = 0; i < text.length; i++) { | |
| const ch = text[i]!; | |
| if (ch === " ") { | |
| if (current) { | |
| result.push({ text: current, index: currentStart }); | |
| } | |
| result.push({ text: " ", index: i }); | |
| current = ""; | |
| currentStart = i + 1; | |
| } else if (CJK_RE.test(ch)) { | |
| if (current) { | |
| result.push({ text: current, index: currentStart }); | |
| current = ""; | |
| } | |
| result.push({ text: ch, index: i }); | |
| currentStart = i + 1; | |
| } else { | |
| if (!current) currentStart = i; | |
| current += ch; | |
| } | |
| // Iterate by Unicode code points, but track UTF-16 indices for DOM Range offsets | |
| let i = 0; | |
| for (const ch of text) { | |
| if (ch === " ") { | |
| if (current) { | |
| result.push({ text: current, index: currentStart }); | |
| } | |
| result.push({ text: " ", index: i }); | |
| current = ""; | |
| currentStart = i + ch.length; | |
| } else if (CJK_RE.test(ch)) { | |
| if (current) { | |
| result.push({ text: current, index: currentStart }); | |
| current = ""; | |
| } | |
| result.push({ text: ch, index: i }); | |
| currentStart = i + ch.length; | |
| } else { | |
| if (!current) currentStart = i; | |
| current += ch; | |
| } | |
| i += ch.length; |
src/mapping.ts
Outdated
| const nodeStart = this.horizontal | ||
| ? (this.direction === "rtl" ? nodePos.right : nodePos.left) | ||
| : nodePos.top; | ||
| const relativeTarget = this.direction === "rtl" ? nodeStart - target : target - nodeStart; |
There was a problem hiding this comment.
In _canvasFindRange(), relativeTarget is reversed whenever direction === "rtl", even when this.horizontal is false. For vertical mapping with RTL languages, this makes relativeTarget negative and disables the canvas fast path unnecessarily. Consider only applying the RTL reversal when this.horizontal is true (and otherwise always use target - nodeStart).
| const relativeTarget = this.direction === "rtl" ? nodeStart - target : target - nodeStart; | |
| const relativeTarget = this.horizontal | |
| ? (this.direction === "rtl" ? nodeStart - target : target - nodeStart) | |
| : target - nodeStart; |
| invalidate(root: Element): void { | ||
| this._preparedCache.delete(root); | ||
| } |
There was a problem hiding this comment.
invalidate() only deletes _preparedCache, but _nodeIndex still retains prepared entries for text nodes. Since Mapping._canvasPrepare() first checks getPreparedNode(), invalidation wonβt actually force a re-prepare and can return stale segment widths/offsets after DOM changes. Consider removing _nodeIndex entries for all PreparedNodes under this root (e.g., look up the cached PreparedNode[] first and delete their .node keys) or rebuilding _nodeIndex per-root so invalidation is effective.
| destroy(): void { | ||
| this._widthCache.clear(); | ||
| this._ctx = null; | ||
| this._canvas = null; | ||
| this._segmenter = null; | ||
| } |
There was a problem hiding this comment.
destroy() clears the width cache and canvas refs, but leaves _preparedCache and _nodeIndex intact. After destroy(), getPreparedNode() can still return stale PreparedNode data, and cached PreparedNode arrays can keep text nodes alive longer than intended. Consider resetting _preparedCache/_nodeIndex to new WeakMaps (or otherwise clearing them) as part of destroy.
- Iterate by code point (for...of) in segmentText() fallback to handle astral-plane CJK characters (U+20000+) correctly with UTF-16 offsets - Only apply RTL reversal in _canvasFindRange when horizontal, so vertical layouts with RTL languages don't get negative relativeTarget - invalidate() now clears _nodeIndex entries for all PreparedNodes under the root, so _canvasPrepare re-prepares after DOM changes - destroy() resets _preparedCache and _nodeIndex to new WeakMaps
Layer 1 β Dirty-flag caching in IframeView.expand():
Layer 2 β TextMeasurer utility (src/utils/text-measurer.ts):
Layer 3 β Canvas-optimized Mapping:
No impact on Node.js entry point β TextMeasurer is only imported by browser-only modules. Requires OffscreenCanvas (Chrome 69+, Safari 16.4+) and Intl.Segmenter (Chrome 87+, Safari 15.4+) with automatic fallbacks for older browsers.