feat: sub-pixel text rendering with fixed-point positioning and dual-position glyphs#93
feat: sub-pixel text rendering with fixed-point positioning and dual-position glyphs#93ingpaschke wants to merge 7 commits intojpirnay:masterfrom
Conversation
…han control Replace integer glyph positioning with 12.4 fixed-point accumulation in drawText and getTextAdvanceX for smoother character spacing. Add FP variants getSpaceAdvanceFP/getKerningFP for the layout engine. Add paragraph-level tracking optimization for justified text: when line badness exceeds the natural space width, try 8 tracking values (-4 to +4 FP4 units) and pick the one producing the best line breaks. Per-line tracking is capped at 0.5px/char with overflow flowing back to word gaps. Add widow/orphan control in page layout: orphan prevention forces a page break when only 1 line of a multi-line paragraph fits; widow prevention checks at the second-to-last line to avoid the last line being alone on the next page. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Embed DEFLATE-compressed alternate glyph bitmaps (rendered at 0.5px horizontal offset by FreeType) alongside the primary font data. The renderer selects between primary and alt based on the fractional cursor position: >= 0.5px uses the alt glyph snapped to floor, giving a visual position of floor + 0.5px that closely matches the true cursor. - Extend EpdFontData with altBitmap, altGlyph, altGroups fields - Add --dual-position flag to fontconvert.py - Update FontDecompressor to route alt requests through hot-group cache - Generate dual-position Bookerly fonts for 12pt and 14pt (all styles) - 16pt/18pt regenerated without dual-position to fit flash budget Flash: 89.0% -> 94.6% (+364KB for alt glyph data) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Restore frequency-grouped font path in alt aligned-offset calculation - Fall back to primary glyph when alt bitmap decompression fails - Use public Face.set_transform() API instead of private _FT_Face - Fix Python lint: unused loop variable, f-strings without placeholders Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Scale the paragraph tracking cap and exploration range with the natural space width (space_px / 8 in pixels) rather than a fixed 0.5px, giving wider latitude at larger font sizes where tracking is less perceptible. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughAdds optional dual-position glyph assets and selection logic; font generator can emit 0.5px‑shifted alternate glyph data. Text layout switches to 12.4 fixed-point, introduces paragraph-level tracking trials and per-word tracking storage, threads tracking into rendering, and bumps section cache version. Changes
Sequence Diagram(s)sequenceDiagram
participant Layout as ParsedText (layoutAndExtractLines)
participant Renderer as GfxRenderer (drawText)
participant Decompressor as FontDecompressor (getBitmap)
participant FontData as EpdFontData
Layout->>Renderer: drawText(fontId, x, y, text, trackingFP)
Renderer->>Renderer: init cursorFP
loop per codepoint
Renderer->>Renderer: accumulate kerning + advance (12.4 FP)
Renderer->>Renderer: compute fractional bits (cursorFP & 0xF)
alt fractional >= 8 (≥0.5px)
Renderer->>Renderer: useAlt = true
else
Renderer->>Renderer: useAlt = false
end
Renderer->>Decompressor: getBitmap(fontData, glyph, glyphIndex, useAlt)
alt useAlt
Decompressor->>FontData: read altBitmap / altGroups / altGlyph
else
Decompressor->>FontData: read bitmap / groups / glyph
end
Decompressor-->>Renderer: return bitmap
Renderer->>Renderer: render glyph (snap for render)
end
sequenceDiagram
participant Layout as layoutAndExtractLines
participant Justify as Paragraph Tracker
participant Breaks as computeLineBreaksWithHyphenationMeta
participant Extract as extractLine
participant TextBlock as TextBlock ctor
Layout->>Justify: run paragraphTracking trials
loop each trial
Justify->>Breaks: computeLineBreaks with adjusted widths
Breaks-->>Justify: line breaks + hyphenation metadata
Justify->>Justify: evaluateBadness()
end
Justify->>Layout: select best paragraphTracking
Layout->>Extract: extractLine(..., paragraphTracking)
Extract->>TextBlock: construct TextBlock with wordTracking vector
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/GfxRenderer/GfxRenderer.cpp (1)
896-904:⚠️ Potential issue | 🟠 MajorDon’t apply tracking between a base glyph and its combining mark.
cursorFPaddstrackingFPas soon as there is another codepoint in the string, but that next codepoint can be a combining mark. In decomposed text, the accent then gets centered over a tracked position instead of the base glyph, so justified rendering spaces apart a single grapheme cluster.Also applies to: 936-939
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/GfxRenderer/GfxRenderer.cpp` around lines 896 - 904, The issue is that trackingFP is being applied even when the following codepoint is a combining mark, causing the accent to be positioned relative to a tracked cursor instead of the base glyph; update the flow around utf8IsCombiningMark(...) so that when you detect a combining mark (in the blocks around the current snippet and the similar block at 936-939) you do not include trackingFP in cursorFP/lastBaseAdvanceFP calculations used for positioning (i.e., keep lastBaseAdvanceFP and baseSnapX computed from the base glyph advance without adding trackingFP), and ensure combiningMark::centerOver/raiseAboveBase use lastBaseLeft/lastBaseWidth from the untracked base so renderCharImpl places the combining glyph correctly.
🧹 Nitpick comments (1)
lib/EpdFont/scripts/fontconvert.py (1)
920-954: Consider extracting shared bitmap conversion logic.The 4-bit grayscale to 2-bit conversion logic here duplicates lines 214-265 from the primary glyph rendering path. Extracting this into a helper function would reduce maintenance burden and ensure consistent behavior between primary and alternate glyph generation.
♻️ Suggested helper function
def convert_4bit_to_2bit(pixels4g, width, rows): """Convert 4-bit grayscale bitmap to 2-bit packed format.""" pixels2b = [] px = 0 pitch = (width // 2) + (width % 2) for y in range(rows): for x in range(width): px = px << 2 bm = pixels4g[y * pitch + (x // 2)] bm = (bm >> ((x % 2) * 4)) & 0xF if bm >= 12: px += 3 elif bm >= 8: px += 2 elif bm >= 4: px += 1 if (y * width + x) % 4 == 3: pixels2b.append(px) px = 0 if (width * rows) % 4 != 0: px = px << (4 - (width * rows) % 4) * 2 pixels2b.append(px) return pixels2b🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/EpdFont/scripts/fontconvert.py` around lines 920 - 954, The 4-bit-to-2-bit packing logic is duplicated; extract it into a helper like convert_4bit_to_2bit(pixels4g, width, rows) that implements the existing packing loop (compute pitch, iterate y/x, shift/scale 4-bit bm -> 2-bit, pack into bytes and pad tail) and returns pixels2b, then replace the inline conversion in the alternate path (the block building pixels4g -> pixels2b) with a call to that helper, and also replace the identical loop in the primary glyph rendering path (the code around the previous duplicate) to call the same helper so both paths share the same implementation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/Epub/Epub/ParsedText.cpp`:
- Around line 188-277: The final justified pass bakes paragraphTracking into
wordWidths but later hyphenation/width recomputations (hyphenateWordAtIndex,
measureWordWidth, calculateWordWidths, extractLine) overwrite widths without the
tracking adjustment, causing line break decisions to use untracked widths; fix
by ensuring any place that recomputes or writes wordWidths after
paragraphTracking is chosen reapplies the same tracking expansion (use the same
expansion formula ((charCounts[i]-1)*paragraphTracking + 8) >> 4) or accept a
paragraphTracking parameter so calculateWordWidths/measureWordWidth produce
tracked widths, and keep charCounts available for split prefixes/remainders so
computeLineBreaksWithHyphenationMeta and extractLine always see widths
consistent with paragraphTracking.
In `@lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp`:
- Around line 1504-1510: The change always returning
ParsedText::LineProcessResult::Accepted in the paragraph-collection lambda
disables the page-boundary hyphenation retry used by makePages() and
addLineToPage(..., /*suppressHyphenationRetry=*/true). Restore the original
retry behavior by returning the proper LineProcessResult from the lambda (not
unconditionally Accepted) and ensure makePages()/emission calls to addLineToPage
pass suppressHyphenationRetry=false when they want the final hyphenation retry;
in short, adjust the paragraph collection callback (the lambda that pushes into
paragraphLines) to preserve/forward the original line-processing result
semantics and makePages()/addLineToPage usage so page-boundary hyphenation retry
remains enabled where intended.
- Around line 1515-1540: Before calling completePageFn(...) in the widow/orphan
forced-break branches, push the current xpathParagraphIndex into
paragraphIndexPerPage so the paragraph-index list stays in sync with pages;
specifically, in the orphan-prevention block (the li==0 && isMultiLine branch)
and in the widow-prevention block (li == paragraphLines.size() - 2 branch)
insert the same paragraphIndexPerPage.emplace_back(xpathParagraphIndex) (or
equivalent recording of xpathParagraphIndex) immediately before
completePageFn(std::move(currentPage)), then proceed to increment
completedPageCount and reset currentPage/currentPageNextY as done now.
---
Outside diff comments:
In `@lib/GfxRenderer/GfxRenderer.cpp`:
- Around line 896-904: The issue is that trackingFP is being applied even when
the following codepoint is a combining mark, causing the accent to be positioned
relative to a tracked cursor instead of the base glyph; update the flow around
utf8IsCombiningMark(...) so that when you detect a combining mark (in the blocks
around the current snippet and the similar block at 936-939) you do not include
trackingFP in cursorFP/lastBaseAdvanceFP calculations used for positioning
(i.e., keep lastBaseAdvanceFP and baseSnapX computed from the base glyph advance
without adding trackingFP), and ensure combiningMark::centerOver/raiseAboveBase
use lastBaseLeft/lastBaseWidth from the untracked base so renderCharImpl places
the combining glyph correctly.
---
Nitpick comments:
In `@lib/EpdFont/scripts/fontconvert.py`:
- Around line 920-954: The 4-bit-to-2-bit packing logic is duplicated; extract
it into a helper like convert_4bit_to_2bit(pixels4g, width, rows) that
implements the existing packing loop (compute pitch, iterate y/x, shift/scale
4-bit bm -> 2-bit, pack into bytes and pad tail) and returns pixels2b, then
replace the inline conversion in the alternate path (the block building pixels4g
-> pixels2b) with a call to that helper, and also replace the identical loop in
the primary glyph rendering path (the code around the previous duplicate) to
call the same helper so both paths share the same implementation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 04b8a1b3-8127-474c-8a52-2f025b521aa1
📒 Files selected for processing (28)
lib/EpdFont/EpdFontData.hlib/EpdFont/FontDecompressor.cpplib/EpdFont/FontDecompressor.hlib/EpdFont/builtinFonts/bookerly_12_bold.hlib/EpdFont/builtinFonts/bookerly_12_bolditalic.hlib/EpdFont/builtinFonts/bookerly_12_italic.hlib/EpdFont/builtinFonts/bookerly_12_regular.hlib/EpdFont/builtinFonts/bookerly_14_bold.hlib/EpdFont/builtinFonts/bookerly_14_bolditalic.hlib/EpdFont/builtinFonts/bookerly_14_italic.hlib/EpdFont/builtinFonts/bookerly_14_regular.hlib/EpdFont/builtinFonts/bookerly_16_bold.hlib/EpdFont/builtinFonts/bookerly_16_bolditalic.hlib/EpdFont/builtinFonts/bookerly_16_italic.hlib/EpdFont/builtinFonts/bookerly_16_regular.hlib/EpdFont/builtinFonts/bookerly_18_bold.hlib/EpdFont/builtinFonts/bookerly_18_bolditalic.hlib/EpdFont/builtinFonts/bookerly_18_italic.hlib/EpdFont/builtinFonts/bookerly_18_regular.hlib/EpdFont/scripts/fontconvert.pylib/Epub/Epub/ParsedText.cpplib/Epub/Epub/ParsedText.hlib/Epub/Epub/Section.cpplib/Epub/Epub/blocks/TextBlock.cpplib/Epub/Epub/blocks/TextBlock.hlib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpplib/GfxRenderer/GfxRenderer.cpplib/GfxRenderer/GfxRenderer.h
👮 Files not reviewed due to content moderation or server errors (8)
- lib/EpdFont/builtinFonts/bookerly_16_regular.h
- lib/EpdFont/builtinFonts/bookerly_16_bolditalic.h
- lib/EpdFont/builtinFonts/bookerly_18_bold.h
- lib/EpdFont/builtinFonts/bookerly_18_italic.h
- lib/EpdFont/builtinFonts/bookerly_16_bold.h
- lib/EpdFont/builtinFonts/bookerly_18_bolditalic.h
- lib/EpdFont/builtinFonts/bookerly_18_regular.h
- lib/EpdFont/builtinFonts/bookerly_16_italic.h
- Re-apply paragraphTracking to recomputed widths in hyphenation retry path; charCounts is stale after word merges so recount inline - Preserve suppressHyphenationRetry from layout callback in CollectedLine and pass through to addLineToPage, restoring page-boundary hyphenation retry behaviour - Record paragraphIndexPerPage before completePageFn() in both orphan and widow prevention branches to keep nav metadata in sync - Skip tracking before combining marks in drawText() so accents centre over the base glyph at its untracked position Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp (1)
1505-1509:⚠️ Potential issue | 🟠 MajorPage-boundary hyphenation retry is still bypassed.
Returning
Acceptedfrom the collector meansParsedText::layoutAndExtractLines()never gets a chance to rerender a final hyphenated line, and the lateraddLineToPage()call at Line 1545 drops anyRetryWithoutHyphenationresult on the floor. Bottom-of-page split words will still regress here.Also applies to: 1545-1546
lib/Epub/Epub/ParsedText.cpp (1)
236-247:⚠️ Potential issue | 🟠 MajorTracked widths still go stale after a word splits.
These paths pre-bake
paragraphTrackingintowordWidths, but the subsequent hyphenating breakers can still callhyphenateWordAtIndex(), which rewrites the prefix/remainder widths with natural measurements. After the first split, break selection andextractLine()are again working from untracked widths, so justified paragraphs that hyphenate can still lay out narrower than they render.Also applies to: 268-277, 329-345
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/Epub/Epub/ParsedText.cpp` around lines 236 - 247, The code pre-applies paragraphTracking into wordWidths (savedWidths/adjustedWidths) but hyphenation (hyphenateWordAtIndex / computeLineBreaksWithHyphenationMeta) can rewrite prefix/remainder widths with natural (untracked) measurements, so reapply the paragraphTracking adjustment to any widths produced or mutated by hyphenation before using them for break selection or extractLine; specifically, after creating adjustedWidths from savedWidths/charCounts/tryTracking and after computeLineBreaksWithHyphenationMeta or any call that may invoke hyphenateWordAtIndex, walk the modified prefix/remainder entries in adjustedWidths (and any tryPrefixes/tryInserted results) and add the same tracking expansion logic ((charCounts[i]-1)*tryTracking...) used originally (clamping to >=1) so break selection and extractLine always operate on tracked widths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/Epub/Epub/ParsedText.cpp`:
- Around line 197-207: The charCounts loop (and the similar blocks at the other
locations) currently counts raw UTF-8 codepoints which includes soft hyphens and
combining marks, causing tracking-slot counts to diverge from rendered width;
change the counting to count "tracking slots" instead by advancing with
utf8NextCodepoint(&p) but only incrementing the counter when the codepoint is
neither U+00AD (soft hyphen) nor a combining mark (use the same predicate/logic
used in drawText()/measureWordWidth() to detect combining characters), so
charCounts (and the blocks at the other two sites) reflect the same slots that
measureWordWidth() and drawText() actually render and track.
In `@lib/GfxRenderer/GfxRenderer.cpp`:
- Around line 900-904: Combining marks are using recomputed base metrics instead
of the actual metrics used when useAlt was chosen; update the combining-mark
placement to use the same snapped base position and metrics from the alt glyph
when altGlyph[glyphIndex] was used (i.e. reuse the already-computed
baseSnapX/floor-snapped value and lastBaseLeft/lastBaseWidth/lastBaseTop that
correspond to the rendered base), and replace the fp4::toPixel(rounding)
recomputation with the floor-snapped value used for alt rendering before calling
combiningMark::centerOver and renderCharImpl<TextRotation::None>; ensure the
same change is applied for both the block that computes baseSnapX/combiningX and
the similar block handling the other path (the second occurrence around the
combining-mark handling).
---
Duplicate comments:
In `@lib/Epub/Epub/ParsedText.cpp`:
- Around line 236-247: The code pre-applies paragraphTracking into wordWidths
(savedWidths/adjustedWidths) but hyphenation (hyphenateWordAtIndex /
computeLineBreaksWithHyphenationMeta) can rewrite prefix/remainder widths with
natural (untracked) measurements, so reapply the paragraphTracking adjustment to
any widths produced or mutated by hyphenation before using them for break
selection or extractLine; specifically, after creating adjustedWidths from
savedWidths/charCounts/tryTracking and after
computeLineBreaksWithHyphenationMeta or any call that may invoke
hyphenateWordAtIndex, walk the modified prefix/remainder entries in
adjustedWidths (and any tryPrefixes/tryInserted results) and add the same
tracking expansion logic ((charCounts[i]-1)*tryTracking...) used originally
(clamping to >=1) so break selection and extractLine always operate on tracked
widths.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fea7bfff-3965-4ff0-a9f7-76a11be2ebc0
📒 Files selected for processing (3)
lib/Epub/Epub/ParsedText.cpplib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpplib/GfxRenderer/GfxRenderer.cpp
Exclude soft hyphens and combining marks from charCounts — these contribute no tracking slots since measureWordWidth() strips soft hyphens and drawText() skips tracking before combining marks. Applied at both the initial count and the inline retry recount. Track lastBaseSnapX (actual pixel snap used per base glyph) so combining marks use the same floor/round snap as their base rather than recomputing with round-to-nearest, which diverged by 1px when an alt glyph was selected. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
lib/Epub/Epub/ParsedText.cpp (1)
973-982:⚠️ Potential issue | 🟡 MinorTracking slot count inconsistent with
charCountslogic.The initial
charCountsat lines 200-207 correctly excludes soft hyphens and combining marks (matchingmeasureWordWidthanddrawTextbehavior). However,totalCharshere counts all codepoints, which will overcount tracking slots for words with soft hyphens or combining marks, resulting in incorrecttrackPerCharFPcalculation.Suggested fix: exclude soft hyphens and combining marks
// Count total characters for tracking distribution - int totalChars = 0; + int totalTrackSlots = 0; for (size_t wi = 0; wi < lineWordCount; wi++) { const auto* p = reinterpret_cast<const unsigned char*>(words[lastBreakAt + wi].c_str()); - while (*p) { utf8NextCodepoint(&p); totalChars++; } + while (*p) { + const uint32_t cp = utf8NextCodepoint(&p); + if (cp != 0x00AD && !utf8IsCombiningMark(cp)) totalTrackSlots++; + } } // drawText applies tracking (Ni - 1) times per word (between chars, not after last). // Total inter-character slots = totalChars - lineWordCount (one less per word). - const int trackSlots = totalChars - static_cast<int>(lineWordCount); + const int trackSlots = totalTrackSlots - static_cast<int>(lineWordCount);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/Epub/Epub/ParsedText.cpp` around lines 973 - 982, The code counts all codepoints into totalChars but charCounts (and functions measureWordWidth/drawText) ignore soft hyphens and combining marks, so compute totalChars the same way: iterate words[lastBreakAt + wi] with utf8NextCodepoint to get each codepoint and increment totalChars only when the codepoint is not a soft hyphen (U+00AD) and not a combining mark; reuse or add the same is-combining check used for charCounts/measureWordWidth, then recompute trackSlots = totalChars - static_cast<int>(lineWordCount) so trackPerCharFP matches charCounts/drawText behavior (reference variables/functions: totalChars, trackSlots, lineWordCount, words, lastBreakAt, utf8NextCodepoint, charCounts, measureWordWidth, drawText).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/Epub/Epub/ParsedText.cpp`:
- Around line 1015-1022: The countChars lambda currently counts every UTF-8
codepoint via utf8NextCodepoint(&p) but must mirror the tracking-slot
advancement logic used elsewhere (which skips soft hyphens and Unicode combining
marks); update countChars (the lambda named countChars in ParsedText.cpp) to
iterate codepoints and only increment the count when the codepoint is not a soft
hyphen and not a combining mark (use the same predicate/utility used by the
tracking logic or replicate that check), ensuring its behavior matches the
tracking-based position advancement used for word positioning.
---
Duplicate comments:
In `@lib/Epub/Epub/ParsedText.cpp`:
- Around line 973-982: The code counts all codepoints into totalChars but
charCounts (and functions measureWordWidth/drawText) ignore soft hyphens and
combining marks, so compute totalChars the same way: iterate words[lastBreakAt +
wi] with utf8NextCodepoint to get each codepoint and increment totalChars only
when the codepoint is not a soft hyphen (U+00AD) and not a combining mark; reuse
or add the same is-combining check used for charCounts/measureWordWidth, then
recompute trackSlots = totalChars - static_cast<int>(lineWordCount) so
trackPerCharFP matches charCounts/drawText behavior (reference
variables/functions: totalChars, trackSlots, lineWordCount, words, lastBreakAt,
utf8NextCodepoint, charCounts, measureWordWidth, drawText).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ff6044e4-edc2-4e44-beec-734e0b9f157d
📒 Files selected for processing (2)
lib/Epub/Epub/ParsedText.cpplib/GfxRenderer/GfxRenderer.cpp
|
|
||
| // Helper: count UTF-8 codepoints in a word | ||
| auto countChars = [](const std::string& w) -> int { | ||
| int n = 0; | ||
| const auto* p = reinterpret_cast<const unsigned char*>(w.c_str()); | ||
| while (*p) { utf8NextCodepoint(&p); n++; } | ||
| return n; | ||
| }; |
There was a problem hiding this comment.
countChars lambda should match tracking-slot logic.
This lambda counts all codepoints, but word positions are advanced based on tracking which skips soft hyphens and combining marks. For consistency with lines 200-207 and correct position calculation:
Suggested fix
auto countChars = [](const std::string& w) -> int {
int n = 0;
const auto* p = reinterpret_cast<const unsigned char*>(w.c_str());
- while (*p) { utf8NextCodepoint(&p); n++; }
+ while (*p) {
+ const uint32_t cp = utf8NextCodepoint(&p);
+ if (cp != 0x00AD && !utf8IsCombiningMark(cp)) n++;
+ }
return n;
};📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Helper: count UTF-8 codepoints in a word | |
| auto countChars = [](const std::string& w) -> int { | |
| int n = 0; | |
| const auto* p = reinterpret_cast<const unsigned char*>(w.c_str()); | |
| while (*p) { utf8NextCodepoint(&p); n++; } | |
| return n; | |
| }; | |
| // Helper: count UTF-8 codepoints in a word | |
| auto countChars = [](const std::string& w) -> int { | |
| int n = 0; | |
| const auto* p = reinterpret_cast<const unsigned char*>(w.c_str()); | |
| while (*p) { | |
| const uint32_t cp = utf8NextCodepoint(&p); | |
| if (cp != 0x00AD && !utf8IsCombiningMark(cp)) n++; | |
| } | |
| return n; | |
| }; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/Epub/Epub/ParsedText.cpp` around lines 1015 - 1022, The countChars lambda
currently counts every UTF-8 codepoint via utf8NextCodepoint(&p) but must mirror
the tracking-slot advancement logic used elsewhere (which skips soft hyphens and
Unicode combining marks); update countChars (the lambda named countChars in
ParsedText.cpp) to iterate codepoints and only increment the count when the
codepoint is not a soft hyphen and not a combining mark (use the same
predicate/utility used by the tracking logic or replicate that check), ensuring
its behavior matches the tracking-based position advancement used for word
positioning.
In makePages(), the dispatch loop ignored addLineToPage's return value. When addLineToPage returned RetryWithoutHyphenation, it exited early without pushing the line to the page — silently dropping content. Since the collect-then-dispatch flow cannot redo layout, retry with lineEndsWithHyphenatedWord=false to bypass the guard and accept the line as-is. Trade-off: one hyphenated word may stay split at the page boundary, which matches jpirnay upstream behavior in the same corner. Fixes flickering reported with 'The Secret History of Mac Gaming': dropped lines caused non-deterministic page counts between cache write and re-layout, making the reader toggle between two renderings. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Just providing some context here based on my evaluation of upstream PR #1662. The comment below is directly from the other PR. Important caveats:
After using this for a few days, I think it's a regression from what's in main.
Yes, that's true. And we need to work within those constraints. The fundamental problem with this PR is that the kerning choice for any given pair of letters depends not only on the letters theselves, but also where those letters happen to fall in the accumulated X position across the line. This reverts exactly what #1413 was fixing. Glyphs must snap to a pixel, yes. But if we pick the kerning snap including accumulated fixed-point X position error across the line, we get inconsistencies in common kerning pairs. Here are a number of examples to compare:
@ingpaschke Can you please provide specific examples of what looks smoother in rendering with this PR versus master? If it's about the right-edge alignment with justified text, maybe we'd be better with something like this patch: |


















Summary
This PR brings sub-pixel text rendering quality improvements for justified text, originally developed for the upstream CrossPoint repo (PR #1676).
I came across CrossPoint++ while looking at forks and thought the rendering improvements would complement your focus on reading quality. I've verified it works on top of your master — feel free to adapt or trim as you see fit.
What's included
Notes
Summary by CodeRabbit
New Features
Improvements
Chores