parley: make builders emit indexed style runs directly, with tree style-id reuse#559
Conversation
|
This can be viewed on a commit by commit basis. This is groundwork for improved style sharing with #516. |
|
@nicoburns You might like the changes here to the tree builder. |
b7f6709 to
975f9d2
Compare
xStrom
left a comment
There was a problem hiding this comment.
Overall this looks like a good step towards more efficient memory usage. I do have a few inline questions though.
| for style in &normalized.styles { | ||
| if let Some(index) = canonical_styles | ||
| .iter() | ||
| .position(|existing| existing == style) | ||
| { | ||
| remap.push(index as u16); | ||
| } else { | ||
| let index = canonical_styles.len() as u16; | ||
| canonical_styles.push(style.clone()); | ||
| remap.push(index); | ||
| } | ||
| } |
There was a problem hiding this comment.
So right now only the TreeBuilder deduplicates styles and RangedBuilder still generates duplicate styles?
I can understand canonicalizing the sort order of styles, however deduplication is beyond what I would consider canonicalization and is closer to hiding issues from the test.
I know there's also #516 in flight etc, so I'd like to better understand what the state is here and what the plans are. Because conceptually I think both builders could be efficient with style reuse. Maybe not as part of this PR, but then I'd like to at least document that this canonicalize_layout_data is a temporary hack to work around that issue.
There was a problem hiding this comment.
The tree-builder does not de-duplicate strictly speaking. It re-uses from the same nesting level as you pop / push.
The ranged builder does not do that ... I had changes where I used a HashMap to actually de-duplicate and the overhead came out to roughly zero ... but I didn't want to introduce that level of change here (and a canonicalization would still be required for this test because the de-duplicated structure would still be different).
The test still ensures that the end result of the built structure has the same styles now, so the goal of the test is being met.
People who want the best scenario will end up using the future StyleRunBuilder which allows the caller to feed things through and handle de-duplication on their own.
#516 is my own PR and that's what led to me submitting this one ... and some other stuff that isn't ready at all where I am working on a model of styled_text which also de-duplicates a lot of stuff at the higher levels.
Move layout building to a shared internal representation of: - a style table (resolved styles) - style runs (byte ranges + style ids) Motivation: - establish a single internal shape that all builders can target - keep analysis and shaping aligned on indexed style data - improve phase boundaries by keeping style normalization in build code (analysis no longer mutates style input state) - prepare for future fast paths (for example, precomputed style-id runs) This commit intentionally does not add style interning; it changes the architecture only. Ranged/Tree inputs are lowered into style-table/style-run state on LayoutContext and then consumed by analysis + shaping. parley: make builders prepare style runs before build_into_layout
parley: lower ranged styles directly into style table and runs
975f9d2 to
91bf57b
Compare
Refactors Parley’s style-lowering pipeline so builder outputs are consumed in a single indexed representation (style_table + style_runs) and removes redundant intermediate style state.
End state vs main
TreeBuilderandRangedBuilderboth lower directly into indexed style data used by analysis/shaping/layout.LayoutContext::stylesintermediate state is removed.TreeStyleBuildernow reuses style IDs when returning to previously active tree nodes (for example "A B C" reuses the same outer style entry for A and C).RangedStyleBuilderlowers directly to style table/runs and uses reusable scratch storage to avoid per-build temporary allocations.Motivation