PoC: Fast-path for skipping coarse rasterization and scheduling#1454

Draft
laurenz-canva wants to merge 1 commit into main from laurenz/poc_fast_path

@laurenz-canva
Contributor

Note that this was AI-generated. I haven't reviewed it in depth yet, and we can probably be smarter about how the strips are stored, so no nitpicky review please. 😄 But this should demonstrate what I was imagining, and all tests seem to be passing.

As Alex rightly highlighted, this does have the disadvantage of not allowing the "if there is an opaque fill, clear all previous fills" optimization. However, it seems to me that this should be outweighed by the improvements that come from not doing scheduling and coarse rasterization. Here are the timings for rendering 1000 frames of the Ghostscript tiger:

Before (note in particular `Wide::generate` and `Scheduler::do_scene`):
image

After:
image

/// This replicates the strip→GpuStrip conversion that normally happens across
/// `Wide::generate` + `Scheduler::do_tile`, but for the simple case where all draws
/// happen at depth=1 directly to the surface with no layers or blending.
pub(crate) fn build_gpu_strips_direct(
Contributor Author

We can probably reuse existing code here. As mentioned, I haven't cleaned this up yet; it's just a PoC.
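
For readers following along, here is a minimal sketch of what a direct strip-to-GPU-strip conversion could look like when every draw goes straight to the surface. This is not the PR's code: `SimpleStrip`, `DrawCmd`, `DirectGpuStrip`, and `build_direct` are hypothetical simplifications, and the sketch assumes each draw already has its sparse strips and a paint index. Because nothing needs an intermediate buffer, emitting strips in draw order is enough to preserve painter's order.

```rust
/// Hypothetical, simplified strip: one run of pixels on a strip row.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SimpleStrip {
    pub x: u16,
    pub y: u16,
    pub width: u16,
    /// Offset into a shared alpha (coverage) buffer.
    pub alpha_offset: u32,
}

/// Hypothetical draw command: the strips of one path plus its paint index.
#[derive(Debug, Clone)]
pub struct DrawCmd {
    pub strips: Vec<SimpleStrip>,
    pub paint: u32,
}

/// Hypothetical GPU-side strip, ready to upload as instance data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DirectGpuStrip {
    pub x: u16,
    pub y: u16,
    pub width: u16,
    pub alpha_offset: u32,
    pub paint: u32,
}

/// Converts draws to GPU strips in draw order, with no tiling or scheduling.
/// Assumption: no layers and no blending that needs an intermediate buffer.
pub fn build_direct(
    draws: &[DrawCmd],
    surface_width: u16,
    surface_height: u16,
) -> Vec<DirectGpuStrip> {
    let mut out = Vec::new();
    for draw in draws {
        for s in &draw.strips {
            // Skip empty strips and strips that start outside the surface.
            if s.width == 0 || s.x >= surface_width || s.y >= surface_height {
                continue;
            }
            // Clamp the right edge to the surface.
            let width = s.width.min(surface_width - s.x);
            out.push(DirectGpuStrip {
                x: s.x,
                y: s.y,
                width,
                alpha_offset: s.alpha_offset,
                paint: draw.paint,
            });
        }
    }
    out
}
```

The single pass over the draws is the point of the sketch: there is no per-tile command list to build and no scheduling step to run afterwards.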

@laurenz-canva laurenz-canva marked this pull request as draft February 18, 2026 10:41
Contributor

@taj-p taj-p left a comment

There are parallels between this work and the blit pipeline.

If we go this route, I don't think we need to flush the fast path in push layer. We could instead flush only the paths that intersect the bounds produced in pop layer. Depending on the layer type, we might not even need to do that (an opacity layer with source-over blending, for example).

This could also be batched. The fast path can be re-enabled after we pop layer.

In my work on the blit pipeline, there is a batching mechanism you could reuse if we think this is the right strategy to take.
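
A rough sketch of the deferred flush described above, under stated assumptions: `Bounds`, `PendingDraw`, and `FastPathQueue` are hypothetical names, not types from the codebase. One deliberate deviation is labeled in the code: instead of flushing only the intersecting draws, the sketch flushes the whole prefix up to the last intersecting draw, which is a conservative way to keep painter's order among overlapping fast-path draws.

```rust
/// Axis-aligned bounds in pixel coordinates.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Bounds {
    pub x0: f32,
    pub y0: f32,
    pub x1: f32,
    pub y1: f32,
}

impl Bounds {
    /// True if the two rectangles overlap with non-zero area.
    pub fn intersects(&self, other: &Bounds) -> bool {
        self.x0 < other.x1 && other.x0 < self.x1 && self.y0 < other.y1 && other.y0 < self.y1
    }
}

/// Hypothetical fast-path draw that has not been submitted yet.
#[derive(Debug, Clone, PartialEq)]
pub struct PendingDraw {
    pub id: u32,
    pub bounds: Bounds,
}

/// Hypothetical queue of fast-path draws with layer tracking.
#[derive(Default)]
pub struct FastPathQueue {
    pending: Vec<PendingDraw>,
    layer_depth: u32,
}

impl FastPathQueue {
    /// The fast path is only usable while no layer is open.
    pub fn fast_path_active(&self) -> bool {
        self.layer_depth == 0
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }

    /// Queues a draw on the fast path; returns false if the caller
    /// has to fall back to the regular pipeline.
    pub fn draw(&mut self, draw: PendingDraw) -> bool {
        if self.fast_path_active() {
            self.pending.push(draw);
            true
        } else {
            false
        }
    }

    /// Opening a layer does not flush anything.
    pub fn push_layer(&mut self) {
        self.layer_depth += 1;
    }

    /// On the outermost pop, returns the draws that must be submitted before
    /// the layer is composited. Assumption: flushing the prefix up to the last
    /// intersecting draw (not just the intersecting ones) preserves draw order.
    pub fn pop_layer(&mut self, layer_bounds: Bounds) -> Vec<PendingDraw> {
        self.layer_depth = self.layer_depth.saturating_sub(1);
        if self.layer_depth > 0 {
            return Vec::new();
        }
        match self.pending.iter().rposition(|d| d.bounds.intersects(&layer_bounds)) {
            Some(last) => self.pending.drain(..=last).collect(),
            None => Vec::new(),
        }
    }
}
```

Draws left in the queue after `pop_layer` do not touch the layer's bounds, so they can stay pending and be batched with later fast-path draws once the fast path is re-enabled.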

@LaurenzV
Collaborator

Sounds good, looking forward to the PR! Yes, you could probably optimize this even further, but this is the bare minimum and should already be a good improvement in many cases. 😄

@taj-p
Contributor

taj-p commented Feb 18, 2026

One thing I didn't state but that I hope is implied: AMAZING to have a POC so quickly to validate the approach. Very cool!!! 🎉
