Skip to content

Comments

vello_hybrid: Remove command annotation logic#1438

Open
laurenz-canva wants to merge 13 commits intolinebender:mainfrom
laurenz-canva:laurenz/wtile_cmd
Open

vello_hybrid: Remove command annotation logic#1438
laurenz-canva wants to merge 13 commits intolinebender:mainfrom
laurenz-canva:laurenz/wtile_cmd

Conversation

@laurenz-canva
Copy link
Contributor

@laurenz-canva laurenz-canva commented Feb 13, 2026

This PR is stacked on top of #1436, so that should land first.

It removes the command annotation logic that is needed to handle blending correctly, and instead moves the bookkeeping into vello_common during coarse rasterization. Basically, we add a new Start command that tells us whether the original surface is a target of blending operations, and each PushBuf command has a similar flag. In vello_hybrid, we simply need to query this flag and add some additional processing if the root layer is the target of blending. I've considered putting the logic behind the const generic of Wide, but I think it's probably not worth as it makes the code less straightforward, and I think the performance overhead is negligible.

While not as big of a performance improvement as I had hoped, it's still something:
Before:
image

After:
image

@LaurenzV LaurenzV marked this pull request as draft February 13, 2026 11:35
# Conflicts:
#	sparse_strips/vello_hybrid/src/schedule.rs
@LaurenzV LaurenzV marked this pull request as ready for review February 13, 2026 13:36
Copy link
Contributor

@ajakubowicz-canva ajakubowicz-canva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!
LGTM if looks good to @grebmeg

///
/// TODO: Can be triggered via a const generic on coarse draw cmd generation which will avoid
/// a linear scan.
fn prepare_cmds(cmds: &[Cmd], state: &mut SchedulerState) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I frequently see this function show up in browser performance profiles, so I wouldn't be surprised if we see non-trivial performance improvements in Wasm 🎉

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g., this frame took 300 microseconds in graphics. 100 of which are in this function!

Image


let next_round: bool = depth.is_multiple_of(2) && depth > 2;
let round = nos.round.max(tos.round + usize::from(next_round));
#[inline(always)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it make sense to do inline always compared to just inline?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense if we really want to be sure, but both would probably work in this case. Just did it because of Andrew's suggestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I was just thinking if there are cases were a compiler might not want to inline it because it might be faster not to?

wide.pop_buf();

assert_eq!(wide.cmds.len(), 4);
assert_eq!(wide.cmds.len(), 5);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we test that the Start cmd is actually the first entry?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable 👍

# Conflicts:
#	sparse_strips/vello_common/src/coarse.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants