Skip to content

Workaround for stack overflow in stream refine usage. #20749

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

benvanik
Copy link
Collaborator

@benvanik benvanik commented May 7, 2025

On certain models (like that in #19832) kvcache updates end up as massively long sequences of tied dispatches. The attributor-style analysis uses recursion to initialize elements on first use. Tied ops need to walk through the operand to the result and transitively to further uses. When their powers combine we get stack overflows.

There are a few ways around this but this change to avoid update is the simplest. It's not a great solution as it may lead to more iterations as values initialized during ValueResourceUsage update will require at least one iteration to perform their initial update and then one more to quiesce. The best solution would be to avoid recursion by changing the API for querying elements but that's a much larger bit of refactoring. If this does cause issues (like major regressions in compile times) there's probably some other ways that could be explored but here's to hoping no one notices anything.

(filed #20748 to track the history of reverts if one is required here).

On certain models (like that in #19832) kvcache updates end up as
massively long sequences of tied dispatches. The attributor-style
analysis uses recursion to initialize elements on first use. Tied
ops need to walk through the operand to the result and transitively
to further uses. When their powers combine we get stack overflows.

There are a few ways around this but this change to avoid update
is the simplest. It's not a great solution as it may lead to more
iterations as values initialized during ValueResourceUsage update
will require at least one iteration to perform their initial update
and then one more to quiesce. The best solution would be to avoid
recursion by changing the API for querying elements but that's
a much larger bit of refactoring. If this does cause issues (like
major regressions in compile times) there's probably some other ways
that could be explored but here's to hoping no one notices anything.

history of reverts if one is required here).
@benvanik benvanik requested a review from MaheshRavishankar May 7, 2025 22:41
@benvanik benvanik marked this pull request as ready for review May 8, 2025 02:07
@benvanik benvanik enabled auto-merge (squash) May 8, 2025 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant