Revert "Rename triton function" by LoserCheems · Pull Request #269 · HKUSTDial/flash-sparse-attention

LoserCheems · 2026-04-21T09:33:41Z

Reverts #268

Copilot

Pull request overview

Reverts prior Triton module/function rename by switching call sites back to the *_fwd.py / *_bwd.py module naming, and reintroducing dedicated forward-combine and backward pre/post-processing modules used by the attention kernels.

Changes:

- Update imports in tests and the Triton interface to use the `flash_*_{fwd,bwd}` modules.
- Replace the previous combine/preprocess/postprocess module references with `flash_fwd_combine`, `flash_bwd_preprocess`, and `flash_bwd_postprocess`.
- Add new Triton kernels/modules for forward split-KV combine and backward preprocess/postprocess.
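For readers unfamiliar with the split-KV combine step, here is a minimal pure-Python sketch of the usual log-sum-exp merge for a single query row. It is not the Triton kernel from this PR: `attention_row` and `combine_split_kv` are hypothetical names, and the sketch assumes each split's partial output is already normalized within that split and that LSE values are natural logs.

```python
import math


def attention_row(scores, values):
    """Reference softmax attention for one query row; returns (output, lse)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    denom = sum(exps)
    dim = len(values[0])
    out = [sum(e * v[d] for e, v in zip(exps, values)) / denom for d in range(dim)]
    return out, m + math.log(denom)


def combine_split_kv(partial_out, partial_lse):
    """Merge per-split (output, lse) pairs into the final (output, lse).

    Hypothetical helper: each split i contributes with weight
    exp(lse_i - lse_final), which undoes its local normalization.
    """
    dim = len(partial_out[0])
    lse_max = max(partial_lse)
    if lse_max == -math.inf:
        # Every split was empty (fully masked); return zeros.
        return [0.0] * dim, -math.inf
    weights = [math.exp(l - lse_max) for l in partial_lse]
    total = sum(weights)
    lse = lse_max + math.log(total)
    out = [
        sum(w * o[d] for w, o in zip(weights, partial_out)) / total
        for d in range(dim)
    ]
    return out, lse
```

Splitting the keys into two halves, running `attention_row` on each half, and passing the results to `combine_split_kv` should reproduce the unsplit result up to floating-point error.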

Reviewed changes

Copilot reviewed 8 out of 11 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/test_utils.py | Updates test imports to the reverted `_fwd`/`_bwd` module names. |
| flash_sparse_attn/ops/triton/interface.py | Updates interface imports to the reverted module names. |
| flash_sparse_attn/ops/triton/flash_sparse_fwd.py | Switches split-KV combine call to `flash_fwd_combine`. |
| flash_sparse_attn/ops/triton/flash_sparse_bwd.py | Switches backward pre/post steps to `flash_bwd_preprocess`/`flash_bwd_postprocess`. |
| flash_sparse_attn/ops/triton/flash_gated_fwd.py | Switches split-KV combine call to `flash_fwd_combine`. |
| flash_sparse_attn/ops/triton/flash_gated_bwd.py | Switches backward pre/post steps to `flash_bwd_preprocess`/`flash_bwd_postprocess`. |
| flash_sparse_attn/ops/triton/flash_dense_fwd.py | Switches split-KV combine call to `flash_fwd_combine`. |
| flash_sparse_attn/ops/triton/flash_dense_bwd.py | Switches backward pre/post steps to `flash_bwd_preprocess`/`flash_bwd_postprocess`. |
| flash_sparse_attn/ops/triton/flash_fwd_combine.py | Adds Triton kernel + wrapper to combine split-KV partial outputs/LSE into final output/LSE. |
| flash_sparse_attn/ops/triton/flash_bwd_preprocess.py | Adds Triton kernel + wrapper to compute `dpsum`, `lse_log2`, and initialize `dq_accum`. |
| flash_sparse_attn/ops/triton/flash_bwd_postprocess.py | Adds Triton kernel + wrapper to scale/store `dq` (and optional `da`) from accumulators. |
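As a rough illustration of what the backward pre/post steps described above typically compute, the following pure-Python sketch follows common FlashAttention conventions. These are assumptions, not details confirmed by this PR: `dpsum` is taken as the row-wise sum of `out * dout`, `lse_log2` as the LSE rescaled by `log2(e)`, and the final `dq` as the accumulator multiplied by a `softmax_scale` factor. The function names and signatures are hypothetical, and the optional `da` output is omitted.

```python
import math

LOG2_E = math.log2(math.e)


def bwd_preprocess(out, dout, lse):
    """Hypothetical reference for the preprocess step (one head, row-major lists).

    Returns (dpsum, lse_log2, dq_accum).
    """
    # dpsum[i] = sum_d out[i][d] * dout[i][d]
    dpsum = [sum(o * g for o, g in zip(orow, grow)) for orow, grow in zip(out, dout)]
    # Rescale natural-log LSE so that 2 ** lse_log2 equals exp(lse).
    lse_log2 = [l * LOG2_E for l in lse]
    # Zero-initialized accumulator with the same shape as out (assumed shape).
    dq_accum = [[0.0] * len(orow) for orow in out]
    return dpsum, lse_log2, dq_accum


def bwd_postprocess(dq_accum, softmax_scale):
    """Hypothetical reference for the postprocess step: scale the accumulator."""
    return [[x * softmax_scale for x in row] for row in dq_accum]
```

In a real kernel the accumulator would usually be kept in float32 and cast to the output dtype at this point; the sketch leaves dtype handling out.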

Revert "Rename triton function"

529d324

Copilot AI review requested due to automatic review settings April 21, 2026 09:33

LoserCheems merged commit 59fce4f into main Apr 21, 2026
3 checks passed

Copilot started reviewing on behalf of LoserCheems April 21, 2026 09:34

Copilot AI reviewed Apr 21, 2026

LoserCheems merged 1 commit into main from revert-268-optim-combine-func

2 participants
