Open
26 commits
465492f
feat: draft kernel tuning based on vllm fns
llcnt Dec 8, 2025
c655e07
feat: draft kernel tuning based on vllm fns
llcnt Dec 8, 2025
5112b36
feat: add benchmark fn and saving draft
llcnt Dec 8, 2025
617247d
feat: clean and simplify tuning and config saving
llcnt Dec 18, 2025
30fe4ee
feat: add custom loading fn
llcnt Dec 18, 2025
ef49802
feat: add custom loading fn
llcnt Dec 18, 2025
2e191c7
feat: add unit test
llcnt Dec 19, 2025
35a827d
feat: add vllm dep and upd torch version
llcnt Dec 22, 2025
fb7404b
feat: change smashconfig to save artifacts and reload it
llcnt Dec 22, 2025
9e55dae
fix: adapt parameter names inside smashconfig
llcnt Dec 23, 2025
d008de8
fix: moe intermediate size can differ from model intermediate size
llcnt Dec 23, 2025
9849c5e
feat: ruff linting
llcnt Dec 23, 2025
36f63a7
feat: ty check linting
llcnt Dec 23, 2025
8c53f8c
fix: npdoc space issue
llcnt Dec 24, 2025
15a93ee
fix: minor bugs from review
llcnt Jan 7, 2026
01e3096
feat: adapt tuned configs saving to the new artifact savings
llcnt Feb 9, 2026
3f3bffa
fix: rebase issue adding double commas
llcnt Feb 9, 2026
e779dbb
fix: ruff linting
llcnt Feb 9, 2026
61cc1d2
feat: upd docstrings on moe artifacts
llcnt Feb 9, 2026
dc52abb
feat: add try execpt for ray
llcnt Feb 9, 2026
243e1b6
fix: moe artifacts load fn takes 3 args not 2
llcnt Feb 10, 2026
e940c64
fix: review comment draft
llcnt Feb 17, 2026
12d5b9f
fix: linting
llcnt Feb 17, 2026
b16a44f
feat: ruff on new imports
llcnt Feb 18, 2026
2997fdf
fix: ruff on new import
llcnt Feb 18, 2026
f684372
feat: adapt pruna logger when triton version mismatch
llcnt Feb 18, 2026
2 changes: 1 addition & 1 deletion .github/actions/setup-uv-project/action.yml
@@ -12,4 +12,4 @@ runs:
       github-token: ${{ github.token }}

   - shell: bash
-    run: uv sync --extra dev
+    run: uv sync --extra dev --extra vllm
4 changes: 4 additions & 0 deletions pyproject.toml
@@ -141,6 +141,10 @@ dependencies = [
 ]

 [project.optional-dependencies]
+vllm = [
+    "vllm>=0.11.0",
+    "ray",
+]
 stable-fast = [
     "xformers>=0.0.30",
     "stable-fast-pruna==1.0.8",
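
Because `vllm` and `ray` are declared as an optional extra, code that relies on them may want to check for their presence before importing. The following is a minimal sketch of such a check, not the PR's implementation; the helper name `missing_optional_packages` and the constant `VLLM_EXTRA` are hypothetical and introduced only for illustration.

```python
import importlib.util


def missing_optional_packages(packages):
    """Return the names in `packages` that cannot be located in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in packages if importlib.util.find_spec(name) is None]


# Hypothetical constant mirroring the packages listed under the `vllm` extra.
VLLM_EXTRA = ["vllm", "ray"]

missing = missing_optional_packages(VLLM_EXTRA)
if missing:
    print(f"Optional packages not found: {missing}; install the 'vllm' extra to use them.")
```

A check like this keeps the base install usable without the extra while giving a clear message when the extra is actually needed.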
18 changes: 18 additions & 0 deletions src/pruna/algorithms/base/tags.py
@@ -80,6 +80,24 @@ class AlgorithmTag(Enum):
         "Resamplers change the shape of image or video latents during generation to speed up inference.",
     )

+    @classmethod
+    def tags_compatible_with_moe_kernel(cls) -> list["AlgorithmTag"]:
+        """
+        Return tags that the MoE kernel tuner is compatible with (for ordering).
+
+        Used so that compatible_before / compatible_after can be derived from
+        the enum and stay in sync when new tags are added.
+        """
+        return [
+            cls.KERNEL,
+            cls.QUANTIZER,
+            cls.PRUNER,
+            cls.CACHER,
+            cls.FACTORIZER,
+            cls.BATCHER,
+            cls.COMPILER,
+        ]
+
     def __init__(self, name: str, description: str):
         """
         Initialize an algorithm tag with name and description.
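
The docstring of `tags_compatible_with_moe_kernel` says the list exists so that `compatible_before` / `compatible_after` can be derived from the enum. The sketch below shows that pattern in isolation, under stated assumptions: `Tag` is a simplified stand-in for `AlgorithmTag` with placeholder descriptions, the attribute `tag_name` is an assumption, and `HypotheticalMoeKernelTuner` is an invented class, not the one added in this PR.

```python
from enum import Enum


class Tag(Enum):
    """Simplified stand-in for AlgorithmTag (placeholder members and descriptions)."""

    KERNEL = ("kernel", "placeholder description")
    QUANTIZER = ("quantizer", "placeholder description")
    COMPILER = ("compiler", "placeholder description")
    RESAMPLER = ("resampler", "placeholder description")

    def __init__(self, tag_name: str, description: str):
        # Enum unpacks each member's tuple value into these arguments.
        self.tag_name = tag_name
        self.description = description

    @classmethod
    def tags_compatible_with_moe_kernel(cls) -> list["Tag"]:
        # Single source of truth for which tags may be ordered around the tuner.
        return [cls.KERNEL, cls.QUANTIZER, cls.COMPILER]


class HypotheticalMoeKernelTuner:
    """Invented consumer that derives its ordering constraints from the enum."""

    compatible_before = Tag.tags_compatible_with_moe_kernel()
    compatible_after = Tag.tags_compatible_with_moe_kernel()


print([tag.tag_name for tag in HypotheticalMoeKernelTuner.compatible_before])
```

Keeping the list on the enum means a newly added tag only has to be registered in one place for every consumer of the classmethod to pick it up.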