Collaborator

@aleozlx aleozlx commented Oct 4, 2025

📌 Description

Add a BF16 MoE path using trtllm-gen, exposed through the new trtllm_bf16_moe interface.

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).
pytest -x -v tests/moe/test_trtllm_gen_fused_moe.py -k All_BF16

9 passed, 999 skipped

====

pytest  tests/moe/test_trtllm_gen_fused_moe.py

PENDING: an IMA (illegal memory access) is detected in some existing tests.

Reviewer Notes

In the new trtllm_bf16_moe interface, I used * in the argument list to mark which arguments must be passed by keyword. This makes calls less error-prone when a function has a very long argument list. The arguments before * are the commonly used ones; those after it are optional or performance-tuning parameters.
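To illustrate the keyword-only convention described above, here is a minimal sketch. The function name moe_sketch and all of its parameters (num_experts, top_k, tile_tokens_dim, enable_pdl) are hypothetical stand-ins chosen for illustration; they are not the actual trtllm_bf16_moe signature, and plain strings stand in for tensors.

```python
def moe_sketch(routing_logits, hidden_states, num_experts, top_k, *,
               tile_tokens_dim=8, enable_pdl=False):
    """Hypothetical stand-in, NOT the real trtllm_bf16_moe signature.

    Parameters before `*` may be passed by position; parameters after
    `*` can only be passed by keyword.
    """
    # Echo the settings back so the effect of each call is visible.
    return {
        "num_experts": num_experts,
        "top_k": top_k,
        "tile_tokens_dim": tile_tokens_dim,
        "enable_pdl": enable_pdl,
    }


# Common arguments go by position; the tuning knob must be named.
config = moe_sketch("logits", "hidden", 8, 2, tile_tokens_dim=16)

# Passing a tuning knob positionally is rejected at call time,
# so a misordered argument cannot silently land in the wrong slot.
try:
    moe_sketch("logits", "hidden", 8, 2, 16)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

With this layout, a caller who accidentally shifts an argument gets a TypeError instead of a silently misconfigured kernel launch, which is the main benefit when the parameter list is long.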

@aleozlx
Collaborator Author

aleozlx commented Oct 4, 2025

IMA

pytest tests/moe/test_trtllm_gen_fused_moe.py::test_moe_quantization_classes[SwiGlu-Shuffled_MajorK-DSLite-NvFP4xNvFP4-1024-1024-1]

@yzh119
Collaborator

yzh119 commented Oct 4, 2025

Hi @aleozlx, can we confirm whether this IMA is a kernel issue or an integration issue?
