Add softmax impl mode switch and benchmark/test wiring #39
Merged
debashishc merged 2 commits into `init-harness` on Feb 8, 2026
(cherry picked from commit 9eb0b84)
debashishc added a commit that referenced this pull request on Feb 8, 2026
## Summary
- fix stale docs references to removed `CLAUDE.md`
- correct public API wording in `DEVELOPMENT.md` (`forge_cute_py.ops.<op>()`)
- update README benchmark quick reference with softmax benchmark commands and impl-mode notes
- clarify current release state in `CONTRIBUTING.md` (`v0.1.0-rc1` exists; final `v0.1.0` pending)
- add changelog bullets for softmax impl-mode selection and benchmark CLI updates

## Scope
- docs-only patch (`README.md`, `DEVELOPMENT.md`, `ROADMAP.md`, `CHANGELOG.md`, `CONTRIBUTING.md`)
- no runtime code changes

## Notes
- follow-up to #39
## Summary
- add `FORGE_SOFTMAX_IMPL` mode selection to `softmax_online` (`auto`, `ref`, `kernel`)
- in `kernel` mode, fail fast with clear errors when kernel module/entrypoints are missing
- make `auto` mode contributor-friendly by falling back to the reference implementation
- add `--impl` to `bench/benchmark_online_softmax.py` and remove the hard N divisibility assertion
- make `bench/run.py` skip softmax cases cleanly when strict kernel mode is unavailable
- update `DEVELOPMENT.md`

## Validation
- `uv run ruff check forge_cute_py/ops/softmax_online.py tests/test_softmax_online.py bench/benchmark_online_softmax.py bench/run.py`
- `uv run ruff format forge_cute_py/ops/softmax_online.py tests/test_softmax_online.py bench/benchmark_online_softmax.py bench/run.py`
- `uv run pytest tests/test_softmax_online.py -q`
- `uv run pytest -q`
- `uv run python bench/benchmark_online_softmax.py --m-sizes 64 --n-sizes 256 --dtypes float16 --warmup 2 --iterations 5 --impl auto`
- `uv run python bench/benchmark_online_softmax.py --m-sizes 64 --n-sizes 256 --dtypes float16 --warmup 2 --iterations 5 --impl kernel`
- `FORGE_SOFTMAX_IMPL=kernel uv run python bench/run.py --suite smoke --op softmax_online`

## Notes
- jonah <jsamost@gmail.com> (a9d4983)
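
For reviewers who want a feel for the three modes without opening the diff, here is a rough sketch of how an `auto`/`ref`/`kernel` switch driven by `FORGE_SOFTMAX_IMPL` might behave. It is not the code in `forge_cute_py/ops/softmax_online.py`: the helper names (`select_softmax`, `load_kernel`, `reference_softmax`), the module path `hypothetical_kernels.softmax_online`, the entrypoint name `softmax_fwd`, and the choice of `auto` as the default are all assumptions made for illustration, and a plain-Python softmax stands in for the real reference path.

```python
import importlib
import math
import os

VALID_MODES = ("auto", "ref", "kernel")


def reference_softmax(row):
    """Numerically stable softmax over a list of floats (stand-in reference path)."""
    row_max = max(row)
    exps = [math.exp(x - row_max) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def load_kernel(module_name="hypothetical_kernels.softmax_online",
                entrypoint="softmax_fwd"):
    """Return the kernel callable; raises ImportError or AttributeError if absent.

    Both the module path and the entrypoint name are hypothetical.
    """
    module = importlib.import_module(module_name)
    return getattr(module, entrypoint)


def select_softmax(mode=None, loader=load_kernel):
    """Pick an implementation from an explicit mode or FORGE_SOFTMAX_IMPL.

    Defaulting to "auto" when the variable is unset is an assumption.
    """
    mode = (mode or os.environ.get("FORGE_SOFTMAX_IMPL", "auto")).lower()
    if mode not in VALID_MODES:
        raise ValueError(
            f"FORGE_SOFTMAX_IMPL must be one of {VALID_MODES}, got {mode!r}"
        )
    if mode == "ref":
        return reference_softmax
    try:
        return loader()
    except (ImportError, AttributeError) as exc:
        if mode == "kernel":
            # Strict mode: report the missing module/entrypoint immediately.
            raise RuntimeError(
                f"kernel mode requested but the kernel is unavailable: {exc}"
            ) from exc
        # auto mode: fall back so contributors without a kernel can still run.
        return reference_softmax
```

With no kernel module installed, `select_softmax("auto")` and `select_softmax("ref")` both return the reference function, while `select_softmax("kernel")` raises a `RuntimeError`. A benchmark runner could catch that error and skip the case rather than abort the suite.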