feat(kernelgen): import NKIPyKernelGen as a subfolder #55
Open
shaojiex-aws wants to merge 1 commit into aws-neuron:feat/kernelgen
Import the open_source branch of NKIPyKernelGen into `kernelgen/` as a
self-contained subpackage. NKIPyKernelGen is a compiler that traces NumPy
functions and lowers them to NISA (Neuron Instruction Set Architecture)
for AWS Neuron hardware. Users write kernels in Python with `@trace` and
`knob.knob()` annotations; the compiler handles tiling, memory placement,
layout legalization, and NISA lowering.
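
As a rough illustration of what "tracing" means here, the sketch below records a small program by running an ordinary Python function on placeholder values. It is a toy under stated assumptions: `ToyTracer`, `ToyTracedArray`, and `toy_trace` are hypothetical names invented for this example, the emitted text is not MLIR, and none of it reflects the actual `@trace` or `TracedArray` implementation in `kernelgen/`.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyTracer:
    """Collects one textual instruction per traced operation (hypothetical)."""
    ops: List[str] = field(default_factory=list)
    counter: int = 0

    def fresh(self) -> str:
        # Hand out a new SSA-style result name: %0, %1, ...
        name = f"%{self.counter}"
        self.counter += 1
        return name


class ToyTracedArray:
    """Placeholder value that records operations instead of computing them."""

    def __init__(self, tracer: ToyTracer, name: str) -> None:
        self.tracer = tracer
        self.name = name

    def _binary(self, opname: str, other: "ToyTracedArray") -> "ToyTracedArray":
        result = self.tracer.fresh()
        self.tracer.ops.append(f"{result} = {opname} {self.name}, {other.name}")
        return ToyTracedArray(self.tracer, result)

    def __add__(self, other: "ToyTracedArray") -> "ToyTracedArray":
        return self._binary("add", other)

    def __mul__(self, other: "ToyTracedArray") -> "ToyTracedArray":
        return self._binary("mul", other)


def toy_trace(fn: Callable[..., ToyTracedArray], num_args: int) -> List[str]:
    """Run fn on placeholder arguments and return the recorded instructions."""
    tracer = ToyTracer()
    args = [ToyTracedArray(tracer, f"%arg{i}") for i in range(num_args)]
    out = fn(*args)
    tracer.ops.append(f"return {out.name}")
    return tracer.ops


def scaled_sum(a, b):
    # Written like ordinary array code; the tracer sees each operator call.
    return (a + b) * a


program = toy_trace(scaled_sum, 2)
for line in program:
    print(line)
```

The point of the design is that the kernel author writes plain array expressions, and the recorded instruction list becomes the input to later compiler stages.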
What's included
---------------
- `kernelgen/nkipy_kernelgen/` — Python tracing frontend:
  - `trace.py` (`@trace` decorator)
  - `knob.py` (tensor annotations: mem_space, tile_size, reduction_tile,
    partition_dim)
  - `traced_array.py` (TracedArray wrapping MLIR SSA values)
  - `op_vtable.py` (NumPy op → MLIR lowering table)
  - `transforms/nkipy_opt.py` (pipeline orchestration, shells out to
    `nkipy-opt`)
- `kernelgen/mlir/` — MLIR dialect + C++ passes:
  - `nkipy.annotate` op (target, mem_space, partition_dim, tile_size,
    reduction_tile)
  - 20+ transformation passes under `mlir/lib/Transforms/` implementing
    the 24-pass compilation pipeline (InferLayout, KnobDrivenTiling,
    AnnotateMemorySpace, LegalizeLayout, InsertSpillReload,
    LinalgToNisa, etc.)
- `kernelgen/tests/` — test suite:
  - `passes/` — per-pass FileCheck tests
  - `e2e/` — end-to-end tests (trace → NISA → BIR sim / HW)
  - `unit/` — Python-level unit tests
  - `harness.py` — unified test harness with LLVM/BIR_SIM/HW/FileCheck
    modes
- `kernelgen/examples/` — example kernels
- `kernelgen/compiler_explorer/` — Compiler Explorer wrapper for inspecting
  IR at any pipeline stage
- `kernelgen/setup.py`, `pyproject.toml`, `pytest.ini`, `requirements.txt`
  — build + test configuration (`pip install -e kernelgen/` builds the
  C++ passes via CMake)
- `kernelgen/CLAUDE.md`, `README.md` — pipeline docs and usage notes
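
To make the role of a `tile_size` annotation concrete, the sketch below shows one generic way a per-dimension tile size splits a tensor shape into tiles. It is only an illustration of tiling in general, not the KnobDrivenTiling pass: `ToyKnob` and `tile_slices` are hypothetical names, the field set is copied from the list above, and the `"sbuf"` string is a placeholder rather than a confirmed `mem_space` value.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, Tuple


@dataclass(frozen=True)
class ToyKnob:
    """Hypothetical record of annotation fields named in this description."""
    mem_space: str              # placeholder string, not a real enum value
    tile_size: Tuple[int, ...]  # one tile extent per tensor dimension
    partition_dim: int = 0


def tile_slices(
    shape: Tuple[int, ...], tile_size: Tuple[int, ...]
) -> Iterator[Tuple[slice, ...]]:
    """Yield one tuple of slices per tile, clamping tiles at the tensor edge."""
    if len(shape) != len(tile_size):
        raise ValueError("tile_size must have one entry per dimension")
    # Starting offsets along each dimension, stepping by that dimension's tile.
    starts_per_dim = [range(0, dim, tile) for dim, tile in zip(shape, tile_size)]
    for starts in product(*starts_per_dim):
        yield tuple(
            slice(start, min(start + tile, dim))
            for start, tile, dim in zip(starts, tile_size, shape)
        )


knob = ToyKnob(mem_space="sbuf", tile_size=(128, 256))
tiles = list(tile_slices((256, 600), knob.tile_size))
print(len(tiles))   # number of tiles covering the 256 x 600 tensor
print(tiles[-1])    # last tile is narrower because 600 is not a multiple of 256
```

A 256 x 600 tensor with 128 x 256 tiles yields two row blocks and three column blocks, the last column block being partial. How the real pipeline treats partial tiles is not stated in this description.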
Architecture notes
------------------
NKIPyKernelGen depends on the NISA dialect defined in private-nki-staging
(the `nki` wheel). NKIPyKernelGen's `nkipy-opt` binary performs the
tensor-level and bufferization phases; lowering to BIR then runs through
the upstream `nki-opt-pipeline`. This import does not bring in the NISA
dialect sources — only NKIPyKernelGen's own passes and frontend.
Ignore rules
------------
Added a `!mlir/lib/` override in `kernelgen/.gitignore` so the parent
nkipy repo's `lib/` rule (intended for Python venv `lib/` dirs) does not
silently exclude the MLIR C++ pass sources under `kernelgen/mlir/lib/`.
Source
------
Imported from NKIPyKernelGen `open_source` branch @ commit 973c1be
("fix: correct mem_space enum values in builder.annotate()"). Internal
git history is not preserved — this is a single squash import for the
open-source release.
aws-zhehongb approved these changes on May 1, 2026
Issue #, if available:
Description of changes:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.