refactor(matmul): reorganize kernel directory structure (Issue #122) #132

Merged
m96-chan merged 1 commit into main from feature/issue-122-kernel-directory-refactor
Dec 30, 2025

@m96-chan
Owner

Summary

Implements Option 2 from Issue #122: explicit naming convention w{weight}a{act}_{out}/ for clearer kernel identification.

Directory Mapping

GEMM:

| Old Path | New Path | Description |
| --- | --- | --- |
| gemm/fp8/bf16/ | gemm/w8a16_bf16/ | FP8 weight, BF16 activation |
| gemm/fp8/fp8/ | gemm/w8a8_bf16/ | Pure FP8 |
| gemm/fp8/f32/ | gemm/w8a8_f32/ | FP8 -> F32 output |
| gemm/nvf4/bf16/ | gemm/w4a16_bf16/ | NVF4 weight |
| gemm/bf16/bf16/ | gemm/bf16_bf16/ | No quantization |
| gemm/f32/f32/ | gemm/f32_f32/ | Baseline |
| gemm/int8/int8/ | gemm/int8_int8/ | Int8 |
| gemm/int4/int4/ | gemm/int4_int4/ | Int4 |

GEMV:

| Old Path | New Path | Description |
| --- | --- | --- |
| gemv/bf16/bf16/nvf4* | gemv/w4a16_bf16/ | NVF4 weight GEMV |
| gemv/bf16/bf16/fp8* | gemv/w8a16_bf16/ | FP8 weight GEMV |
| gemv/fp8/fp8/ | gemv/w8a8_bf16/ | Pure FP8 GEMV |
| gemv/nvf4/nvf4/ | gemv/w4a4_bf16/ | NVF4 x NVF4 GEMV |
| gemv/bf16/bf16/ | gemv/bf16_bf16/ | No quantization |
| gemv/int4/int4/ | gemv/int4_int4/ | Int4 |
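
For readers who want to translate old paths in scripts or notes, the GEMM and GEMV mappings above can be sketched as an ordered rule table. This is only an illustration, not the tooling used in this PR: `RULES` and `new_path` are hypothetical names, paths are assumed to be relative to the matmul kernel root, and the file names in the usage note are made up. The one subtlety it shows is that the old `gemv/bf16/bf16/` directory splits three ways by file name prefix, so the `nvf4*` and `fp8*` rules must be checked before the catch-all.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# (old directory, file name pattern, new directory), copied from the tables above.
# Order matters: specific patterns must precede the "*" catch-all for the same directory.
RULES = [
    ("gemm/fp8/bf16", "*", "gemm/w8a16_bf16"),
    ("gemm/fp8/fp8", "*", "gemm/w8a8_bf16"),
    ("gemm/fp8/f32", "*", "gemm/w8a8_f32"),
    ("gemm/nvf4/bf16", "*", "gemm/w4a16_bf16"),
    ("gemm/bf16/bf16", "*", "gemm/bf16_bf16"),
    ("gemm/f32/f32", "*", "gemm/f32_f32"),
    ("gemm/int8/int8", "*", "gemm/int8_int8"),
    ("gemm/int4/int4", "*", "gemm/int4_int4"),
    ("gemv/bf16/bf16", "nvf4*", "gemv/w4a16_bf16"),
    ("gemv/bf16/bf16", "fp8*", "gemv/w8a16_bf16"),
    ("gemv/bf16/bf16", "*", "gemv/bf16_bf16"),
    ("gemv/fp8/fp8", "*", "gemv/w8a8_bf16"),
    ("gemv/nvf4/nvf4", "*", "gemv/w4a4_bf16"),
    ("gemv/int4/int4", "*", "gemv/int4_int4"),
]


def new_path(old: str) -> str:
    """Return the post-refactor path for a file, or the input unchanged if no rule applies."""
    p = PurePosixPath(old)
    parent = p.parent.as_posix()
    for old_dir, pattern, new_dir in RULES:
        # Only direct children of an old directory are handled in this sketch.
        if parent == old_dir and fnmatchcase(p.name, pattern):
            return f"{new_dir}/{p.name}"
    return old
```

With hypothetical file names, `new_path("gemv/bf16/bf16/nvf4_example.cu")` yields `gemv/w4a16_bf16/nvf4_example.cu`, while `new_path("gemv/bf16/bf16/example.cu")` falls through to `gemv/bf16_bf16/example.cu`.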

Changes

  • Reorganized 46 kernel files using git mv (history preserved)
  • Updated CMakeLists.txt with new paths
  • Updated include paths in source files
  • Updated CLAUDE.md with new naming convention documentation
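
The include-path update listed above could be approached along the following lines. This is a rough sketch under assumptions, not the procedure used in this PR: `DIR_MAP` holds only a subset of the renamed directories, `rewrite_includes` is a hypothetical helper, and the include form shown in the usage note is invented for illustration. It rewrites only quoted `#include "..."` directives and leaves angle-bracket includes untouched.

```python
import re

# Assumed subset of old -> new directory prefixes, taken from the directory mapping.
# Directories that were split by file name (gemv/bf16/bf16/) need per-file handling
# and are deliberately left out of this sketch.
DIR_MAP = {
    "gemm/fp8/bf16/": "gemm/w8a16_bf16/",
    "gemm/fp8/fp8/": "gemm/w8a8_bf16/",
    "gemm/fp8/f32/": "gemm/w8a8_f32/",
    "gemm/nvf4/bf16/": "gemm/w4a16_bf16/",
    "gemv/fp8/fp8/": "gemv/w8a8_bf16/",
    "gemv/nvf4/nvf4/": "gemv/w4a4_bf16/",
}

# Matches the opening of a quoted include, the path, and the closing quote.
INCLUDE_RE = re.compile(r'^(\s*#\s*include\s*")([^"]+)(")', re.MULTILINE)


def rewrite_includes(source: str) -> str:
    """Replace old kernel directory segments inside quoted #include paths."""

    def fix(match: re.Match) -> str:
        path = match.group(2)
        for old, new in DIR_MAP.items():
            if old in path:
                # Replace the first occurrence only, then stop checking other rules.
                path = path.replace(old, new, 1)
                break
        return match.group(1) + path + match.group(3)

    return INCLUDE_RE.sub(fix, source)
```

For example, a line such as `#include "../../gemm/fp8/fp8/kernel.cuh"` (a made-up path) would become `#include "../../gemm/w8a8_bf16/kernel.cuh"`. Because the match is a plain substring test, a real script would want stricter anchoring and a dry-run mode before touching files.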

Naming Convention

The new convention w{weight}a{act}_{out} makes kernel purpose immediately clear:

  • w8a16_bf16: FP8 weights (8-bit), BF16 activations (16-bit), BF16 output
  • w4a16_bf16: NVF4 weights (4-bit), BF16 activations, BF16 output
  • w8a8_bf16: FP8 weights, FP8 activations, BF16 output
  • bf16_bf16: BF16 weights, BF16 activations (no quantization)
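
As a small illustration of how the convention can be read mechanically, the sketch below parses a directory name of the form `w{weight}a{act}_{out}`. The names `KernelDir` and `parse_kernel_dir` are hypothetical and not part of the repository. Note that the name encodes only bit widths and the output dtype: per the list above, `w8` stands for FP8 weights and `w4` for NVF4 weights, but that association comes from this PR's description, not from the name itself. Unquantized directories such as `bf16_bf16` do not carry the `w`/`a` prefix, so the parser returns `None` for them.

```python
import re
from typing import NamedTuple, Optional


class KernelDir(NamedTuple):
    weight_bits: int   # e.g. 8 for FP8 weights, 4 for NVF4 weights
    act_bits: int      # e.g. 16 for BF16 activations, 8 for FP8 activations
    out_dtype: str     # e.g. "bf16" or "f32"


# Matches names such as "w8a16_bf16"; anything else is treated as unquantized/other.
_QUANT_RE = re.compile(r"^w(\d+)a(\d+)_([a-z0-9]+)$")


def parse_kernel_dir(name: str) -> Optional[KernelDir]:
    """Parse a quantized kernel directory name, or return None if it does not match."""
    m = _QUANT_RE.match(name)
    if m is None:
        return None
    return KernelDir(int(m.group(1)), int(m.group(2)), m.group(3))
```

Under these assumptions, `parse_kernel_dir("w8a16_bf16")` gives `KernelDir(weight_bits=8, act_bits=16, out_dtype="bf16")`, and `parse_kernel_dir("bf16_bf16")` gives `None`.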

Test Plan

  • Build passes (SM 120a, CUDA 13.1)
  • Pre-commit checks pass (ruff, mypy)
  • CI tests pass

Closes #122

🤖 Generated with Claude Code

Implements Option 2 from Issue #122: explicit naming convention
w{weight}a{act}_{out}/ for clearer kernel identification.

Directory mapping:
- gemm/fp8/bf16/ -> gemm/w8a16_bf16/ (FP8 weight, BF16 activation)
- gemm/fp8/fp8/ -> gemm/w8a8_bf16/ (pure FP8)
- gemm/nvf4/bf16/ -> gemm/w4a16_bf16/ (NVF4 weight)
- gemv/bf16/bf16/nvf4* -> gemv/w4a16_bf16/
- gemv/bf16/bf16/fp8* -> gemv/w8a16_bf16/
- gemv/fp8/fp8/ -> gemv/w8a8_bf16/
- gemv/nvf4/nvf4/ -> gemv/w4a4_bf16/

Updated:
- CMakeLists.txt with new paths
- Include paths in source files
- CLAUDE.md with new naming convention docs

Build verified: SM 120a CUDA 13.1 SUCCESS

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
m96-chan merged commit 43e89c2 into main on Dec 30, 2025
13 checks passed

Successfully merging this pull request may close these issues.

refactor: Reorganize kernel directory structure for clarity