The CUTLASS notes series begins with a minimal GEMM implementation and gradually expands to cover CuTe, various CUTLASS components, and features of newer architectures such as Hopper and Blackwell, ultimately building up to a high-performance fused GEMM operator.

    git clone https://github.com/ArthurinRUC/cutlass-notes.git
    # clone cutlass
    cd cutlass-notes
    git submodule update --init --recursive

All example code in this GitHub repository can be compiled and run by simply executing the Python script. For example:

    cd 01-minimal-gemm
    python minimal_gemm.py

| Notes | Summary | Links |
|---|---|---|
| 00-Intro | Brief introduction to CUTLASS | intro |
| 01-minimal-gemm | minimal-gemm | |
| 02-mixed-precision-gemm | mixed-precision-gemm | |
| 03-tiled-mma | tiled-mma | |
| 04-tiled-copy | Coming soon | Stay tuned |

This project is licensed under the MIT License - see the LICENSE file for details.