A header-only library containing state-of-the-art flash attention kernels, mega kernels, and full model serving. Built from scratch, documenting every step along the way.
So far, all code has been tested only on systems running CUDA >= 12.8 and Ubuntu 22.04.
```
git clone https://github.com/govindansriram/CobraML2.git
cd CobraML2
sudo chmod +x ./runner.sh
```

You can now build the executables by running:

```
./runner.sh
```

And run them using:

```
./runner.sh -r exe_name...
```
- MHA
- Iter 1: 287.925 GFLOPs
- Flash Attention 1
- Iter 1: 6776.64 GFLOPs
- Flash Attention 2
- Flash Attention 3
- Matmul
...
All files must be formatted to follow the style specified by `.clang-format`.
Ensure clang-format is installed by running `clang-format --version`.
Formatting can be applied to all files by running:
```
./runner.sh -f
```

Or to a single file:

```
./runner.
```