🦆 QuACK: A Quirky Assortment of CuTe Kernels 🦆

Kernels are written in the CuTe-DSL.

Installation

# For CUDA 12.9:
pip install quack-kernels

# For CUDA 13.1:
pip install 'quack-kernels[cu13]' --extra-index-url https://download.pytorch.org/whl/cu130

# Or using uv (faster):
uv pip install 'quack-kernels[cu13]'

# Optional: install NVIDIA matmul heuristics for better untuned GEMM configs
pip install 'quack-kernels[heuristics]'

Requirements

H100 or B200/B300 GPU
CUDA toolkit 12.9+
Python 3.12
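
Before installing, it can help to confirm that the machine roughly matches the list above. The sketch below is not part of QuACK: `check_environment` is a hypothetical helper that uses only the Python standard library. It treats Python 3.12 as a minimum version (an assumption, since the requirement may mean exactly 3.12), and it only checks whether `nvidia-smi` is on the `PATH`, not which GPU model or CUDA toolkit version is present.

```python
import importlib.util
import shutil
import sys


def check_environment(min_python=(3, 12)):
    """Return simple yes/no checks. Hypothetical helper, not a QuACK API."""
    return {
        # Assumption: "Python 3.12" is read as "3.12 or newer".
        "python_ok": sys.version_info[:2] >= min_python,
        # Only confirms the NVIDIA driver tools are on PATH, not the GPU model.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        # True once `pip install quack-kernels` has put `quack` on the import path.
        "quack_importable": importlib.util.find_spec("quack") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

A `no` on `nvidia_smi_found` usually points at a missing driver rather than a problem with the package itself.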

Kernels 🐥

🦆 RMSNorm forward + backward
🦆 Softmax forward + backward
🦆 Cross entropy forward + backward
🦆 LayerNorm forward
🦆 Hopper GEMM + epilogue
🦆 Blackwell GEMM + epilogue
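
For readers who want a reminder of what the first three kernels compute, here is a minimal pure-Python reference for the forward passes on a single row. This is only the textbook math, not QuACK's implementation or API: the names `rmsnorm_ref`, `softmax_ref`, and `cross_entropy_ref` are hypothetical, and the default `eps=1e-6` is an assumed value. A reference like this is mainly useful for checking the output of a fast kernel on small inputs.

```python
import math


def rmsnorm_ref(x, weight, eps=1e-6):
    """RMSNorm on one row: x / sqrt(mean(x^2) + eps) * weight. eps is an assumed default."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]


def softmax_ref(x):
    """Softmax on one row, subtracting the row max first for numerical stability."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_ref(logits, target):
    """Cross entropy for one row: logsumexp(logits) - logits[target]."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return lse - logits[target]


if __name__ == "__main__":
    row = [1.0, 2.0, 3.0]
    print(rmsnorm_ref(row, [1.0, 1.0, 1.0]))
    print(softmax_ref(row))
    print(cross_entropy_ref(row, target=2))
```

Each function makes one or two passes over a row and does very little arithmetic per element, which is why these operations are memory-bound on a GPU.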

Usage

from quack import rmsnorm, softmax, cross_entropy

Documentation

[2025-07-10] We have a comprehensive blogpost on how to get memory-bound kernels to speed-of-light, right in the comfort of Python thanks to the CuTe-DSL.

Performance

See our blogpost for the details.

Development

To set up the development environment:

pip install -e '.[dev]'
pre-commit install

# For CUDA 13.1:
pip install 'quack-kernels[dev,cu13]' --extra-index-url https://download.pytorch.org/whl/cu130

# Or using uv:
uv pip install 'quack-kernels[dev,cu13]'
