Skunchala (SK) / Starred

Stars

Transformer-related optimization, including BERT and GPT

C++ 1 Updated Jul 8, 2022

Forked from NVIDIA/TensorRT

TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.

C++ 1 Updated Jun 24, 2022

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 1 Updated Jul 13, 2022

Forked from apache/tvm

Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators

Python 1 Updated Jul 13, 2022

Forked from pytorch/pytorch

Tensors and dynamic neural networks in Python with strong GPU acceleration

C++ 1 Updated Jul 13, 2022

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 41,870 4,761 Updated Mar 22, 2026