Stars
Transformer related optimization, including BERT, GPT
Skunchala / TensorRT
Forked from NVIDIA/TensorRT
TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.
Skunchala / onnxruntime
Forked from microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Skunchala / tvm
Forked from apache/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Skunchala / pytorch
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.