jasonlizhengjian

Pinned

  1. vllm Public

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python

  2. flashinfer Public

    Forked from flashinfer-ai/flashinfer

    FlashInfer: Kernel Library for LLM Serving

    Cuda

  3. dynamo Public

    Forked from ai-dynamo/dynamo

    A Datacenter Scale Distributed Inference Serving Framework

    Rust

  4. TensorRT-LLM Public

    Forked from NVIDIA/TensorRT-LLM

    TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

    C++

  5. SafeAILab/zkDL Public

    zkDL, an open source toolkit for zero-knowledge proofs of deep learning powered by CUDA

    Cuda · 50 stars · 1 fork