  • Hong Kong (UTC +08:00)

Highlights

  • Pro

ambitiousCC/README.md

Hi there 👋

Pinned Loading

  1. fastllm Public

    Forked from ztxz16/fastllm

    fastllm is a high-performance LLM inference library with no backend dependencies. It supports tensor-parallel inference for dense models and mixed-mode inference for MoE models; any GPU with 10 GB or more of VRAM can run the full DeepSeek model. On a dual-socket 9004/9005 server with a single GPU, the original full-precision DeepSeek model runs at 20 tps with a single concurrent request; the INT4-quantized model reaches 30 tps at single concurrency and 60+ tps under multiple concurrent requests.

    C++

  2. kvcache-ai/ktransformers Public

    A flexible framework for experimenting with heterogeneous LLM inference and fine-tuning optimizations

    Python 16.8k 1.2k

  3. chitu Public

    Forked from thu-pacman/chitu

    High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

    Python

  4. sglang Public

    Forked from sgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Python

  5. vllm Public

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python

  6. vllm-ascend Public

    Forked from vllm-project/vllm-ascend

    Community-maintained hardware plugin for running vLLM on Ascend

    Python