tensorflow / model-optimization Star 1.6k A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning. machine-learning sparsity compression deep-learning tensorflow optimization keras ml pruning quantization model-compression quantized-training quantized-neural-networks quantized-networks Updated Dec 1, 2025 Python
google / qkeras Star 577 QKeras: a quantization deep learning library for TensorFlow Keras. machine-learning fpga deep-learning tensorflow accelerator keras quantization hardware-acceleration fpga-accelerator quantized-neural-networks asic-design quantized-networks Updated Jun 13, 2025 Python
bytedance / ABQ-LLM Star 238 An acceleration library that supports arbitrary bit-width combinatorial quantization operations. research cuda mlsys quantized-networks llm-inference Updated Sep 30, 2024 C++
HuangCongQing / model-compression-optimization Star 18 Model compression and optimization for deployment for PyTorch, including knowledge distillation, quantization, and pruning. sparsity pytorch pruning quantization nas knowledge-distillation model-compression sparsity-optimization quantized-networks Updated Sep 10, 2024 Python