Rethinking Residual Errors in Compensation-based LLM Quantization [ICLR'26]

Motivation

We identify a residual error term that is missing in GPTAQ, which we name the 'Compensation-aware Error'. It comes from the discrepancy between the compensated weights and the original weights. By accounting for this term, our method strictly aligns with the original full-precision output at each column.
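As a toy illustration of where such a term can come from, the NumPy sketch below splits the error measured against the original weights into the error measured against the compensated weights plus the gap between the two. It is not the ResComp or GPTAQ implementation (see fake_quant/gptaq_utils_r.py for that). The quantizer `round_to_grid`, the random data, and all variable names are assumptions made for this example only, and it assumes that a compensation-based method measures the per-column quantization error against the already compensated weights.

```python
import numpy as np

def round_to_grid(w, n_bits=3):
    """Toy symmetric round-to-nearest quantizer (hypothetical stand-in)."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

rng = np.random.default_rng(0)

w_orig = rng.normal(size=8)          # original full-precision weights of one column
delta = 0.1 * rng.normal(size=8)     # stand-in for updates accumulated from earlier columns
w_comp = w_orig + delta              # compensated weights that actually get quantized
w_q = round_to_grid(w_comp)          # quantized weights

err_vs_comp = w_comp - w_q           # error measured against the compensated weights
comp_gap = w_orig - w_comp           # discrepancy between original and compensated weights
err_vs_orig = w_orig - w_q           # error measured against the original weights

# The error against the original weights is the sum of the two parts above.
assert np.allclose(err_vs_orig, err_vs_comp + comp_gap)
```

If only `err_vs_comp` is tracked, `comp_gap` is left out of the objective; in this toy setting that gap plays the role the paragraph above attributes to the discrepancy between compensated and original weights.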

Scripts

Our codebase relies heavily on GPTAQ, with simple modifications. Please see fake_quant/gptaq_utils_r.py for details.

Take weight-only quantization as an example:

    cd fake_quant
    bash weight_group_3bit.sh   ### per-group quantization
    bash weight_group_2bit.sh   ### QuaRot + per-group quantization

Todo List

I am continuing to work on improving the stability of ResComp. Because ResComp uses a more precise optimization objective and only 128 samples, it may be more sensitive to the quality of the calibration data. Further discussion is very welcome; feel free to contact me (list@zju.edu.cn).

Acknowledgements

Our codebase is built heavily on previous works, and we would like to acknowledge and thank the authors for their awesome contributions:

GPTAQ: Efficient finetuning-free quantization for asymmetric calibration
GPTQ: Accurate post-training quantization for generative pre-trained transformers
QuaRot: Outlier-free 4-bit inference in rotated LLMs
SpinQuant: LLM quantization with learned rotations

Citation

If you find our work useful in your research, please kindly cite this paper:

@inproceedings{lirethinking,
  title={Rethinking Residual Errors in Compensation-based LLM Quantization},
  author={Li, Shuaiting and Deng, Juncan and Xu, Kedong and Deng, Rongtao and Gu, Hong and Jiang, Minghan and Shen, Haibin and Huang, Kejie},
  booktitle={The Fourteenth International Conference on Learning Representations}
}

If you are also interested in vector quantization, check out our previous papers: SSVQ [ICCV'25], MVQ [ASPLOS'25], VQ4DiT [AAAI'25], ViM-VQ [ICCV'25]. I am also seeking collaboration opportunities in CUDA kernel optimization to better support SSVQ.
