jan-karsten-kuhnke (Jan Karsten Kuhnke) · GitHub

Pinned

DATASET_CREATION-Evolved_Self_Intruct-evol-dataset Public

Forked from theblackcat102/evol-dataset

Evol-augment any dataset online

Python
QUANTIZATION-AWQ-AutoAWQ Public

Forked from casper-hansen/AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

C++ 1
QUANTIZATION-GGUF-llama.cpp Public

Forked from ggml-org/llama.cpp

Port of Facebook's LLaMA model in C/C++

C 1
QUANTIZATION-GPTQ-exllama Public

Forked from turboderp/exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.

Python
QUANTIZATION-GPTQ-AutoGPTQ Public

Forked from AutoGPTQ/AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python
QUANTIZATION-LORA-QLORA-bitsandbytes Public

Forked from bitsandbytes-foundation/bitsandbytes

8-bit CUDA functions for PyTorch

Python