libquant

libquant is a library for low-bit quantization of machine learning models, making them more efficient and faster without significant loss in accuracy. It supports several quantization methods, types, and granularities, and provides implementations in native PyTorch, Triton, and CUDA.

Features

Quantization Methods: RTN (round-to-nearest), GPTQ, AWQ
Quantization Types: weight, activation
Granularity: per-tensor, per-token, per-channel, per-group
Implementations: PyTorch native, Triton, CUDA
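
To make the method and granularity terms above concrete, here is a minimal pure-Python sketch of symmetric round-to-nearest (RTN) quantization applied per-tensor and per-group. It is an illustration only, not libquant's implementation: the helper names (`rtn_quantize`, `quantize_per_group`, `mean_abs_error`) and the sample weights are assumptions made for this example.

```python
def rtn_quantize(values, nbits=8):
    """Symmetric round-to-nearest: one shared scale maps floats to signed ints."""
    qmax = 2 ** (nbits - 1) - 1                 # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def quantize_per_group(values, group_size, nbits=8):
    """Per-group granularity: every slice of `group_size` values gets its own scale."""
    return [
        rtn_quantize(values[start:start + group_size], nbits)
        for start in range(0, len(values), group_size)
    ]


def mean_abs_error(values, groups):
    """Dequantize (q * scale) and compare against the original values."""
    restored = [x * scale for q, scale in groups for x in q]
    return sum(abs(a - b) for a, b in zip(values, restored)) / len(values)


# Hypothetical weights: four small values followed by four large ones.
weights = [0.02, -0.01, 0.03, -0.04, 1.5, -2.0, 0.7, 1.1]

# Per-tensor is the special case where one group spans the whole tensor.
per_tensor = quantize_per_group(weights, group_size=len(weights), nbits=4)
per_group = quantize_per_group(weights, group_size=4, nbits=4)

err_tensor = mean_abs_error(weights, per_tensor)
err_group = mean_abs_error(weights, per_group)
print(f"per-tensor error: {err_tensor:.4f}, per-group error: {err_group:.4f}")
```

With a single per-tensor scale, the large weights set the step size and the small weights all round to zero. Giving each group its own scale keeps the small values representable, which is why finer granularity usually reduces quantization error at the cost of storing more scales.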

Installation

To install libquant from source, run:

git clone https://github.com/ArthurinRUC/libquant.git
cd libquant
pip install -e .

Usage

To start using libquant, add two lines of code (marked with +) to quantize your model.

from transformers import AutoModelForCausalLM
+ from libquant import quant

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
+ quant(model, nbits=8)

You can use QuantArgs to quantize your model with a custom quantization configuration.

from transformers import AutoModelForCausalLM
+ from libquant import quant, QuantArgs

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

+ quant_args = QuantArgs(method="rtn", nbits=8, group_size=128, per_channel=True)

+ quant(model, quant_args)
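
As a rough guide to choosing nbits and group_size, the sketch below estimates the storage cost per weight when each group carries its own scale. It assumes one 16-bit scale per group and no stored zero point; these are assumptions for illustration, not documented libquant behavior, and `effective_bits_per_weight` is a hypothetical helper.

```python
def effective_bits_per_weight(nbits, group_size, scale_bits=16, zero_point_bits=0):
    """Average storage per weight: the quantized value plus its share of group metadata.

    Assumes (hypothetically) one scale, and optionally one zero point, per group.
    """
    return nbits + (scale_bits + zero_point_bits) / group_size


# Same settings as the QuantArgs example above: 8 bits, groups of 128 weights.
bits_8_g128 = effective_bits_per_weight(nbits=8, group_size=128)

# Smaller groups track the weights more closely but cost more metadata per weight.
bits_4_g128 = effective_bits_per_weight(nbits=4, group_size=128)
bits_4_g32 = effective_bits_per_weight(nbits=4, group_size=32)

print(bits_8_g128, bits_4_g128, bits_4_g32)
```

Under these assumptions, a group size of 128 adds 0.125 bits per weight, while a group size of 32 adds 0.5 bits per weight.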

License

This project is licensed under the MIT License - see the LICENSE file for details.
