Skip to main content

k-bit optimizers and matrix multiplication routines.

Project description

bitsandbytes

Downloads Downloads Downloads

The bitsandbytes library is a lightweight Python wrapper around CUDA custom functions, in particular 8-bit optimizers, matrix multiplication (LLM.int8()), and 8 & 4-bit quantization functions.

The library includes quantization primitives for 8-bit & 4-bit operations, through bitsandbytes.nn.Linear8bitLt and bitsandbytes.nn.Linear4bit and 8-bit optimizers through bitsandbytes.optim module.

There are ongoing efforts to support further hardware backends, i.e. Intel CPU + GPU, AMD GPU, Apple Silicon, hopefully NPU.

Please head to the official documentation page:

https://huggingface.co/docs/bitsandbytes/main

License

bitsandbytes is MIT licensed.

We thank Fabio Cannizzo for his work on FastBinarySearch which we use for CPU quantization.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page