This repository contains code to run faster sentence-transformers using tools like quantization, ONNX and pruning.
Fast Sentence Transformers
This repository contains code to run faster sentence-transformers using tools like quantization and ONNX. Just run your model much faster, while using a lot less memory. There is not much to it!
Install
pip install fast-sentence-transformers
Or, for GPU support:
pip install fast-sentence-transformers[gpu]
Quickstart
from fast_sentence_transformers import FastSentenceTransformer as SentenceTransformer
# use any sentence-transformer
encoder = SentenceTransformer("all-MiniLM-L6-v2", device="cpu", quantize=True)
encoder.encode("Hello hello, hey, hello hello")
encoder.encode(["Life is too short to eat bad food!"] * 2)
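A common next step is to compare two embeddings. The sketch below is a minimal, dependency-free illustration of cosine similarity; it assumes that `encoder.encode` returns one numeric vector per input sentence (as sentence-transformers does), which this README does not state explicitly. The helper name `cosine_similarity` and the stand-in vectors are hypothetical and not part of this package.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical usage with the encoder from the quickstart (not executed here,
# and assuming encode() returns one vector per sentence):
#   emb = encoder.encode(["A cat sits on the mat.", "A kitten rests on a rug."])
#   score = cosine_similarity(emb[0], emb[1])

# Stand-in vectors so the sketch runs without downloading a model.
print(cosine_similarity([1.0, 0.0, 1.0], [1.0, 0.0, 1.0]))  # identical direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))            # orthogonal vectors
```

Scores close to 1 indicate sentences whose embeddings point in nearly the same direction; scores near 0 indicate unrelated embeddings.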
Benchmark
Indicative benchmark for CPU usage with the smallest and largest models on sentence-transformers. Note that ONNX doesn't have GPU support for quantization yet.
model | Type | default | ONNX | ONNX+quantized | ONNX+GPU |
---|---|---|---|---|---|
paraphrase-albert-small-v2 | memory | 1x | 1x | 1x | 1x |
paraphrase-albert-small-v2 | speed | 1x | 2x | 5x | 20x |
paraphrase-multilingual-mpnet-base-v2 | memory | 1x | 1x | 4x | 4x |
paraphrase-multilingual-mpnet-base-v2 | speed | 1x | 2x | 5x | 20x |
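Since these figures are only indicative, it can be worth measuring on your own hardware. The following is a minimal timing sketch, not the script used to produce the table: the helpers `median_latency` and `speedup` are hypothetical names, and the stand-in encoders exist only so the example runs without a model. In practice you would pass the `encode` method of a plain sentence-transformers model and of a `FastSentenceTransformer`.

```python
import statistics
import time
from typing import Callable, List, Sequence


def median_latency(encode: Callable[[Sequence[str]], object],
                   sentences: Sequence[str],
                   repeats: int = 5,
                   warmup: int = 1) -> float:
    """Median wall-clock seconds for one encode(sentences) call."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    for _ in range(warmup):
        encode(sentences)  # warm-up calls are not timed
    timings: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        encode(sentences)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


# Stand-in encoders (assumptions for illustration only); replace them with
# real encode methods to benchmark actual models.
def slow_encode(batch: Sequence[str]) -> List[int]:
    time.sleep(0.002)
    return [len(s) for s in batch]


def fast_encode(batch: Sequence[str]) -> List[int]:
    return [len(s) for s in batch]


sentences = ["Life is too short to eat bad food!"] * 8
baseline = median_latency(slow_encode, sentences)
candidate = median_latency(fast_encode, sentences)
print(f"baseline: {baseline:.6f}s, candidate: {candidate:.6f}s")
```

Using the median of several runs after a warm-up call reduces the effect of one-off costs such as model loading or cache misses.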
Shout-Out
This package heavily leans on sentence-transformers and txtai.