This repository contains code to run faster sentence-transformers using tools like quantization, ONNX and pruning.
Project description
Fast Sentence Transformers
This repository contains code to run faster sentence-transformers using tools like quantization and ONNX. Just run your model much faster, while using much less memory. There is not much to it!
Quickstart
```python
from fast_sentence_transformers import FastSentenceTransformer as SentenceTransformer

# use any sentence-transformer
encoder = SentenceTransformer("all-MiniLM-L6-v2", device=-1, quantize=True)

# encode a single sentence or a batch
encoder.encode("Hello hello, hey, hello hello")
encoder.encode(["Life is too short to eat bad food!"] * 2)
```
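`encode` returns embedding vectors, which are typically compared with cosine similarity downstream. A minimal sketch using NumPy, with placeholder vectors standing in for real model output (the `cosine_similarity` helper is our own illustration, not part of this package):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder vectors standing in for encoder.encode(...) output
emb_a = np.array([0.1, 0.3, 0.5])
emb_b = np.array([0.1, 0.3, 0.5])
print(cosine_similarity(emb_a, emb_b))  # identical vectors, so similarity ≈ 1.0
```

In practice you would pass the arrays returned by `encoder.encode` directly into such a comparison.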
Benchmark
Indicative benchmark for CPU usage with the smallest and largest models on sentence-transformers.
| Model | Type | Default | ONNX | ONNX+quantized |
|---|---|---|---|---|
| paraphrase-albert-small-v2 | memory | 1x | 1x | 1x |
| | speed | 1x | 2x | 5x |
| paraphrase-multilingual-mpnet-base-v2 | memory | 1x | 1x | 4x |
| | speed | 1x | 2x | 5x |
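To reproduce numbers like these on your own hardware, you can time repeated encode calls with the standard library. A minimal sketch, where `dummy_encode` stands in for the real `encoder.encode` (both the harness and the dummy function are our own illustration, not part of this package):

```python
import time

def benchmark(fn, *args, repeats: int = 5) -> float:
    """Return the best wall-clock time over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Stand-in for encoder.encode; swap in the real model call to benchmark it.
def dummy_encode(sentences):
    return [hash(s) for s in sentences]

baseline = benchmark(dummy_encode, ["Life is too short to eat bad food!"] * 100)
print(f"best of 5 runs: {baseline:.6f}s")
```

Taking the best of several runs reduces noise from other processes; comparing the default model against the ONNX and quantized variants with the same harness gives the relative speedups shown above.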
Shout-Out
This package heavily leans on sentence-transformers and txtai.
Hashes for fast-sentence-transformers-0.2.1.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | c5641fce84989513b92e3a4e1fccf95931fe49c90f88b244c4e2c8c44067f80f |
| MD5 | eeffdc7a52db55737e10e894538a3a0d |
| BLAKE2b-256 | 9b6c03758494f9a651631d0831c1df23d0a06f0df59c9ff6739c62f6a00b1284 |
Hashes for fast_sentence_transformers-0.2.1-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 3568f4e3c16f5e4b5ddc496bf222d647eeea87b80c7f02c24236bf9b04380019 |
| MD5 | fb19a7dfa47bdcaeb9e6fcb362584d35 |
| BLAKE2b-256 | f83251c51dd1cd8044af9091eba5f8493f24a10ff2f53023ef896d11b509e8bb |