Repository of Intel® Neural Compressor

Project description

Introduction to Intel® Neural Compressor

Intel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool) is an open-source Python library that runs on Intel CPUs and GPUs. It provides unified interfaces across multiple deep learning frameworks for popular network compression techniques such as quantization, pruning, and knowledge distillation. The tool supports automatic accuracy-driven tuning strategies that help users quickly find the best quantized model. It also implements several weight pruning algorithms that generate a pruned model meeting a predefined sparsity goal, and it supports knowledge distillation from a teacher model to a student model.

Note

GPU support is under development.

Visit the Intel® Neural Compressor online document website at: https://intel.github.io/neural-compressor.

Infrastructure

Intel® Neural Compressor features an architecture and workflow designed to increase performance and speed up deployment across infrastructures.

Architecture

(Figure: Intel® Neural Compressor architecture diagram)

Workflow

(Figure: Intel® Neural Compressor workflow diagram)

Supported Frameworks

Supported deep learning frameworks are:

TensorFlow*, including Intel-optimized TensorFlow 1.15.0 UP3, 2.7.0, 2.8.0 and official TensorFlow 2.6.2, 2.7.0, 2.8.0

Note: Intel Optimized TensorFlow 2.5.0 requires setting the environment variable TF_ENABLE_MKL_NATIVE_FORMAT=0 before running Neural Compressor quantization or deploying the quantized model.

Note: Starting with official TensorFlow 2.6.0, oneDNN support has been upstreamed. Download the official TensorFlow 2.6.0 binary for the CPU device and set the environment variable TF_ENABLE_ONEDNN_OPTS=1 before running the quantization process or deploying the quantized model.

PyTorch*, including 1.9.0+cpu, 1.10.0+cpu, 1.11.0+cpu
Apache* MXNet, including 1.6.0, 1.7.0, 1.8.0
ONNX* Runtime, including 1.8.0, 1.9.0, 1.10.0
Execution Engine, a reference bare-metal solution (./engine) for domain-specific NLP models
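
The two TensorFlow notes above can be applied from Python, provided the variables are set before TensorFlow is imported. The following is a minimal sketch: the helper name `tensorflow_env_for` and its parameters are illustrative and not part of Neural Compressor, and it assumes the second note applies to official TensorFlow 2.6.0 and later.

```python
import os


def tensorflow_env_for(version: str, intel_optimized: bool) -> dict:
    """Return the environment variables suggested by the notes above.

    Hypothetical helper: the name and signature are illustrative only.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    env = {}
    if intel_optimized and (major, minor) == (2, 5):
        # Intel Optimized TensorFlow 2.5.0: turn off the MKL native format.
        env["TF_ENABLE_MKL_NATIVE_FORMAT"] = "0"
    if not intel_optimized and (major, minor) >= (2, 6):
        # Official TensorFlow 2.6.0 and later: switch on the upstreamed oneDNN path.
        env["TF_ENABLE_ONEDNN_OPTS"] = "1"
    return env


# Apply the settings before `import tensorflow` runs anywhere in the process.
os.environ.update(tensorflow_env_for("2.8.0", intel_optimized=False))
```

Setting the same variables in the shell before launching Python has the same effect; the helper only keeps the version logic in one place.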

Installation

Select the installation based on your operating system.

Linux Installation

You can install Neural Compressor using one of three options: install just the library from binary, install it from source, or get the Intel-optimized framework together with the library by installing the Intel® oneAPI AI Analytics Toolkit.

Prerequisites

The following prerequisites and requirements must be satisfied for a successful installation:

Python version: 3.7, 3.8, 3.9, or 3.10
C++ compiler: 7.2.1 or above
CMake: 3.12 or above
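
The version requirements above can be checked with a few lines of Python before building. This is a sketch only: `parse_version`, `meets_minimum`, and `python_supported` are hypothetical helpers, and the version strings passed in the example are made up rather than read from real tools.

```python
import sys


def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '3.12.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(found: str, minimum: str) -> bool:
    """True when `found` is at least `minimum`, compared numerically."""
    return parse_version(found) >= parse_version(minimum)


def python_supported(version_info=sys.version_info) -> bool:
    """True for the Python releases listed above (3.7 through 3.10)."""
    return (3, 7) <= tuple(version_info[:2]) <= (3, 10)


print("Python supported:", python_supported())
print("CMake 3.16.3 ok:", meets_minimum("3.16.3", "3.12"))
print("Compiler 7.2.0 ok:", meets_minimum("7.2.0", "7.2.1"))
```

Comparing tuples of integers avoids the pitfall of string comparison, where "3.9" would sort after "3.12".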

Common build issues

Issue 1: ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

Solution: Reinstall pycocotools with "pip install pycocotools --no-cache-dir"

Issue 2: ImportError: libGL.so.1: cannot open shared object file: No such file or directory

Solution: Install opencv with apt install or yum install

Option 1 Install from binary

# install stable version from pip
pip install neural-compressor

# install nightly version from pip
pip install -i https://test.pypi.org/simple/ neural-compressor

# install stable version from conda
conda install neural-compressor -c conda-forge -c intel

Option 2 Install from source

git clone https://github.com/intel/neural-compressor.git
cd neural-compressor
git submodule sync
git submodule update --init --recursive
pip install -r requirements.txt
python setup.py install

Option 3 Install from AI Kit

The Intel® Neural Compressor library is released as part of the Intel® oneAPI AI Analytics Toolkit (AI Kit). The AI Kit provides a consolidated package of Intel's latest deep learning and machine learning optimizations in one place for ease of development. Along with Neural Compressor, the AI Kit includes Intel-optimized versions of deep learning frameworks (such as TensorFlow and PyTorch) and high-performing Python libraries to streamline end-to-end data science and AI workflows on Intel architectures.

The AI Kit is distributed through many common channels, including Intel's website, YUM, APT, Anaconda, and more. Select and download the AI Kit distribution package that best suits you and follow the Get Started Guide for post-installation instructions.

Download AI Kit	AI Kit Get Started Guide

Windows Installation

Prerequisites

The following prerequisites and requirements must be satisfied for a successful installation:

Python version: 3.7, 3.8, or 3.9
Download and install Anaconda.

Create a virtual environment named nc in Anaconda:

# Here we install python 3.7 for instance. You can also choose python 3.8 or 3.9.
conda create -n nc python=3.7
conda activate nc

Installation options

Option 1 Install from binary

# install stable version from pip
pip install neural-compressor

# install nightly version from pip
pip install -i https://test.pypi.org/simple/ neural-compressor

# install from conda
conda install pycocotools -c esri   
conda install neural-compressor -c conda-forge -c intel

Option 2 Install from source

git clone https://github.com/intel/neural-compressor.git
cd neural-compressor
git submodule sync
git submodule update --init --recursive
pip install -r requirements.txt
python setup.py install
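
After either installation option, a quick check can confirm that the package is visible to the active Python environment. The sketch below assumes the import name is `neural_compressor`, matching the distribution file names for this release; `is_importable` is an illustrative helper, not part of the library.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True when `module_name` can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


if is_importable("neural_compressor"):
    print("neural_compressor is installed")
else:
    print("neural_compressor not found; check that the right environment is active")
```

The check only looks the module up and does not import it, so it stays fast even when heavy framework dependencies are installed.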

Documentation

Get Started

APIs explains Intel® Neural Compressor's API.
GUI provides a web-based UI service to make quantization easier.
Transform introduces how to utilize Neural Compressor's built-in data processing and how to develop a custom data processing method.
Dataset introduces how to utilize Neural Compressor's built-in dataset and how to develop a custom dataset.
Metric introduces how to utilize Neural Compressor's built-in metrics and how to develop a custom metric.
Objective introduces how to utilize Neural Compressor's built-in objectives and how to develop a custom objective.
Tutorial provides comprehensive instructions on how to utilize Neural Compressor's features with examples.
Examples are provided to demonstrate the usage of Neural Compressor in different frameworks: TensorFlow, PyTorch, MXNet, and ONNX Runtime.
Intel oneAPI AI Analytics Toolkit Get Started Guide explains the AI Kit components, installation and configuration guides, and instructions for building and running sample apps.
AI and Analytics Samples includes code samples for Intel oneAPI libraries.

Deep Dive

Quantization refers to processes that enable inference and training by performing computations in low-precision data types, such as fixed-point integers. Neural Compressor supports Post-Training Quantization (PTQ) with different quantization capabilities and Quantization-Aware Training (QAT). Note that Dynamic Quantization currently has limited support.
Pruning provides a common method for introducing sparsity in weights and activations.
Knowledge Distillation provides a common method for distilling knowledge from a teacher model to a student model.
Distributed Training introduces how to leverage Horovod for multi-node training in Intel® Neural Compressor to reduce training time.
Benchmarking introduces how to utilize the benchmark interface of Neural Compressor.
Mixed Precision introduces how to enable mixed precision, including BF16, INT8, and FP32, on Intel platforms during tuning.
Graph Optimization introduces how to enable graph optimization for FP32 and auto-mixed precision.
Model Conversion introduces how to convert a TensorFlow QAT model to a quantized model running on Intel platforms.
TensorBoard provides tensor histograms and execution graphs for debugging the tuning process.
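
To make the idea of computing in low-precision integers concrete, here is a generic sketch of symmetric per-tensor INT8 quantization in plain Python. It illustrates the general technique only and is not Neural Compressor's implementation; the function names and the sample weights are assumptions made for this example.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to int8 codes.

    Returns (codes, scale) such that each value is approximately code * scale.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


weights = [0.02, -1.27, 0.5, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# Rounding error per element is bounded by half a quantization step.
worst = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, worst)
```

The single scale factor is what makes this "per-tensor"; finer-grained schemes keep one scale per channel at the cost of more bookkeeping.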

Advanced Topics

Execution Engine is a bare-metal solution for domain-specific NLP models, provided as a reference for customers.
Adaptor is the interface between components and the framework. The method for developing an adaptor extension is introduced with ONNX Runtime as an example.
Strategy automatically optimizes low-precision recipes for deep learning models to achieve optimal product objectives, such as inference performance and memory usage, within the expected accuracy criteria. The method for developing a new strategy is introduced.
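
The accuracy-driven idea behind a tuning strategy can be sketched as a simple search loop: try candidate recipes in order and stop at the first one that stays within the accuracy criterion. This is a simplified illustration, not the library's strategy code; `tune`, its parameters, the recipe names, and the accuracy numbers are all hypothetical.

```python
def tune(candidates, evaluate, baseline_accuracy, max_relative_loss=0.01):
    """Return the first candidate whose accuracy stays within the allowed loss.

    `candidates` is an iterable of configurations (any objects) and
    `evaluate` maps a configuration to an accuracy value. Both are
    hypothetical stand-ins for illustration.
    """
    for config in candidates:
        accuracy = evaluate(config)
        relative_loss = (baseline_accuracy - accuracy) / baseline_accuracy
        if relative_loss <= max_relative_loss:
            return config, accuracy
    return None, None


# Toy usage: pretend accuracies for three made-up recipes, fastest first.
fake_results = {"all_int8": 0.70, "int8_keep_last_fp32": 0.755, "fp32": 0.76}
best, acc = tune(fake_results, fake_results.get, baseline_accuracy=0.76)
print(best, acc)
```

Ordering the candidates from most to least aggressive means the first acceptable recipe is also the fastest one that meets the criterion.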

Publications

View the full publications list.

System Requirements

Intel® Neural Compressor supports systems based on Intel 64 architecture or compatible processors, specially optimized for the following CPUs:

Intel Xeon Scalable processor (formerly Skylake, Cascade Lake, Cooper Lake, and Ice Lake)
Future Intel Xeon Scalable processor (code name Sapphire Rapids)

Intel® Neural Compressor requires installing the Intel-optimized framework version for the supported DL framework you use: TensorFlow, PyTorch, MXNet, or ONNX Runtime.

Note: Intel Neural Compressor supports Intel-optimized and official frameworks for some TensorFlow versions. Refer to Supported Frameworks for specifics.

Validated Hardware/Software Environment

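Processor	OS	Python	Framework	Version
Intel Xeon Scalable processor (formerly Skylake, Cascade Lake, Cooper Lake, and Ice Lake)	CentOS 8.3, Ubuntu 18.04	3.7, 3.8, 3.9	TensorFlow	2.8.0, 2.7.0, 2.6.2, 1.15.0 UP3
Intel Xeon Scalable processor (formerly Skylake, Cascade Lake, Cooper Lake, and Ice Lake)	CentOS 8.3, Ubuntu 18.04	3.7, 3.8, 3.9	PyTorch	1.11.0+cpu, 1.10.0+cpu, 1.9.0+cpu, IPEX
Intel Xeon Scalable processor (formerly Skylake, Cascade Lake, Cooper Lake, and Ice Lake)	CentOS 8.3, Ubuntu 18.04	3.7, 3.8, 3.9	MXNet	1.8.0, 1.7.0, 1.6.0
Intel Xeon Scalable processor (formerly Skylake, Cascade Lake, Cooper Lake, and Ice Lake)	CentOS 8.3, Ubuntu 18.04	3.7, 3.8, 3.9	ONNX Runtime	1.10.0, 1.9.0, 1.8.0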

Validated Models

Intel® Neural Compressor provides numerous examples to show the performance gains while minimizing the accuracy loss. A full quantized model list on various frameworks is available in the Model List.

Validated MLPerf Models

Model	Framework	Support	Example
ResNet50 v1.5	TensorFlow	Yes	Link
ResNet50 v1.5	PyTorch	Yes	Link
DLRM	PyTorch	Yes	Link
BERT-large	TensorFlow	Yes	Link
BERT-large	PyTorch	Yes	Link
SSD-ResNet34	TensorFlow	Yes	Link
SSD-ResNet34	PyTorch	Yes	Link
RNN-T	PyTorch	Yes	Link
3D-UNet	TensorFlow	WIP
3D-UNet	PyTorch	Yes	Link

Validated Quantized Models on Intel Xeon Platinum 8380 Scalable processor

Framework	Version	Model	INT8 Accuracy	FP32 Accuracy	Acc Ratio [(INT8-FP32)/FP32]	INT8 Throughput, 1s4c10ins1bs (samples/sec)	FP32 Throughput, 1s4c10ins1bs (samples/sec)	Performance Ratio [INT8/FP32]
intel-tensorflow	2.7.0	resnet50v1.0	74.11%	74.27%	-0.22%	1474.03	486.21	3.03x
intel-tensorflow	2.7.0	resnet50v1.5	76.82%	76.46%	0.47%	1224.52	414.92	2.95x
intel-tensorflow	2.7.0	resnet101	77.50%	76.45%	1.37%	853.41	342.71	2.49x
intel-tensorflow	2.7.0	inception_v1	70.48%	69.74%	1.06%	2192.55	1052.98	2.08x
intel-tensorflow	2.7.0	inception_v2	74.36%	73.97%	0.53%	1799.2	816.74	2.20x
intel-tensorflow	2.7.0	inception_v3	77.28%	76.75%	0.69%	923.76	386.05	2.39x
intel-tensorflow	2.7.0	inception_v4	80.40%	80.27%	0.16%	572.49	190.99	3.00x
intel-tensorflow	2.7.0	inception_resnet_v2	80.44%	80.40%	0.05%	265.7	133.92	1.98x
intel-tensorflow	2.7.0	mobilenetv1	71.79%	70.96%	1.17%	3633.27	1382.64	2.63x
intel-tensorflow	2.7.0	mobilenetv2	71.89%	71.76%	0.18%	2504.63	1418.82	1.77x
intel-tensorflow	2.7.0	ssd_resnet50_v1	37.86%	38.00%	-0.37%	68.03	24.68	2.76x
intel-tensorflow	2.7.0	ssd_mobilenet_v1	22.97%	23.13%	-0.69%	866.75	450.34	1.92x
intel-tensorflow	2.7.0	ssd_resnet34	21.69%	22.09%	-1.81%	41.17	10.76	3.83x
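
The Acc Ratio and Performance Ratio columns follow directly from the formulas in the table header. As a short worked example, the snippet below recomputes both for the resnet50v1.0 row; the function names are illustrative.

```python
def accuracy_ratio(int8_acc: float, fp32_acc: float) -> float:
    """Relative accuracy change, (INT8 - FP32) / FP32, as a percentage."""
    return (int8_acc - fp32_acc) / fp32_acc * 100


def performance_ratio(int8_throughput: float, fp32_throughput: float) -> float:
    """Throughput speedup of INT8 over FP32."""
    return int8_throughput / fp32_throughput


# Values from the resnet50v1.0 row of the table above.
print(f"{accuracy_ratio(74.11, 74.27):.2f}%")        # -0.22%
print(f"{performance_ratio(1474.03, 486.21):.2f}x")  # 3.03x
```

A negative accuracy ratio means the INT8 model is slightly less accurate than FP32, while a positive one means quantization happened to improve the measured accuracy.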

Framework	Version	Model	INT8 Accuracy	FP32 Accuracy	Acc Ratio [(INT8-FP32)/FP32]	INT8 Throughput, 1s4c10ins1bs (samples/sec)	FP32 Throughput, 1s4c10ins1bs (samples/sec)	Performance Ratio [INT8/FP32]
pytorch	1.9.0+cpu	resnet18	69.57%	69.76%	-0.27%	828.529	402.887	2.06x
pytorch	1.9.0+cpu	resnet50	75.98%	76.15%	-0.21%	515.564	194.381	2.65x
pytorch	1.9.0+cpu	resnext101_32x8d	79.15%	79.31%	-0.20%	203.845	70.247	2.90x
pytorch	1.9.0+cpu	inception_v3	69.43%	69.52%	-0.13%	472.927	216.804	2.18x
pytorch	1.9.0+cpu	peleenet	71.66%	72.10%	-0.62%	513.073	388.098	1.32x
pytorch	1.9.0+cpu	yolo_v3	24.46%	24.54%	-0.36%	99.827	37.337	2.67x
pytorch	1.8.0+cpu	bert_base_sts-b	89.07%	89.76%	-0.77%	179.076	103.044	1.74x
pytorch	1.8.0+cpu	bert_base_sst-2	91.35%	91.83%	-0.52%	179.677	101.563	1.77x
pytorch	1.8.0+cpu	bert_base_rte	69.53%	69.14%	0.56%	176.974	101.55	1.74x
pytorch	1.8.0+cpu	bert_large_rte	72.27%	71.88%	0.54%	37.546	33.779	1.11x
pytorch	1.8.0+cpu	bert_large_mrpc	88.97%	89.91%	-1.05%	86.948	33.841	2.57x
pytorch	1.8.0+cpu	bert_large_qnli	91.54%	91.84%	-0.32%	89.916	33.837	2.66x
pytorch	1.8.0+cpu	bert_large_cola	62.07%	62.83%	-1.21%	87.102	33.964	2.56x

Validated Pruning Models

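Tasks	FWK	Model	fp32 baseline	gradient sensitivity with 20% sparsity: accuracy%	gradient sensitivity with 20% sparsity: drop%	gradient sensitivity with 20% sparsity: perf gain (sample/s)	+onnx dynamic quantization on pruned model: accuracy%	+onnx dynamic quantization on pruned model: drop%	+onnx dynamic quantization on pruned model: perf gain (sample/s)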
SST-2	pytorch	bert-base	accuracy = 92.32	accuracy = 91.97	-0.38	1.30x	accuracy = 92.20	-0.13	1.86x
QQP	pytorch	bert-base	[accuracy, f1] = [91.10, 88.05]	[accuracy, f1] = [89.97, 86.54]	[-1.24, -1.71]	1.32x	[accuracy, f1] = [89.75, 86.60]	[-1.48, -1.65]	1.81x

Tasks	FWK	Model	fp32 baseline	Pattern Lock on 70% Unstructured Sparsity: accuracy%	Pattern Lock on 70% Unstructured Sparsity: drop%	Pattern Lock on 50% 1:2 Structured Sparsity: accuracy%	Pattern Lock on 50% 1:2 Structured Sparsity: drop%
MNLI	pytorch	bert-base	[m, mm] = [84.57, 84.79]	[m, mm] = [82.45, 83.27]	[-2.51, -1.80]	[m, mm] = [83.20, 84.11]	[-1.62, -0.80]
SST-2	pytorch	bert-base	accuracy = 92.32	accuracy = 91.51	-0.88	accuracy = 92.20	-0.13
QQP	pytorch	bert-base	[accuracy, f1] = [91.10, 88.05]	[accuracy, f1] = [90.48, 87.06]	[-0.68, -1.12]	[accuracy, f1] = [90.92, 87.78]	[-0.20, -0.31]
QNLI	pytorch	bert-base	accuracy = 91.54	accuracy = 90.39	-1.26	accuracy = 90.87	-0.73
QnA	pytorch	bert-base	[em, f1] = [79.34, 87.10]	[em, f1] = [77.27, 85.75]	[-2.61, -1.54]	[em, f1] = [78.03, 86.50]	[-1.65, -0.69]

Framework	Model	fp32 baseline	Compression	dataset	acc(drop)%
Pytorch	resnet18	69.76	30% sparsity on magnitude	ImageNet	69.47(-0.42)
Pytorch	resnet18	69.76	30% sparsity on gradient sensitivity	ImageNet	68.85(-1.30)
Pytorch	resnet50	76.13	30% sparsity on magnitude	ImageNet	76.11(-0.03)
Pytorch	resnet50	76.13	30% sparsity on magnitude and post training quantization	ImageNet	76.01(-0.16)
Pytorch	resnet50	76.13	30% sparsity on magnitude and quantization aware training	ImageNet	75.90(-0.30)
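
The "sparsity on magnitude" rows above presumably refer to magnitude-based pruning, where the weights with the smallest absolute values are set to zero until the target sparsity is reached. The following is a generic sketch of that technique on a flat list of weights, not Neural Compressor's pruner; the function name and the sample weights are assumptions for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of `weights`."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = round(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


weights = [0.9, -0.05, 0.3, 0.01, -0.7, 0.2, -0.15, 0.6, 0.02, -0.4]
pruned = magnitude_prune(weights, sparsity=0.3)
print(pruned)
```

With ten weights and 30% sparsity, the three entries closest to zero are removed and the rest are left unchanged.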

Validated Knowledge Distillation Examples

Example Name	Dataset	Student (Accuracy)	Teacher (Accuracy)	Student With Distillation (Accuracy Improvement)
ResNet example	ImageNet	ResNet18 (0.6739)	ResNet50 (0.7399)	0.6845 (0.0106)
BlendCnn example	MRPC	BlendCnn (0.7034)	BERT-Base (0.8382)	0.7034 (0)
BiLSTM example	SST-2	BiLSTM (0.7913)	RoBERTa-Base (0.9404)	0.8085 (0.0172)
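
For readers unfamiliar with how a student learns from a teacher, the sketch below shows a commonly used distillation loss: a blend of cross-entropy on the true label and a KL divergence between temperature-softened teacher and student outputs. It is a generic textbook formulation in plain Python, not the loss used in the examples above; the function names, logits, temperature, and alpha are all illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a soft-target KL term."""
    hard = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # The temperature**2 factor keeps the soft term's gradient scale comparable.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


print(distillation_loss([2.0, 0.5, -1.0], [3.0, 1.0, -2.0], label=0))
```

When the student's logits match the teacher's exactly, the KL term vanishes and only the hard-label term remains.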

Validated Engine Examples on Intel Xeon Platinum 8380 Scalable processor

Model	INT8 Accuracy	FP32 Accuracy	Acc Ratio [(INT8-FP32)/FP32]	INT8 Throughput, 1s4c10ins1bs (samples/sec)	FP32 Throughput, 1s4c10ins1bs (samples/sec)	Performance Ratio [INT8/FP32]
bert_large_squad	90.7	90.87	-0.19%	45.32	12.53	3.62x
distilbert_base_uncased_sst2	90.14%	90.25%	-0.12%	999.98	283.96	3.52x
minilm_l6_h384_uncased_sst2	89.33%	90.14%	-0.90%	2690.5	1002.7	2.68x
roberta_base_mrpc	89.71%	88.97%	0.83%	508.18	142.48	3.57x
bert_base_nli_mean_tokens_stsb	89.26%	89.55%	-0.32%	504.15	141.5	3.56x

Additional Content

Hiring

We are hiring. Please send your resume to inc.maintainers@intel.com if you are interested in model compression techniques.

Release history

2.5.1

Apr 3, 2024

2.5

Mar 26, 2024

2.4.1

Dec 29, 2023

2.4

Dec 15, 2023

2.3.2

Nov 23, 2023

2.3.1

Sep 28, 2023

2.3

Sep 15, 2023

2.2.1

Jul 21, 2023

2.2

Jun 21, 2023

2.1.1

May 11, 2023

2.1

Mar 29, 2023

2.0

Dec 30, 2022

1.14.2

Nov 1, 2022

1.14.1

Sep 30, 2022

1.14

Sep 19, 2022

1.13.1

Aug 12, 2022

1.13

Jul 26, 2022

1.12

May 27, 2022

This version

1.11

Apr 15, 2022

1.10.1

Feb 28, 2022

1.9.1

Jan 26, 2022

1.9

Jan 3, 2022

1.8.1

Dec 3, 2021

1.8

Nov 21, 2021

1.7.3

Nov 11, 2021

1.7.2

Oct 27, 2021

1.7.1

Oct 22, 2021

1.7

Oct 1, 2021

0.1

Sep 26, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

neural_compressor-1.11.tar.gz (1.9 MB)

Uploaded Apr 15, 2022 Source

Built Distributions

neural_compressor-1.11-cp310-cp310-win_amd64.whl (2.4 MB)

Uploaded Apr 15, 2022 CPython 3.10 Windows x86-64

neural_compressor-1.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (33.4 MB)

Uploaded Apr 15, 2022 CPython 3.10 manylinux: glibc 2.17+ x86-64

neural_compressor-1.11-cp39-cp39-win_amd64.whl (2.4 MB)

Uploaded Apr 15, 2022 CPython 3.9 Windows x86-64

neural_compressor-1.11-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (33.4 MB)

Uploaded Apr 15, 2022 CPython 3.9 manylinux: glibc 2.17+ x86-64

neural_compressor-1.11-cp38-cp38-win_amd64.whl (2.4 MB)

Uploaded Apr 15, 2022 CPython 3.8 Windows x86-64

neural_compressor-1.11-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (33.4 MB)

Uploaded Apr 15, 2022 CPython 3.8 manylinux: glibc 2.17+ x86-64

neural_compressor-1.11-cp37-cp37m-win_amd64.whl (2.4 MB)

Uploaded Apr 15, 2022 CPython 3.7m Windows x86-64

neural_compressor-1.11-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (33.4 MB)

Uploaded Apr 15, 2022 CPython 3.7m manylinux: glibc 2.17+ x86-64

Hashes for neural_compressor-1.11.tar.gz
Algorithm	Hash digest
SHA256	`ebc1d83457bd43a32788626c3e988d3b474f142ab09dad01f7bc6114568ac5ab`
MD5	`9fe7f0a73f0a43d67f345556b3a5c372`
BLAKE2b-256	`82c24bad3cfe94f33cd19c84da679401c46728e60d9e37941d5cb0419d2b2f5f`

Hashes for neural_compressor-1.11-cp310-cp310-win_amd64.whl
Algorithm	Hash digest
SHA256	`b06fdaec35d115bb391379811a6bc576ddea3981ad3c802f6d225f7f56d5e4b8`
MD5	`cae380c91f0b389690e46db9492cc175`
BLAKE2b-256	`a322a2079be665066199b87cf9761e1439a14be792a947d212d71546e747ed25`

Hashes for neural_compressor-1.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`22ecffe04576ca7e3a730bccb4cdab521b3ea75b5924f903ab2581f6e8f9b869`
MD5	`d8c29c0ac2f55be218682d00b07def79`
BLAKE2b-256	`20dd22dff7b0d97d4944cd110d2ef71e6a0a496c818c7d99d00fe604630307e0`

Hashes for neural_compressor-1.11-cp39-cp39-win_amd64.whl
Algorithm	Hash digest
SHA256	`f6613f97bbb347b21185f4cff2ee3b1ab71c8f4f09d2a8fcd3fe15adcd3676a0`
MD5	`4a42e6750d19620fc3c1a3e66f30141c`
BLAKE2b-256	`5ba00edc6058a76db6a1d84cd57d983f51ccc0d53623429c1356921955955f37`

Hashes for neural_compressor-1.11-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`fac6f4feb690c9e29ae0f77c85d7656793173897a323d263f1b04df23d8f5c71`
MD5	`5f2b211cdfa7f250023147caec143bda`
BLAKE2b-256	`5484aaaa34d457373a14ce3578d015fdf00ae5aaed4f38572fbd0808a86c06ec`

Hashes for neural_compressor-1.11-cp38-cp38-win_amd64.whl
Algorithm	Hash digest
SHA256	`3bc16faae91ff1bea33801360f7f2cbfd05177941270b05561cb273a868a6aff`
MD5	`c71533dcb25df9eee6c267664c141087`
BLAKE2b-256	`5c0c2779439a0898beb571c4f1c7297bdd6118743f5644da8ac084db9038a41a`

Hashes for neural_compressor-1.11-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`6a68769be8126d980d1308d213d6a0a73368b568eb8645d37cedeb342547cc8f`
MD5	`1c63b07f74e0f818fef915cbdd861dac`
BLAKE2b-256	`296b16f1ea9ab48fe98663bcd3ef16ae48277d0008b4ee7491f6bce8db049bcb`

Hashes for neural_compressor-1.11-cp37-cp37m-win_amd64.whl
Algorithm	Hash digest
SHA256	`217864bde4c3d317b14eb88ebe2f5a4adb083723a479e83028ae2fc516a7ec25`
MD5	`971de49b50de712260f4f8291944a4c4`
BLAKE2b-256	`fed5c7f54471f903cdb8987d3fbe1e4db6c1db7cbcee9fe9b8d8df6d215a04b0`

Hashes for neural_compressor-1.11-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`7e92777719d046e9c295416f912060722dda42f35c9c4ff94b8e2ce16841b27c`
MD5	`7a8e611fdb5484c1fed9516c67213121`
BLAKE2b-256	`76095888c0453214c221bc2fd9d56d8fab18b17fddaa4e6c4fa8fe3871298a33`
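
Any of the SHA256 digests listed above can be compared against a downloaded file using Python's standard hashlib module. The sketch below is one way to do it; the helper names are illustrative and the file path is whatever location the download was saved to.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published value, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, calling `matches_published_digest` with the path to neural_compressor-1.11.tar.gz and the value ebc1d83457bd43a32788626c3e988d3b474f142ab09dad01f7bc6114568ac5ab should return True for an intact download. Reading in chunks keeps memory use flat even for the larger wheel files.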
