
fastertransformer: FasterTransformer TensorFlow op

Project description

https://github.com/NVIDIA/FasterTransformer

This package provides a prebuilt libtf_bert.so for Linux.

In NLP, the encoder and the decoder are two key components, and the transformer layer has become a popular architecture for both. FasterTransformer implements a highly optimized transformer layer for both the encoder and the decoder for inference. On Volta, Turing and Ampere GPUs, the computing power of Tensor Cores is used automatically when the data and weights are in FP16.
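
As a hedged illustration of that FP16 condition (plain TensorFlow, not this package's op; the shapes are invented for the example):

```python
import tensorflow as tf

x = tf.random.normal([128, 768])   # activations, FP32 by default
w = tf.random.normal([768, 768])   # weights, FP32 by default

# Once both operands of the matmul are FP16, Volta/Turing/Ampere GPUs
# route the computation through Tensor Cores automatically.
y = tf.matmul(tf.cast(x, tf.float16), tf.cast(w, tf.float16))
print(y.dtype)  # float16
```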

FasterTransformer is built on top of CUDA, cuBLAS, cuBLASLt and C++. We provide at least one API for each of the following frameworks: TensorFlow, PyTorch and the Triton backend, so users can integrate FasterTransformer into these frameworks directly.
For the supported frameworks, we also provide example code that demonstrates usage and shows the performance in each framework.
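
For TensorFlow, integration means loading the shipped shared library as a custom op. A minimal sketch, assuming the wheel exposes an importable fastertransformer package with libtf_bert.so inside it (both are assumptions, and the op names exported by the library are not documented here, so the sketch only lists them):

```python
import os
import tensorflow as tf

import fastertransformer  # assumption: the wheel installs a package of this name

# Assumption: libtf_bert.so sits inside the installed package directory.
lib_path = os.path.join(os.path.dirname(fastertransformer.__file__), "libtf_bert.so")

# tf.load_op_library returns a Python module wrapping the ops compiled
# into the shared library.
ft_ops = tf.load_op_library(lib_path)

# Inspect the exposed op wrappers before calling any of them.
print([name for name in dir(ft_ops) if not name.startswith("_")])
```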

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
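
Since only a py3 wheel is published (see below), pip should resolve it directly on Linux; the package name here is taken from the wheel filename:

```
pip install fastertransformer
```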

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distribution

fastertransformer-5.0.0.116-py3-none-any.whl (16.9 MB)

Uploaded: Python 3
