Skip to main content

An easy-to-use library for GLU (Gated Linear Units) and GLU variants in TensorFlow.

Project description

GLU

PyPI Lint Code Base Upload Python Package Code style: black

GitHub stars GitHub followers Twitter Follow

An easy-to-use library for GLU (Gated Linear Units) and GLU variants in TensorFlow. This repository allows you to easily make use of the following activation functions:

  • GLU introduced in the paper Language Modeling with Gated Convolutional Networks [1]
  • Bilinear introduced in the paper Language Modeling with Gated Convolutional Networks [1] atrributed to Mnih et al. [2]
  • ReGLU introduced in the paper GLU Variants Improve Transformer [3]
  • GEGLU introduced in the paper GLU Variants Improve Transformer [3]
  • SwiGLU introduced in the paper GLU Variants Improve Transformer [3]
  • SeGLU

Gated Linear Units consist of the component-wise product of two linear projections, one of which is first passed through a sigmoid function. Variations on GLU are possible, using different nonlinear (or even linear) functions in place of sigmoid. In the GLU Variants Improve Transformer [3] paper, in a fine-tuning scenario the new variants seem to produce better perplexities for the de-noising objective used in pre-training, as well as better results on many downstream language-understanding tasks. Furthermore these do not have any apparent computational drawbacks.

Installation

Run the following to install:

pip install glu-tf

Developing glu-tf

To install glu-tf, along with tools you need to develop and test, run the following in your virtualenv:

git clone https://github.com/Rishit-dagli/GLU.git
# or clone your own fork

cd GLU
pip install -e .[dev]

Usage

In this section, I show a minimal example of using the SwiGLU activation function but you can use the other activations in similar manner:

import tensorflow as tf
from glu_tf import SwiGLU

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(units=10)
model.add(SwiGLU(bias = False, dim=-1, name='swiglu'))

Want to Contribute 🙋‍♂️?

Awesome! If you want to contribute to this project, you're always welcome! See Contributing Guidelines. You can also take a look at open issues for getting more information about current or upcoming tasks.

Want to discuss? 💬

Have any questions, doubts or want to present your opinions, views? You're always welcome. You can start discussions.

References

[1] Dauphin, Yann N., et al. ‘Language Modeling with Gated Convolutional Networks’. ArXiv:1612.08083 [Cs], Sept. 2017. arXiv.org, http://arxiv.org/abs/1612.08083.

[2] Mnih, A., and Hinton, G. 2007. Three new graphical models for statistical language modelling. In Proceedings of the 24th international conference on Machine learning (pp. 641–648).

[3] Shazeer, Noam. ‘GLU Variants Improve Transformer’. ArXiv:2002.05202 [Cs, Stat], Feb. 2020. arXiv.org, http://arxiv.org/abs/2002.05202.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

GLU-tf-0.1.0.tar.gz (8.0 kB view hashes)

Uploaded Source

Built Distribution

GLU_tf-0.1.0-py3-none-any.whl (8.0 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page