A modular PyTorch library for vision transformer models

VFormer

A modular PyTorch library for vision transformer models

Library Features

  • Contains implementations of prominent ViT architectures, broken down into modular components such as the encoder, attention mechanism, and decoder
  • Makes it easy to develop custom models by composing components from different architectures
  • Provides utilities for visualizing attention using techniques such as gradient rollout
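
To give a feel for what rollout-style visualization computes, here is a minimal sketch of plain attention rollout. It is not VFormer's own utility, and it omits the gradient weighting that gradient-based variants typically add. The function name `attention_rollout` and the random attention maps are assumptions made purely for illustration.

```python
import torch


def attention_rollout(attentions):
    """Combine per-layer attention maps into one token-to-token relevance map.

    `attentions` is a list with one tensor per layer, each of shape
    (num_heads, N, N), where every row is a softmax distribution.
    (Hypothetical helper for illustration; not part of VFormer.)
    """
    first = attentions[0]
    n = first.shape[-1]
    eye = torch.eye(n, dtype=first.dtype, device=first.device)
    rollout = eye.clone()
    for attn in attentions:
        a = attn.mean(dim=0)                 # average over heads
        a = a + eye                          # account for the residual connection
        a = a / a.sum(dim=-1, keepdim=True)  # re-normalize each row to sum to 1
        rollout = a @ rollout                # accumulate from the first layer upward
    return rollout


# Toy input: 4 layers, 3 heads, 5 tokens (token 0 plays the role of a class token).
torch.manual_seed(0)
attentions = [torch.softmax(torch.randn(3, 5, 5), dim=-1) for _ in range(4)]
rollout = attention_rollout(attentions)
cls_to_patches = rollout[0, 1:]  # relevance of each patch token to the class token
```

In practice the attention maps would come from a trained model rather than random tensors, and `cls_to_patches` would be reshaped to the patch grid and overlaid on the image.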

Installation

From source (recommended)

git clone https://github.com/SforAiDl/vformer.git
cd vformer/
python setup.py install

From PyPI

pip install vformer

Models supported

Example usage

To instantiate and use a Swin Transformer model:

import torch
from vformer.models.classification import SwinTransformer

image = torch.randn(1, 3, 224, 224)  # example input: one 224x224 RGB image
model = SwinTransformer(
    img_size=224,
    patch_size=4,
    in_channels=3,
    n_classes=10,
    embed_dim=96,
    depths=[2, 2, 6, 2],
    num_heads=[3, 6, 12, 24],
    window_size=7,
    drop_rate=0.2,
)
logits = model(image)  # class logits for the input batch
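
To see how these arguments relate to each other, the sketch below works through the per-stage arithmetic for the configuration above. It assumes the standard Swin Transformer layout, in which each stage after the first halves the token grid and doubles the channel width; VFormer's implementation may differ in its details. The helper `swin_stage_shapes` is hypothetical and uses plain Python only.

```python
def swin_stage_shapes(img_size, patch_size, embed_dim, depths, num_heads, window_size):
    """Return the token grid, width, and window count of each Swin stage.

    Hypothetical helper for illustration; assumes the standard Swin layout
    (resolution halves and channels double between consecutive stages).
    """
    stages = []
    resolution = img_size // patch_size  # tokens per side after patch embedding
    dim = embed_dim
    for depth, heads in zip(depths, num_heads):
        stages.append({
            "resolution": resolution,
            "dim": dim,
            "depth": depth,
            "head_dim": dim // heads,
            "windows": (resolution // window_size) ** 2,
        })
        resolution //= 2  # patch merging halves the grid
        dim *= 2          # and doubles the channel width
    return stages


stages = swin_stage_shapes(
    img_size=224, patch_size=4, embed_dim=96,
    depths=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], window_size=7,
)
for i, s in enumerate(stages, start=1):
    print(i, s)
# 1 {'resolution': 56, 'dim': 96, 'depth': 2, 'head_dim': 32, 'windows': 64}
# 2 {'resolution': 28, 'dim': 192, 'depth': 2, 'head_dim': 32, 'windows': 16}
# 3 {'resolution': 14, 'dim': 384, 'depth': 6, 'head_dim': 32, 'windows': 4}
# 4 {'resolution': 7, 'dim': 768, 'depth': 2, 'head_dim': 32, 'windows': 1}
```

Under these assumptions every stage's grid is divisible by window_size=7, and the head dimension stays at 32 because num_heads doubles together with the channel width.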

VFormer has a modular design and allows for easy experimentation using blocks/modules of different architectures. For example, if desired, you can use just the encoder or the windowed attention layer of the Swin Transformer model.

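from vformer.attention import WindowAttention

# Windowed attention layer on its own
window_attn = WindowAttention(
    dim=128,
    window_size=7,
    num_heads=2,
    **kwargs,  # placeholder for any additional keyword arguments
)

from vformer.encoder import SwinEncoder

# Swin Transformer encoder on its own
swin_encoder = SwinEncoder(
    dim=128,
    input_resolution=(224, 224),
    depth=2,
    num_heads=2,
    window_size=7,
    **kwargs,  # placeholder for any additional keyword arguments
)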
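
For context on what a windowed attention layer operates on, the following sketch splits a feature map into non-overlapping windows, following the general Swin Transformer idea. It illustrates the technique only and is not VFormer's internal code; the function name `window_partition` and the (B, H, W, C) layout are assumptions.

```python
import torch


def window_partition(x: torch.Tensor, window_size: int) -> torch.Tensor:
    """Split a (B, H, W, C) feature map into non-overlapping square windows.

    Returns a tensor of shape (B * num_windows, window_size * window_size, C),
    so that self-attention can be applied independently inside each window.
    (Hypothetical helper for illustration; not part of VFormer's API.)
    """
    B, H, W, C = x.shape
    if H % window_size or W % window_size:
        raise ValueError("H and W must be divisible by window_size")
    x = x.reshape(B, H // window_size, window_size, W // window_size, window_size, C)
    # Bring the two window-grid axes together, then the two within-window axes.
    windows = x.permute(0, 1, 3, 2, 4, 5).contiguous()
    return windows.reshape(-1, window_size * window_size, C)


features = torch.randn(1, 56, 56, 128)  # toy feature map with dim=128
windows = window_partition(features, window_size=7)
print(windows.shape)  # torch.Size([64, 49, 128])
```

Each of the 64 windows holds 49 tokens, so attention within a window costs far less than attention over all 3136 tokens at once.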

Please refer to our documentation to learn more.

