A memory allocator for PyTorch that lets it use more memory than the amount reserved for the iGPU

These details have not been verified by PyPI

Project links

repository

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

pytorch_rocm_gtt

Python package to allow PyTorch ROCm to overcome the reserved iGPU memory limits.

Based on https://github.com/pomoke/torch-apu-helper/tree/main, after discussion here: https://github.com/ROCm/ROCm/issues/2014

About

If you need to run machine learning workloads, you usually need a lot of GPU memory to hold your PyTorch tensors and your model. Ryzen APUs (with integrated GPUs) are usually quite capable, and more recent versions of ROCm added support for some of those Radeon integrated graphics, but a major limitation is the small amount of VRAM usually reserved for those GPUs.

However, the VRAM used by those GPUs is actually shared with system memory, so there is no real reason for PyTorch to be limited to the reserved portion: it could potentially use the whole system RAM.
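On Linux with the amdgpu driver, the reserved VRAM and the pool of system memory the GPU can reach (GTT) can be compared through sysfs. The following is a minimal sketch: the card0 path is an assumption (the iGPU may be card1 or another index on your machine), and the helper name read_amdgpu_memory is hypothetical.

```python
from pathlib import Path


def read_amdgpu_memory(device_dir="/sys/class/drm/card0/device"):
    """Return reserved VRAM and GTT sizes in bytes, or None where unavailable.

    The default path assumes the iGPU is card0; adjust it for your system.
    """
    sizes = {}
    for key, filename in (("vram", "mem_info_vram_total"),
                          ("gtt", "mem_info_gtt_total")):
        try:
            sizes[key] = int((Path(device_dir) / filename).read_text().strip())
        except (OSError, ValueError):
            sizes[key] = None  # file missing, unreadable, or not a number
    return sizes


if __name__ == "__main__":
    for name, size in read_amdgpu_memory().items():
        label = "unavailable" if size is None else f"{size / 2**30:.2f} GiB"
        print(f"{name}: {label}")
```

On a typical APU the vram value reflects the BIOS reservation, while the gtt value is usually much larger because it is backed by ordinary system RAM.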

This package patches PyTorch at runtime, allowing it to allocate more memory than what is currently reserved in the system BIOS for the integrated GPU.
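PyTorch itself offers a way to swap its GPU allocator at runtime through torch.cuda.memory.CUDAPluggableAllocator. The sketch below shows that general mechanism only; it is an assumption that this package works along these lines, and the library path and the function names gtt_alloc and gtt_free are hypothetical placeholders for symbols exported by a compiled shared library.

```python
import os


def use_custom_allocator(lib_path, alloc_fn="gtt_alloc", free_fn="gtt_free"):
    """Replace PyTorch's GPU allocator with one loaded from a shared library.

    lib_path, alloc_fn and free_fn are placeholders: the library must export
    C functions with the signatures PyTorch expects for pluggable allocators.
    """
    if not os.path.isfile(lib_path):
        raise FileNotFoundError(f"allocator library not found: {lib_path}")

    # Imported here so the check above works even without PyTorch installed.
    import torch

    allocator = torch.cuda.memory.CUDAPluggableAllocator(lib_path, alloc_fn, free_fn)
    # Must run before the first GPU allocation; PyTorch raises an error if the
    # current allocator has already been used.
    torch.cuda.memory.change_current_allocator(allocator)
```

With this mechanism, the allocator can only be changed before PyTorch touches GPU memory, which is consistent with applying the patch at the very start of a script.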

All you need is ROCm and the drivers properly installed (check the AMD documentation), a pip install pytorch_rocm_gtt, and a pytorch_rocm_gtt.patch() call at the beginning of your script (thanks, @segurac!).

Install it from PyPI

pip install pytorch_rocm_gtt

Usage

Call this before PyTorch makes any GPU allocation (models or tensors):

import pytorch_rocm_gtt

pytorch_rocm_gtt.patch()

The hipcc command must be available in your $PATH.
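Since the patch depends on hipcc being reachable, a script can check for it before calling patch(). This is a minimal sketch; the wrapper name patch_if_possible is hypothetical and not part of the package, and only pytorch_rocm_gtt.patch() comes from the usage shown above.

```python
import shutil


def patch_if_possible():
    """Apply the allocator patch only when hipcc can be found in $PATH.

    Returns True if pytorch_rocm_gtt.patch() was called, False otherwise.
    """
    if shutil.which("hipcc") is None:
        print("hipcc not found in $PATH; skipping pytorch_rocm_gtt.patch()")
        return False
    try:
        import pytorch_rocm_gtt
    except ImportError:
        print("pytorch_rocm_gtt is not installed")
        return False
    pytorch_rocm_gtt.patch()
    return True
```

Call patch_if_possible() before importing or using torch for any GPU work, and treat a False result as a sign that PyTorch is still limited to the reserved memory.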

After that, allocate GPU memory as you would with CUDA:

import torch

torch.rand(1000).to("cuda")
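The tensors used in these examples are tiny, so they would fit even without the patch. To exercise the allocator you need a tensor larger than the reserved VRAM, and a quick size estimate helps pick one. The sketch below assumes float32 data (4 bytes per element, the default dtype of torch.rand) and an illustrative 512 MiB reservation; both helper names are hypothetical.

```python
import math


def tensor_nbytes(shape, itemsize=4):
    """Bytes needed for a dense tensor; itemsize=4 matches torch.float32."""
    return math.prod(shape) * itemsize


def fits_in(shape, limit_bytes, itemsize=4):
    """True when a tensor of this shape fits within limit_bytes."""
    return tensor_nbytes(shape, itemsize) <= limit_bytes


reserved = 512 * 2**20  # assumed 512 MiB of reserved VRAM, for illustration
print(tensor_nbytes((1000,)))              # 4000
print(tensor_nbytes((1000, 1000)))         # 4000000
print(fits_in((1000, 1000), reserved))     # True
print(fits_in((20000, 20000), reserved))   # False
```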

Compatibility

In order to use this package, your APU must be compatible with ROCm in the first place.

Check the AMD documentation on how to install ROCm for your distribution.

Docker images

We provide pre-built images based on the ROCm images that also include the new memory allocator.

You can check the list of available images on Docker Hub.

For example, to run a Python shell with ROCm 6.0.2, PyTorch 2.1.2 and the unbounded memory allocator, run this shell command:

$ docker run --rm -it \
  --cap-add=SYS_PTRACE \
  --security-opt seccomp=unconfined \
  --device=/dev/kfd \
  --device=/dev/dri \
  --group-add video \
  --ipc=host \
  --shm-size 8G \
  -e HSA_OVERRIDE_GFX_VERSION=11.0.1 \
  pappacena/rocm-pytorch:rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2 \
  python

Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pytorch_rocm_gtt
>>> pytorch_rocm_gtt.patch()
>>> import torch
>>> torch.rand(1000, 1000).to("cuda")
tensor([[0.3428, 0.3032, 0.7657,  ..., 0.1255, 0.3866, 0.3153],
        [0.9015, 0.3409, 0.8885,  ..., 0.4413, 0.4961, 0.9245],
        [0.3883, 0.2388, 0.7439,  ..., 0.0647, 0.6922, 0.9496],
        ...,
        [0.4221, 0.7197, 0.5481,  ..., 0.5292, 0.7475, 0.3166],
        [0.1787, 0.9987, 0.7080,  ..., 0.8570, 0.3217, 0.1324],
        [0.6306, 0.0611, 0.1979,  ..., 0.1404, 0.4922, 0.2805]],
       device='cuda:0')
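If you launch the container from a script, the same invocation can be assembled as an argument list. This is a sketch under stated assumptions: the function name docker_run_command is hypothetical, and the 11.0.1 value for HSA_OVERRIDE_GFX_VERSION is simply the one from the command above; the right value depends on your APU.

```python
import shlex


def docker_run_command(image, gfx_version, shm_size="8G"):
    """Build the argument list for the docker run invocation shown above."""
    return [
        "docker", "run", "--rm", "-it",
        "--cap-add=SYS_PTRACE",
        "--security-opt", "seccomp=unconfined",
        "--device=/dev/kfd",    # ROCm compute interface
        "--device=/dev/dri",    # GPU render nodes
        "--group-add", "video",
        "--ipc=host",
        "--shm-size", shm_size,
        "-e", f"HSA_OVERRIDE_GFX_VERSION={gfx_version}",
        image,
        "python",
    ]


cmd = docker_run_command(
    "pappacena/rocm-pytorch:rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2",
    gfx_version="11.0.1",
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) as is, which avoids shell quoting issues.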

Development

Read the CONTRIBUTING.md file.

How to release

Update the pyproject.toml file with the desired version, then run make release to create the new tag.

After that, the GitHub Action will publish the package to PyPI.

Once it is published, run the docker_build_and_publish.sh <version-number> script to update the Docker images.

Release history

0.1.4 (this version), Apr 2, 2024

0.1.2, Apr 2, 2024

0.1.1, Apr 2, 2024

Download files

Source Distribution

pytorch_rocm_gtt-0.1.4.tar.gz (4.0 kB)

Uploaded Apr 2, 2024 Source

Built Distribution

pytorch_rocm_gtt-0.1.4-py3-none-any.whl (5.1 kB)

Uploaded Apr 2, 2024 Python 3

Hashes for pytorch_rocm_gtt-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`200ee6c4b83e8729a4bb21c968c8aed5e8641ff70e8ce6d97691f71db0597480`
MD5	`619bc85466610ce44b66a2c5cca690c8`
BLAKE2b-256	`df736cf0cb6f36bfb1851d257d381f44a3425914cc652e2855875e462e464ae3`

Hashes for pytorch_rocm_gtt-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d2740b52eac57b15b76a89136294d1c7842c12b4756124c69244f81a65719f41`
MD5	`88a56bfe82d27e567459d7dc09461109`
BLAKE2b-256	`8c48029f6db6c202d42e130d85e5ccc7d6be77cc413b39af589c473f82dc1457`