Manage calls to calloc/free through Cython

cymem: A Cython Memory Helper
*****************************

cymem provides two small memory-management helpers for Cython. They make it
easy to tie memory to a Python object's life-cycle, so that the memory is freed
when the object is garbage collected.

.. image:: https://img.shields.io/travis/explosion/cymem/master.svg?style=flat-square&logo=travis
    :target: https://travis-ci.org/explosion/cymem

.. image:: https://img.shields.io/appveyor/ci/explosion/cymem/master.svg?style=flat-square&logo=appveyor
    :target: https://ci.appveyor.com/project/explosion/cymem
    :alt: Appveyor Build Status

.. image:: https://img.shields.io/pypi/v/cymem.svg?style=flat-square
    :target: https://pypi.python.org/pypi/cymem
    :alt: pypi Version

.. image:: https://img.shields.io/conda/vn/conda-forge/cymem.svg?style=flat-square
    :target: https://anaconda.org/conda-forge/cymem
    :alt: conda Version

.. image:: https://img.shields.io/badge/wheels-%E2%9C%93-4c1.svg?longCache=true&style=flat-square&logo=python&logoColor=white
    :target: https://github.com/explosion/wheelwright/releases
    :alt: Python wheels

Overview
========

The more useful of the two is ``cymem.Pool``, which acts as a thin wrapper
around the ``calloc`` function:

.. code:: cython

    from cymem.cymem cimport Pool
    cdef Pool mem = Pool()
    data1 = <int*>mem.alloc(10, sizeof(int))
    data2 = <float*>mem.alloc(12, sizeof(float))

The ``Pool`` object saves the memory addresses internally, and frees them when the
object is garbage collected. Typically you'll attach the ``Pool`` to some cdef'd
class. This is particularly handy for deeply nested structs, which have
complicated initialization functions. Just pass the ``Pool`` object into the
initializer, and you don't have to worry about freeing your struct at all:
all of the memory allocated through ``Pool.alloc`` is freed automatically when
the ``Pool`` expires.

Installation
============

Installation is via `pip <https://pypi.python.org/pypi/pip>`_, and requires `Cython <http://cython.org/>`_.

.. code:: bash

    pip install cymem

Example Use Case: An array of structs
=====================================

Let's say we want a sequence of sparse matrices. We need fast access, and
a Python list isn't performing well enough. So, we want a C-array or C++
vector, which means we need the sparse matrix to be a C-level struct — it
can't be a Python class. We can write this easily enough in Cython:

.. code:: cython

    """Example without Cymem

    To use an array of structs, we must carefully walk the data structure when
    we deallocate it.
    """

    from libc.stdlib cimport calloc, free

    cdef struct SparseRow:
        size_t length
        size_t* indices
        double* values

    cdef struct SparseMatrix:
        size_t length
        SparseRow* rows

    cdef class MatrixArray:
        cdef size_t length
        cdef SparseMatrix** matrices

        def __cinit__(self, list py_matrices):
            self.length = 0
            self.matrices = NULL

        def __init__(self, list py_matrices):
            self.length = len(py_matrices)
            self.matrices = <SparseMatrix**>calloc(len(py_matrices), sizeof(SparseMatrix*))

            for i, py_matrix in enumerate(py_matrices):
                self.matrices[i] = sparse_matrix_init(py_matrix)

        def __dealloc__(self):
            for i in range(self.length):
                sparse_matrix_free(self.matrices[i])
            free(self.matrices)


    cdef SparseMatrix* sparse_matrix_init(list py_matrix) except NULL:
        sm = <SparseMatrix*>calloc(1, sizeof(SparseMatrix))
        sm.length = len(py_matrix)
        sm.rows = <SparseRow*>calloc(sm.length, sizeof(SparseRow))
        cdef size_t i, j
        cdef dict py_row
        cdef size_t idx
        cdef double value
        for i, py_row in enumerate(py_matrix):
            sm.rows[i].length = len(py_row)
            sm.rows[i].indices = <size_t*>calloc(sm.rows[i].length, sizeof(size_t))
            sm.rows[i].values = <double*>calloc(sm.rows[i].length, sizeof(double))
            for j, (idx, value) in enumerate(py_row.items()):
                sm.rows[i].indices[j] = idx
                sm.rows[i].values[j] = value
        return sm


    # Returns nothing, so the return type is void rather than void*.
    cdef void sparse_matrix_free(SparseMatrix* sm) except *:
        cdef size_t i
        # Free the innermost arrays first, then the row array, then the struct.
        for i in range(sm.length):
            free(sm.rows[i].indices)
            free(sm.rows[i].values)
        free(sm.rows)
        free(sm)


We wrap the data structure in a Python ref-counted class at as low a level as
we can, given our performance constraints. This allows us to allocate the
memory in ``__init__`` and free it in the ``__dealloc__`` Cython special method.

However, it's very easy to make mistakes when writing the ``__dealloc__`` and
``sparse_matrix_free`` functions, leading to memory leaks. cymem saves you from
writing these deallocators at all. Instead, you write as follows:

.. code:: cython

    """Example with Cymem.

    Memory allocation is hidden behind the Pool class, which remembers the
    addresses it gives out. When the Pool object is garbage collected, all of
    its addresses are freed.

    We don't need to write MatrixArray.__dealloc__ or sparse_matrix_free,
    eliminating a common class of bugs.
    """
    from cymem.cymem cimport Pool

    cdef struct SparseRow:
        size_t length
        size_t* indices
        double* values

    cdef struct SparseMatrix:
        size_t length
        SparseRow* rows


    cdef class MatrixArray:
        cdef size_t length
        cdef SparseMatrix** matrices
        cdef Pool mem

        def __cinit__(self, list py_matrices):
            self.mem = None
            self.length = 0
            self.matrices = NULL

        def __init__(self, list py_matrices):
            self.mem = Pool()
            self.length = len(py_matrices)
            self.matrices = <SparseMatrix**>self.mem.alloc(self.length, sizeof(SparseMatrix*))
            for i, py_matrix in enumerate(py_matrices):
                # Call the initializer under the name it is defined with below.
                self.matrices[i] = sparse_matrix_init_cymem(self.mem, py_matrix)

    cdef SparseMatrix* sparse_matrix_init_cymem(Pool mem, list py_matrix) except NULL:
        sm = <SparseMatrix*>mem.alloc(1, sizeof(SparseMatrix))
        sm.length = len(py_matrix)
        sm.rows = <SparseRow*>mem.alloc(sm.length, sizeof(SparseRow))
        cdef size_t i, j
        cdef dict py_row
        cdef size_t idx
        cdef double value
        for i, py_row in enumerate(py_matrix):
            sm.rows[i].length = len(py_row)
            sm.rows[i].indices = <size_t*>mem.alloc(sm.rows[i].length, sizeof(size_t))
            sm.rows[i].values = <double*>mem.alloc(sm.rows[i].length, sizeof(double))
            for j, (idx, value) in enumerate(py_row.items()):
                sm.rows[i].indices[j] = idx
                sm.rows[i].values[j] = value
        return sm



All that the ``Pool`` class does is remember the addresses it gives out. When the
``MatrixArray`` object is garbage-collected, the ``Pool`` object will also be garbage
collected, which triggers a call to ``Pool.__dealloc__``. The ``Pool`` then frees all of
its addresses. This saves you from walking back over your nested data structures
to free them, eliminating a common class of errors.
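The bookkeeping described above can be illustrated with a small pure-Python model. This is only a sketch and not cymem's implementation: ``ToyHeap``, ``ToyPool`` and ``ToyMatrixArray`` are hypothetical names, the "addresses" are plain integers rather than real memory, and the struct sizes (16 and 24 bytes) are illustrative values for a typical 64-bit platform.

```python
import gc


class ToyHeap:
    """Stand-in for calloc/free: hands out fake integer addresses."""

    def __init__(self):
        self.live = set()     # addresses that are currently allocated
        self._next = 0x1000   # arbitrary starting address

    def calloc(self, number, elem_size):
        addr = self._next
        self._next += max(1, number * elem_size)
        self.live.add(addr)
        return addr

    def free(self, addr):
        self.live.remove(addr)  # raises KeyError on a double free


class ToyPool:
    """Remembers every address it hands out and frees them all when it dies."""

    def __init__(self, heap):
        self.heap = heap
        self.addresses = {}   # address -> size in bytes

    def alloc(self, number, elem_size):
        addr = self.heap.calloc(number, elem_size)
        self.addresses[addr] = number * elem_size
        return addr

    def __del__(self):
        # No walking of nested structures: just free every recorded address.
        for addr in self.addresses:
            self.heap.free(addr)


class ToyMatrixArray:
    """Owns a ToyPool, in the same way MatrixArray owns its Pool."""

    def __init__(self, heap, py_matrices):
        self.mem = ToyPool(heap)
        self.matrices = [self._init_matrix(m) for m in py_matrices]

    def _init_matrix(self, py_matrix):
        sm = self.mem.alloc(1, 16)            # the matrix struct (illustrative size)
        self.mem.alloc(len(py_matrix), 24)    # its array of rows (illustrative size)
        for py_row in py_matrix:
            self.mem.alloc(len(py_row), 8)    # indices of one row
            self.mem.alloc(len(py_row), 8)    # values of one row
        return sm


heap = ToyHeap()
array = ToyMatrixArray(heap, [[{0: 1.0, 3: 2.5}, {1: 4.0}], [{2: 0.5}]])
live_before = len(heap.live)   # every nested allocation is still live
del array                      # drops the last reference to the owner and its pool
gc.collect()                   # not needed on CPython, harmless elsewhere
live_after = len(heap.live)    # everything was released by the pool
```

The owner class never defines its own cleanup code. Releasing the owner releases the pool, and the pool releases every address it recorded, which is the property the real ``Pool`` relies on.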

Custom Allocators
=================

Sometimes external C libraries use private functions to allocate and free objects,
but we'd still like the laziness of the ``Pool``. In that case, wrap the
library's functions and pass them to the ``Pool`` constructor:

.. code:: cython

    from cymem.cymem cimport Pool, WrapMalloc, WrapFree
    cdef Pool mem = Pool(WrapMalloc(priv_malloc), WrapFree(priv_free))
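To make the idea of swapping the allocator pair concrete, here is a pure-Python sketch of a pool whose allocation and release functions are injected at construction time. It models the pattern only and says nothing about how cymem implements ``WrapMalloc`` or ``WrapFree``. The class ``ToyPoolWithAllocator`` is hypothetical, and ``priv_malloc`` and ``priv_free`` here are Python stand-ins that record their calls instead of touching real memory.

```python
import gc


class ToyPoolWithAllocator:
    """Pool model that receives its malloc/free pair as constructor arguments."""

    def __init__(self, malloc, free):
        self._malloc = malloc
        self._free = free
        self.addresses = []   # addresses handed out, in allocation order

    def alloc(self, number, elem_size):
        addr = self._malloc(number * elem_size)
        self.addresses.append(addr)
        return addr

    def __del__(self):
        # Release through the same allocator family that produced the memory.
        for addr in self.addresses:
            self._free(addr)


# Stand-ins for a library's private allocator pair (assumed for illustration).
calls = []
_next_addr = [0x2000]


def priv_malloc(size):
    addr = _next_addr[0]
    _next_addr[0] += size
    calls.append(("malloc", size))
    return addr


def priv_free(addr):
    calls.append(("free", addr))


mem = ToyPoolWithAllocator(priv_malloc, priv_free)
mem.alloc(10, 4)   # 10 elements of 4 bytes
mem.alloc(12, 4)   # 12 elements of 4 bytes
del mem            # the pool releases both blocks via priv_free
gc.collect()       # not needed on CPython, harmless elsewhere
```

After the pool is released, ``calls`` holds two ``malloc`` entries followed by two ``free`` entries, showing that memory obtained from the private allocator is also returned through it.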
