madoka 0.3

Memory-efficient Count-Min Sketch key-value structure (based on Madoka C++ library)

Latest Version: 0.6

madoka

Madoka is an implementation of a Count-Min sketch data structure for summarizing data streams.

String-int pairs in a Madoka-Sketch may take less memory than in a standard Python dict.

Based on madoka C++ library.

NOTE: A Madoka-Sketch does not keep an index of its keys, so it cannot list all keys the way a Python dict can with dict.keys().
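
To make the note above concrete, here is a minimal pure-Python sketch of the Count-Min idea. It is not madoka's implementation: the class name ToySketch, the width and depth defaults, and the BLAKE2-based hashing are all assumptions chosen for illustration. The point is that only counters are stored, which is why keys cannot be enumerated.

```python
import hashlib


class ToySketch:
    """Illustrative Count-Min sketch (hypothetical, not madoka's code)."""

    def __init__(self, width=1024, depth=3):
        self.width = width
        self.depth = depth
        # Only counters are stored; the keys themselves are never kept.
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, key):
        # Pick one column per row, using the row number as a hash salt.
        for row in range(self.depth):
            digest = hashlib.blake2b(key.encode("utf-8"), salt=bytes([row])).digest()
            yield row, int.from_bytes(digest[:8], "little") % self.width

    def add(self, key, value):
        for row, col in self._cells(key):
            self.table[row][col] += value

    def get(self, key):
        # Collisions can only inflate a counter, so the minimum is the best estimate.
        return min(self.table[row][col] for row, col in self._cells(key))


sketch = ToySketch()
sketch.add("mami", 6)
sketch.add("mami", 1)
print(sketch.get("mami"))  # 7
```

The table size is fixed up front and does not grow with the number of distinct keys. The trade-off is that estimates may be too high when keys collide.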

Installation

$ pip install madoka

Usage

Create a new sketch

>>> import madoka
>>> sketch = madoka.Sketch()
  • madoka.Sketch(width=0, max_value=0, path=NULL, flags=0, seed=0)
    • madoka.Sketch() calls madoka.Sketch.create(), so you don’t have to explicitly call create()

Increment a key value

>>> sketch.inc('mami')
  • inc(key[, key_length])
    • key_length is determined automatically when it is omitted. For that reason, the parameter order differs from the original madoka C++ library.

Add a value to the current key value

>>> sketch.add('mami', 6)
  • add(key, value[, key_length])
    • key_length is determined automatically when it is omitted. For that reason, the parameter order differs from the original madoka C++ library.

Update a key value

>>> sketch.set('mami', 6)
  • set(key, value[, key_length])
    • set() does nothing when the given value is not greater than the current key value.
    • When the given value is greater than the upper limit, the stored value is capped at that limit.
    • key_length is determined automatically when it is omitted. For that reason, the parameter order differs from the original madoka C++ library.
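
The two rules for set() can be modelled on a single counter as follows. This is only a sketch of the described behaviour: toy_set and its arguments are hypothetical names, and how madoka applies the rule to its internal counters is not shown here.

```python
def toy_set(current, value, max_value):
    """Hypothetical model of the described set() rules for one counter."""
    if value <= current:
        # Not greater than the current value: nothing changes.
        return current
    # Greater than the current value: store it, capped at the upper limit.
    return min(value, max_value)


print(toy_set(6, 3, 15))    # 6  (3 is not greater than 6)
print(toy_set(6, 9, 15))    # 9  (raised to the new value)
print(toy_set(6, 100, 15))  # 15 (saturated at the upper limit)
```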

Get a key value

>>> sketch.get('mami')
  • get(key[, key_length])
    • key_length is determined automatically when it is omitted. For that reason, the parameter order differs from the original madoka C++ library.

Save a sketch to a file

>>> sketch.save('example.madoka')
  • save(filename)

Load a sketch from a file

>>> sketch.load('example.madoka')
  • load(filename)

Clear a sketch

>>> sketch.clear()
  • clear()
    • Deletes all key-value pairs. Unlike create(), it keeps the sketch's current settings.

Reinitialize a sketch with new settings

>>> sketch.create()
  • create(width=0, max_value=0, path=NULL, flags=0, seed=0)

Copy a sketch

>>> sketch.copy(other_sketch)
  • copy(Sketch)

Merge two sketches

>>> sketch.merge(other_sketch)
  • merge(Sketch)
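
In a textbook Count-Min sketch, two sketches that share the same width, depth and hash functions can be merged by adding their counters cell by cell. The pure-Python sketch below illustrates that idea only. The helper names, the CRC32-based hashing and the table sizes are assumptions, and whether madoka's merge() works exactly this way is not confirmed here.

```python
import zlib

WIDTH, DEPTH = 64, 3  # hypothetical table shape


def cells(key):
    # Hypothetical hashing: CRC32 over a row-prefixed key picks one column per row.
    return [(row, zlib.crc32(f"{row}:{key}".encode()) % WIDTH) for row in range(DEPTH)]


def new_table():
    return [[0] * WIDTH for _ in range(DEPTH)]


def add(table, key, value):
    for row, col in cells(key):
        table[row][col] += value


def get(table, key):
    return min(table[row][col] for row, col in cells(key))


def merge(dst, src):
    # Cell-by-cell addition; valid only when both tables share shape and hashing.
    for row in range(DEPTH):
        for col in range(WIDTH):
            dst[row][col] += src[row][col]


a, b = new_table(), new_table()
add(a, "mami", 6)
add(b, "mami", 4)
add(b, "homura", 2)
merge(a, b)
print(get(a, "mami"))  # at least 6 + 4 = 10; collisions can only raise it
```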

Get inner product of two sketches

>>> sketch.inner_product(other_sketch)
  • inner_product(Sketch)
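
For intuition, the usual Count-Min estimate of an inner product takes the dot product of matching rows in the two tables and keeps the smallest result. The following pure-Python sketch shows that textbook estimator. The names, the hashing scheme and the table shape are assumptions, and madoka's own inner_product() may compute or return its result differently.

```python
import zlib

WIDTH, DEPTH = 64, 3  # hypothetical table shape


def build(counts):
    # Build a toy Count-Min table from a {key: count} mapping.
    table = [[0] * WIDTH for _ in range(DEPTH)]
    for key, value in counts.items():
        for row in range(DEPTH):
            col = zlib.crc32(f"{row}:{key}".encode()) % WIDTH
            table[row][col] += value
    return table


def inner_product(a, b):
    # Dot product of matching rows; the smallest row result is the estimate.
    return min(sum(x * y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))


left = build({"mami": 3, "homura": 2})
right = build({"mami": 4})
estimate = inner_product(left, right)
print(estimate)  # at least 3 * 4 = 12; collisions can only raise it
```

The exact inner product of the two key-count maps is 12 because only 'mami' appears in both. The estimate never falls below that value.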

TODO

  • Filter function that behaves the same as in the original madoka C++ library

Contributions are welcome!

License

  • Wrapper code is licensed under New BSD License.
  • Bundled madoka C++ library is licensed under the Simplified BSD License.

CHANGES

0.3 (2014-03-14)

Key length is determined automatically when it is not given. Removed the filter function. Slightly reduced memory usage.

0.2 (2013-10-12)

Simplified the step of creating a new sketch.

0.1 (2013-10-11)

Initial release.

 
File               Type    Py Version  Uploaded on  Size
madoka-0.3.tar.gz  Source              2014-03-14   59KB