Python bindings for the sophia database.

Sonya

sonya: fast Python bindings for the Sophia embedded database, v2.2.

About sonya:

  • Written in Cython for speed and low overhead

  • Clean, memorable APIs

  • Extensive support for Sophia’s features

  • Python 2 and Python 3 support

  • No 3rd-party dependencies besides Cython (for python>3)

About Sophia:

  • Document store

  • ACID transactions

  • MVCC, optimistic, non-blocking concurrency with multiple readers and writers.

  • Multiple databases per environment

  • Multiple- and single-statement transactions across databases

  • Prefix searches

  • Automatic garbage collection

  • Hot backup

  • Compression

  • Multi-threaded compaction

  • mmap support, direct I/O support

  • APIs for a variety of statistics on storage engine internals

  • BSD licensed

Some ideas of where Sophia might be a good fit:

  • Running on application servers, low-latency / high-throughput

  • Time-series

  • Analytics / Events / Logging

  • Full-text search

  • Secondary-index for external data-store

Limitations:

  • Not tested on Windows.

If you encounter any bugs in the library, please open an issue, including a description of the bug and any related traceback.

Installation

The Sophia sources are bundled with the sonya source code, so the only thing you need to install is Cython. You can install from GitHub or from PyPI.

Pip instructions:

$ pip install Cython   # Optional
$ pip install sonya

Or to install the latest code from master:

$ pip install Cython   # Required
$ pip install git+https://github.com/mosquito/sonya#egg=sonya

Git instructions:

$ pip install Cython
$ git clone https://github.com/mosquito/sonya
$ cd sonya
$ python setup.py build
$ python setup.py install

To run the tests:

$ pip install pytest
$ pytest tests

Overview

In addition to normal dictionary operations, Sonya lets you read slices of data, which are returned efficiently using cursors. Similarly, bulk writes using update() use an efficient, atomic batch operation.

Despite the simple APIs, Sophia has quite a few advanced features. There is too much to cover everything in this document, so be sure to check out the official Sophia storage engine documentation.

The next section will show how to perform common actions with sonya.

Using Sonya

Let’s begin by importing sonya and creating an environment. The environment can host multiple databases, each of which may have a different schema. In this example our database will store Python objects as the key and value. Finally, we’ll open the environment so we can start storing and retrieving data.

from sonya import Environment, fields, Schema


class DictSchema(Schema):
    key = fields.PickleField(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-database', DictSchema(), compression='zstd')
env.open()

document = db.document(key='foo', value=[1, 2, 3, 'bar'])

# Insert a document
db.set(document)

print(db.get(key='foo'))
# {'value': [1, 2, 3, 'bar'], 'key': 'foo'}

CRUD operations

Sonya databases support the basic create, read, update, and delete operations:

from sonya import Environment, fields, Schema


class DictSchema(Schema):
    key = fields.PickleField(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-database', DictSchema(), compression='zstd')
env.open()

document = db.document(key='foo', value=[1, 2, 3, 'bar'])

# Create a document
db.set(document)

# Read document
document = db.get(key='foo')

# Update the document
document = db.document(key='foo', value=None)
db.set(document)

# Delete the document
db.delete(key='foo')

# Iterate through the database
for document in db.cursor():
    print(document)

# Delete multiple documents
# (the fastest way to remove many documents from the database)
db.delete_many(order='>=')

Fetching ranges (Cursors)

Because Sophia is an ordered data-store, performing ordered range scans is efficient. To retrieve a range of key-value pairs with Sonya, use a dictionary lookup with a slice instead of a single key.

For finer-grained control over iteration, or to do prefix-matching, Sonya provides a cursor interface.

The cursor() method accepts a special keyword parameter, order, along with any of the key fields:

  • order (default=`>=`) – semantics for matching the start key and ordering results.
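The effect of order can be modelled in plain Python. The sketch below is not sonya's implementation: it uses a sorted list and the bisect module to stand in for the ordered store, and the helper name scan is hypothetical. Only `>=` appears in this document; the other three values (`>`, `<=`, `<`) are taken from Sophia's cursor documentation and are assumed to be passed through by the bindings.

```python
from bisect import bisect_left, bisect_right


def scan(sorted_keys, order='>=', key=None):
    """Yield keys from an ascending list the way an ordered cursor would."""
    if order in ('>=', '>'):
        # Forward scan: start at (or just after) the start key.
        if key is None:
            start = 0
        elif order == '>=':
            start = bisect_left(sorted_keys, key)
        else:
            start = bisect_right(sorted_keys, key)
        yield from sorted_keys[start:]
    elif order in ('<=', '<'):
        # Reverse scan: start at (or just before) the start key, descending.
        if key is None:
            stop = len(sorted_keys)
        elif order == '<=':
            stop = bisect_right(sorted_keys, key)
        else:
            stop = bisect_left(sorted_keys, key)
        yield from reversed(sorted_keys[:stop])
    else:
        raise ValueError('unknown order: %r' % (order,))


keys = list(range(10000))
print(list(scan(keys, order='>=', key=9995)))
# [9995, 9996, 9997, 9998, 9999]
print(list(scan(keys, order='<', key=3)))
# [2, 1, 0]
```

The point of the model is that the start key only positions the scan; every key on the chosen side of it is returned, in that direction.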

from sonya import Environment, fields, Schema


class IntSchema(Schema):
    key = fields.UInt32Field(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-integer-db', IntSchema(), compression='zstd')
env.open()


with db.transaction() as tx:
    for i in range(10000):
        tx.set(db.document(key=i, value=None))

# Iterate through the database
for document in db.cursor(order='>=', key=9995):
    print(document)

# {'key': 9995, 'value': None}
# {'key': 9996, 'value': None}
# {'key': 9997, 'value': None}
# {'key': 9998, 'value': None}
# {'key': 9999, 'value': None}

For prefix search use a part of the key and order:

from sonya import Environment, fields, Schema


class StringSchema(Schema):
    key = fields.StringField(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-string-db', StringSchema(), compression='zstd')
env.open()


with db.transaction() as tx:
    for i in range(10000):
        tx.set(db.document(key=str(i), value=None))

# Iterate through the database
for document in db.cursor(order='>=', key='999'):
    print(document)

# {'value': None, 'key': '999'}
# {'value': None, 'key': '9990'}
# {'value': None, 'key': '9991'}
# {'value': None, 'key': '9992'}
# {'value': None, 'key': '9993'}
# {'value': None, 'key': '9994'}
# {'value': None, 'key': '9995'}
# {'value': None, 'key': '9996'}
# {'value': None, 'key': '9997'}
# {'value': None, 'key': '9998'}
# {'value': None, 'key': '9999'}
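In the example above every key that sorts at or after '999' also starts with '999', because '9999' is the largest key in the database. With other data, a `>=` scan could run past the prefix. The sketch below shows one way to stop at the end of the prefix on the Python side. It assumes the cursor applies only ordering (the bindings may well filter on the prefix themselves) and that documents can be indexed like dictionaries, as the printed output suggests. The helper with_prefix and the sample data are hypothetical.

```python
from itertools import takewhile


def with_prefix(documents, prefix, field='key'):
    """Stop an ordered scan at the first document whose key lacks the prefix."""
    return takewhile(lambda doc: doc[field].startswith(prefix), documents)


# Stand-in for db.cursor(order='>=', key='ab'): documents in key order,
# starting at the first key >= 'ab'.
ordered_docs = [{'key': k, 'value': None} for k in ('ab', 'abc', 'abd', 'ac', 'b')]

print([doc['key'] for doc in with_prefix(ordered_docs, 'ab')])
# ['ab', 'abc', 'abd']
```

Because the keys arrive in order, the scan can stop at the first mismatch instead of filtering the whole remainder.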

Deleting multiple documents

Sonya provides a delete_many method. It is the fastest option when you want to remove multiple documents from the database. The method has a cursor-like interface, and the whole operation is processed in a single transaction.

The method returns the number of affected rows.

from sonya import Environment, fields, Schema


class IntSchema(Schema):
    key = fields.UInt32Field(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-integer-db', IntSchema(), compression='zstd')
env.open()


with db.transaction() as tx:
    for i in range(10000):
        tx.set(db.document(key=i, value=None))

# returns the number of affected rows
db.delete_many(order='>=', key=9995)

Document count

The Database object has a __len__ method. Avoid using it on large databases: it iterates over and counts the documents on every call (faster than len(list(db.cursor())), but still O(n)).
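The cost difference mentioned above comes from materialising a list just to measure it. As a small illustration, the hypothetical helper below counts any iterable in a single pass without storing the items; a generator stands in for db.cursor() so the snippet runs without a database.

```python
def count_documents(documents):
    """Count items from any iterable without building an intermediate list."""
    return sum(1 for _ in documents)


# Stand-in for db.cursor(); a generator has no len() of its own.
fake_cursor = ({'key': i, 'value': None} for i in range(10000))
print(count_documents(fake_cursor))
# 10000
```

The same helper could presumably be applied to a bounded cursor to count only a range of keys; it is still linear in the number of documents visited.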

from sonya import Environment, fields, Schema


class IntSchema(Schema):
    key = fields.UInt32Field(index=0)
    value = fields.PickleField()


env = Environment('/tmp/test-env')
db = env.database('test-integer-db', IntSchema(), compression='zstd')
env.open()


with db.transaction() as tx:
    for i in range(10000):
        tx.set(db.document(key=i, value=None))

print(len(db))
# 10000

Transactions

Sophia supports ACID transactions. Even better, a single transaction can cover operations to multiple databases in a given environment.

Example usage:

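from sonya import Environment, fields, Schema


class Users(Schema):
    name = fields.StringField(index=0)
    surname = fields.StringField(index=1)
    age = fields.UInt8Field()


env = Environment('/tmp/test-env')
users = env.database('test-users', Users(), compression='zstd')
env.open()


with users.transaction() as tx:
    tx.set(users.document(name='Jane', surname='Doe', age=19))
    tx.set(users.document(name='John', surname='Doe', age=18))

    # Raises LookupError (the transaction is not committed yet)
    users.get(name='John', surname='Doe')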

Multiple transactions are allowed to be open at the same time, but if there are conflicting changes, an exception will be thrown when attempting to commit the offending transaction.
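The commit-time conflict check can be pictured with a toy model. The sketch below is not Sophia's algorithm and does not use sonya at all: ToyStore, ToyTransaction, and ConflictError are hypothetical names, and the exception type sonya actually raises is not stated in this document. It only shows the general optimistic pattern, where writes are buffered and validated against per-key versions when the transaction commits.

```python
class ConflictError(Exception):
    """Raised when a transaction loses an optimistic-concurrency race."""


class ToyStore:
    def __init__(self):
        self.data = {}
        self.versions = {}  # key -> number of committed writes

    def transaction(self):
        return ToyTransaction(self)


class ToyTransaction:
    def __init__(self, store):
        self.store = store
        self.writes = {}
        self.seen = {}  # key -> version observed when first written

    def set(self, key, value):
        self.seen.setdefault(key, self.store.versions.get(key, 0))
        self.writes[key] = value

    def commit(self):
        # Validate first: fail if anyone committed to our keys meanwhile.
        for key, version in self.seen.items():
            if self.store.versions.get(key, 0) != version:
                raise ConflictError(key)
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.versions[key] = self.store.versions.get(key, 0) + 1


store = ToyStore()
tx1 = store.transaction()
tx2 = store.transaction()
tx1.set('foo', 1)
tx2.set('foo', 2)
tx1.commit()          # first committer wins
try:
    tx2.commit()      # conflicting change: rejected at commit time
except ConflictError as exc:
    print('conflict on', exc)
# conflict on foo
```

Neither transaction blocks the other while it is open; the loser only finds out at commit, which matches the behaviour described above.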

Configuring and Administering Sophia

Sophia can be configured using special properties on the Environment and Database objects. Refer to the Sophia configuration documentation for details on the available options, including whether they are read-only and the expected data type.

For example, to query Sophia’s status, read the sophia.status setting, which is read-only and returns a string:

>>> print(env['sophia.status'])
"online"

Other properties can be changed by assigning a new value to the property. For example, to read and then increase the number of threads used by the scheduler:

>>> env['scheduler.threads'] = env['scheduler.threads'] + 2
>>> env.open()
>>> print(env['scheduler.threads'])
8
>>> print(dict(env))
{'db.test-string-db.stat.cursor_latency': '0 0 0.0', ...}

Refer to the documentation for complete lists of settings. Dotted-paths are translated into underscore-separated attributes.
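Since dict(env) returns a flat mapping keyed by dotted paths, it can be narrowed to one subsystem with ordinary dictionary tools. The helper below is a hypothetical convenience, not part of sonya, and it runs against a plain dict standing in for dict(env); the sample entries reuse the keys shown above with illustrative values.

```python
def settings_with_prefix(settings, prefix):
    """Return the entries of a mapping whose dotted key starts with prefix."""
    return {k: v for k, v in settings.items() if k.startswith(prefix)}


# Stand-in for dict(env); the values are illustrative only.
snapshot = {
    'sophia.status': 'online',
    'scheduler.threads': 8,
    'db.test-string-db.stat.cursor_latency': '0 0 0.0',
}
print(settings_with_prefix(snapshot, 'db.test-string-db.'))
# {'db.test-string-db.stat.cursor_latency': '0 0 0.0'}
```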
