A versioning solution for relational data models

**************
CleanerVersion
**************

Abstract
========

CleanerVersion lets you read and write multiple versions of an entry to and from your relational database. It keeps
track of modifications to an object over time, as described by the theory of
**Slowly Changing Dimensions** (SCD) **- Type 2**.

CleanerVersion therefore enables a Django-based data warehouse, which was the initial idea behind this package.


Features
========

CleanerVersion's feature set includes:

* Simple versioning of an object (according to SCD, Type 2)

  - Retrieval of the current version of the object
  - Retrieval of an object's state at any point in time

* Versioning of One-to-Many relationships

  - For any point in time, retrieval of correct related objects

* Versioning of Many-to-Many relationships

  - For any point in time, retrieval of correct related objects


Prerequisites
=============

This code was tested with the following components:

* Python 2.7
* Django 1.6
* PostgreSQL 9.3.4


Quick Start
===========

Once you have your Django project in place, and the CleanerVersion app installed and registered, create your models as
follows.

A simple versionable model
--------------------------

First, import all the necessary modules. In this example, all the imports are done at the beginning, so that the
snippets form a working example when placed in the same source file. Here's how::

    from datetime import datetime
    from django.db.models.fields import CharField
    from django.utils.timezone import utc
    from versions.models import Versionable

    class Person(Versionable):
        name = CharField(max_length=200)
        address = CharField(max_length=200)
        phone = CharField(max_length=200)

Assuming you are familiar with :mod:`Django Models <django.db.models>`, the next step is to use your model to create
some entries. You need to sync your DB before the code becomes usable; if you are only running tests, Django does
that step for you::

    p = Person.objects.create(name='Donald Fauntleroy Duck', address='Duckburg', phone='123456')
    t1 = datetime.utcnow().replace(tzinfo=utc)

    # Important! clone() returns the new current version; keep working with the returned object.
    p = p.clone()
    p.address = 'Entenhausen'
    p.save()
    t2 = datetime.utcnow().replace(tzinfo=utc)

    p = p.clone()
    p.phone = '987654'
    p.save()
    t3 = datetime.utcnow().replace(tzinfo=utc)

Now, let's query the entries::

    donald_current = Person.objects.as_of().get(name__startswith='Donald')  # Get the current entry
    print(str(donald_current.address))  # Prints 'Entenhausen'
    print(str(donald_current.phone))  # Prints '987654'

    donald_t1 = Person.objects.as_of(t1).get(name__startswith='Donald')  # Get a historic entry
    print(str(donald_t1.address))  # Prints 'Duckburg'
    print(str(donald_t1.phone))  # Prints '123456'
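
The clone-then-query cycle above follows the SCD Type 2 pattern: each change closes the current row and opens a new
one, and a point-in-time query picks the row whose validity interval contains the requested timestamp. The following
is a minimal plain-Python sketch of that idea, not CleanerVersion's actual implementation or schema. The names
``PersonVersion``, ``clone`` and ``as_of`` are hypothetical stand-ins chosen to mirror the example, and the dates are
made up for illustration.

```python
from datetime import datetime, timedelta


class PersonVersion(object):
    """One row of a hypothetical SCD Type 2 table (not CleanerVersion's real schema)."""

    def __init__(self, identity, start, **fields):
        self.identity = identity   # stays the same across all versions of one object
        self.start = start         # when this version became valid
        self.end = None            # None means "still current"
        self.fields = dict(fields)


def clone(history, when, **changes):
    """Close the current version at `when` and append a successor with `changes` applied."""
    current = history[-1]
    current.end = when
    new_fields = dict(current.fields)
    new_fields.update(changes)
    successor = PersonVersion(current.identity, when, **new_fields)
    history.append(successor)
    return successor


def as_of(history, when):
    """Return the version valid at `when` (start inclusive, end exclusive), or None."""
    for version in history:
        if version.start <= when and (version.end is None or when < version.end):
            return version
    return None


# Illustrative timeline; the dates are assumptions, not taken from the original example.
t0 = datetime(2014, 1, 1)
history = [PersonVersion(1, t0, name='Donald Fauntleroy Duck',
                         address='Duckburg', phone='123456')]
t1 = t0 + timedelta(days=1)
clone(history, t0 + timedelta(days=2), address='Entenhausen')
t2 = t0 + timedelta(days=3)
clone(history, t0 + timedelta(days=4), phone='987654')
t3 = t0 + timedelta(days=5)

print(as_of(history, t1).fields['address'])  # Duckburg
print(as_of(history, t3).fields['phone'])    # 987654
```

Treating the end of the interval as exclusive means that a timestamp falling exactly on a change belongs to the new
version, so no two versions of the same object are valid at the same instant.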

A related versionable model
---------------------------

Here comes the less simple approach. We are going to set up both a Many-to-One and a Many-to-Many relationship.
Keep in mind that this is just an example; we focus on the relationship part rather than on the semantic
correctness of the entries' fields::

    from datetime import datetime
    from django.db.models.fields import CharField
    from django.utils.timezone import utc
    from versions.models import Versionable, VersionedManyToManyField, VersionedForeignKey

    class Discipline(Versionable):
        """A sports discipline"""
        name = CharField(max_length=200)
        rules = CharField(max_length=200)

    class SportsClub(Versionable):
        """Sort of an association for practicing sports"""
        name = CharField(max_length=200)
        practice_periodicity = CharField(max_length=200)
        discipline = VersionedForeignKey('Discipline')

    class Person(Versionable):
        name = CharField(max_length=200)
        phone = CharField(max_length=200)
        sportsclubs = VersionedManyToManyField('SportsClub', related_name='members')

Next, load some demo data::

    running = Discipline.objects.create(name='Running', rules='There are none (almost)')
    icehockey = Discipline.objects.create(name='Ice Hockey', rules='There\'s a ton of them')

    stb = SportsClub.objects.create(name='STB', practice_periodicity='tuesday and thursday night', discipline=running)
    hcfg = SportsClub.objects.create(name='HCFG', practice_periodicity='monday, wednesday and friday night', discipline=icehockey)

    peter = Person.objects.create(name='Peter', phone='123456')
    mary = Person.objects.create(name='Mary', phone='987654')

    # Bringing things together
    # Peter wants to run
    peter.sportsclubs.add(stb)

    t1 = datetime.utcnow().replace(tzinfo=utc)

    # Peter later joins HCFG for ice hockey
    hcfg.members.add(peter)

    # Mary joins STB for running
    stb.members.add(mary)

    t2 = datetime.utcnow().replace(tzinfo=utc)

    # HCFG changes the practice times
    hcfg = hcfg.clone()
    hcfg.practice_periodicity = 'monday, wednesday and thursday'
    hcfg.save()

    # Too bad, the new practice times don't work out for Peter anymore, so he leaves HCFG
    hcfg.members.remove(peter)
    t3 = datetime.utcnow().replace(tzinfo=utc)

Let's continue with the queries to check whether the whole story can be reconstructed::

    ### Querying for timestamp t1
    sportsclub = SportsClub.objects.as_of(t1).get(name='HCFG')
    print("Number of " + sportsclub.name + " (" + sportsclub.discipline.name + ") members: " + str(sportsclub.members.count()))
    for member in list(sportsclub.members.all()):
        print("- " + str(member.name))  # prints nothing: HCFG has no members at t1

    sportsclub = SportsClub.objects.as_of(t1).get(name='STB')
    print("Number of " + sportsclub.name + " (" + sportsclub.discipline.name + ") members: " + str(sportsclub.members.count()))
    for member in list(sportsclub.members.all()):
        print("- " + str(member.name))  # prints "- Peter"


    ### Querying for timestamp t2
    sportsclub = SportsClub.objects.as_of(t2).get(name='HCFG')
    print("Number of " + sportsclub.name + " (" + sportsclub.discipline.name + ") members: " + str(sportsclub.members.count()))
    for member in list(sportsclub.members.all()):
        print("- " + str(member.name))  # prints "- Peter"


    ### Querying for timestamp t3
    sportsclub = SportsClub.objects.as_of(t3).get(name='HCFG')
    print("Number of " + sportsclub.name + " (" + sportsclub.discipline.name + ") members: " + str(sportsclub.members.count()))
    for member in list(sportsclub.members.all()):
        print("- " + str(member.name))  # prints nothing: Peter has left HCFG by t3

    sportsclub = SportsClub.objects.as_of(t3).get(name='STB')
    print("Number of " + sportsclub.name + " (" + sportsclub.discipline.name + ") members: " + str(sportsclub.members.count()))
    for member in list(sportsclub.members.all()):
        print("- " + str(member.name))  # prints "- Peter" and "- Mary", one per line

Pretty easy, isn't it? ;)
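
One way to picture why the membership queries above return different results for t1, t2 and t3 is to treat every
link between a person and a club as a row with its own validity interval. The sketch below is a plain-Python
illustration under that assumption, not CleanerVersion's actual storage model. ``Membership``, ``add_member``,
``remove_member`` and ``members_as_of`` are hypothetical names, integer clock ticks stand in for real timestamps,
and the cloning of HCFG itself is left out for brevity.

```python
class Membership(object):
    """One row of a hypothetical versioned link table between a person and a club."""

    def __init__(self, person, club, start):
        self.person = person
        self.club = club
        self.start = start   # tick at which the membership began
        self.end = None      # None means the membership is still active


def add_member(links, person, club, when):
    """Open a new membership interval starting at `when`."""
    links.append(Membership(person, club, when))


def remove_member(links, person, club, when):
    """Close the active membership interval instead of deleting the row."""
    for link in links:
        if link.person == person and link.club == club and link.end is None:
            link.end = when


def members_as_of(links, club, when):
    """Return the sorted names of everyone who was a member of `club` at `when`."""
    return sorted(
        link.person for link in links
        if link.club == club
        and link.start <= when
        and (link.end is None or when < link.end)
    )


links = []
add_member(links, 'Peter', 'STB', 1)      # Peter wants to run
t1 = 2
add_member(links, 'Peter', 'HCFG', 3)     # Peter later joins HCFG
add_member(links, 'Mary', 'STB', 4)       # Mary joins STB
t2 = 5
remove_member(links, 'Peter', 'HCFG', 6)  # Peter leaves HCFG
t3 = 7

print(members_as_of(links, 'HCFG', t1))  # []
print(members_as_of(links, 'STB', t1))   # ['Peter']
print(members_as_of(links, 'HCFG', t2))  # ['Peter']
print(members_as_of(links, 'HCFG', t3))  # []
print(members_as_of(links, 'STB', t3))   # ['Mary', 'Peter']
```

Because removing a member only closes the interval, the row survives and earlier points in time can still be
reconstructed.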

Feature requests
================

- Querying for time ranges
