A django add-on that allows models to be decorated with information about which fields contain sensitive information, and an associated management command that creates a script to remove that information.

INSTALL

$ pip install django-scrub-pii

USAGE

Add scrubpii to INSTALLED_APPS in your settings file:

INSTALLED_APPS = (
    ...,
    ...,
    ...,
    'scrubpii',
)

Sensitive fields are marked by adding a sensitive_fields attribute, holding the names of the fields, to the model’s Meta class. Because Django only accepts a fixed set of Meta options, it needs to be patched to allow the new one. To keep the patch isolated, and to warn if compatibility problems arise in future, the model is defined within a context manager:

from django.db import models
from scrubpii import allow_sensitive_fields

with allow_sensitive_fields():
    class Person(models.Model):
        first_name = models.CharField(max_length=30)
        last_name = models.CharField(max_length=30)
        date_of_birth = models.DateField()
        email = models.EmailField()

        def __unicode__(self):
            return "{0} {1}".format(self.first_name, self.last_name)

        class Meta:
            sensitive_fields = {'last_name', 'first_name', 'email', 'date_of_birth'}

The easiest way to do this is to move the sensitive models into a separate file and import them inside the context manager, like so:

from django.db import models
from scrubpii import allow_sensitive_fields

with allow_sensitive_fields():
    from .sensitive_models import *

where sensitive_models.py is:

from django.db import models

__all__ = ['Person']

class Person(models.Model):
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)
    date_of_birth = models.DateField()
    email = models.EmailField()

    def __unicode__(self):
        return "{0} {1}".format(self.first_name, self.last_name)

    class Meta:
        sensitive_fields = {'last_name', 'first_name', 'email', 'date_of_birth'}
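
For intuition about why the context manager is needed: Django keeps the option names it accepts on a Meta class in django.db.models.options.DEFAULT_NAMES and rejects anything else. A context manager of this kind can work by temporarily extending that tuple and restoring it afterwards. The sketch below is not necessarily how scrubpii implements allow_sensitive_fields. It uses a stand-in namespace (fake_options) instead of Django's module so that it runs without Django, and the name allow_meta_option is hypothetical.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for the module that holds Django's permitted Meta option names.
# (Assumption for illustration only; this is not Django itself.)
fake_options = SimpleNamespace(DEFAULT_NAMES=("verbose_name", "ordering"))


@contextmanager
def allow_meta_option(options, name):
    """Temporarily permit one extra Meta option name, then restore the original."""
    original = options.DEFAULT_NAMES
    options.DEFAULT_NAMES = original + (name,)
    try:
        yield
    finally:
        # Always restore, even if the code inside the block raised.
        options.DEFAULT_NAMES = original


with allow_meta_option(fake_options, "sensitive_fields"):
    inside = fake_options.DEFAULT_NAMES  # extra name is accepted here

after = fake_options.DEFAULT_NAMES  # original tuple is back
```

Restoring the tuple in a finally clause keeps the patch scoped to the models defined inside the with block, which matches the isolation goal described above.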

If you need to mark fields on third-party models as sensitive, you can do so in settings.py:

SCRUB_PII_ADDITIONAL_FIELDS = {'auth.User': {'email',
                                             'first_name',
                                             'last_name',
                                             'password',
                                             'username',
                                             },
                               'testapp.Book': {'title', },
                               'testapp.Example': {'foo', }
                              }
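
The keys in the example above appear to follow an app_label.ModelName pattern. As a minimal sketch under that assumption, a setting of this shape could be checked before use with a helper like the one below. The function split_model_labels is hypothetical and not part of scrubpii.

```python
def split_model_labels(additional_fields):
    """Map ('app_label', 'ModelName') pairs to sets of field names.

    Raises ValueError for keys that are not of the form 'app_label.ModelName'.
    """
    result = {}
    for label, fields in additional_fields.items():
        app_label, sep, model_name = label.partition(".")
        if not sep or not app_label or not model_name:
            raise ValueError("expected 'app_label.ModelName', got %r" % label)
        result[(app_label, model_name)] = set(fields)
    return result


parsed = split_model_labels({
    "auth.User": {"email", "password"},
    "testapp.Book": {"title"},
})
```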

Once the sensitive fields are defined, a management command will generate SQL statements to anonymize a database. This app does not anonymize the database directly, to avoid the risk of damaging live data.

The script can be generated by running the management command:

$ python manage.py get_sensitive_data_removal_script > scrub.sql
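
To give a rough idea of what generating such a script involves, the sketch below builds one UPDATE statement per table from a mapping of table names to sensitive column names. It is an illustration only: the statements the real command emits depend on the field type and database backend and will differ from this. The function name, the table name testapp_person, and the empty-string placeholder are all assumptions.

```python
def build_scrub_statements(sensitive, placeholder="''"):
    """Return one UPDATE statement per table, blanking the listed columns.

    `sensitive` maps a table name to an iterable of column names.
    `placeholder` is the SQL literal written into every listed column.
    """
    statements = []
    for table in sorted(sensitive):
        assignments = ", ".join(
            '"{0}" = {1}'.format(column, placeholder)
            for column in sorted(sensitive[table])
        )
        statements.append('UPDATE "{0}" SET {1};'.format(table, assignments))
    return statements


# Hypothetical table and columns, for illustration.
script_lines = build_scrub_statements({"testapp_person": {"email", "last_name"}})
```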

The suggested workflow is:

  1. Dump database

  2. Reload dump into a temporary database on a secure server (or copy sqlite.db if sqlite)

  3. Generate anonymisation script

  4. Run anonymisation script against temporary database

  5. Dump temporary database

  6. Delete temporary database

  7. Transmit the anonymised dump to the insecure server

SUPPORTED DATABASES

Currently, only postgresql and sqlite are supported. Patches to add other databases or fields are welcome.

Note that the anonymisation under sqlite is more comprehensive than under postgresql. For example, under sqlite all IP addresses are anonymised to the same value, whereas under postgresql different IPs are anonymised to differing values.

DEVELOP

$ git clone django-scrub-pii
$ cd django-scrub-pii
$ make

RUNNING TESTS

$ tox

Changelog

1.1 (2016-01-29)

  • Allow specification of additional model fields to treat as sensitive using Django settings. [MatthewWilkes]

1.0 (2016-01-29)

  • Initial release with basic support for built-in field types, especially on postgres. Limited sqlite support. [MatthewWilkes]

django-scrub-pii Copyright (c) 2016, Matthew Wilkes
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
   derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
