
Anonymity library for Python


PYTHON LIBRARY FOR ANONYMIZATION

This library supports the application of three classical anonymization techniques for tabular data: k-anonymity, l-diversity and t-closeness.
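
To make the first two notions concrete, here is a minimal sketch that measures them on an already generalized table using only the standard library. The function names `k_anonymity` and `l_diversity` and the toy rows are illustrative assumptions, not part of this library's API. A table is k-anonymous if every combination of quasi-identifier values is shared by at least k records, and (distinct) l-diverse if every such group contains at least l different sensitive values. t-closeness, which compares distributions of the sensitive attribute, is not covered here.

```python
from collections import Counter, defaultdict


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest equivalence class (hypothetical helper)."""
    classes = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(classes.values())


def l_diversity(rows, quasi_identifiers, sensitive):
    """Return the smallest number of distinct sensitive values found in any
    equivalence class, i.e. distinct l-diversity (hypothetical helper)."""
    values = defaultdict(set)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        values[key].add(row[sensitive])
    return min(len(v) for v in values.values())


# Toy, already generalized records (made up for illustration).
rows = [
    {"age": "20-29", "zip": "3202*", "crime": "Theft"},
    {"age": "20-29", "zip": "3202*", "crime": "Traffic"},
    {"age": "20-29", "zip": "3204*", "crime": "Murder"},
    {"age": "20-29", "zip": "3204*", "crime": "Murder"},
]

k = k_anonymity(rows, ["age", "zip"])           # two classes of two records each
l = l_diversity(rows, ["age", "zip"], "crime")  # the second class has one crime value
print(k, l)
```

The table above is 2-anonymous but only 1-diverse: everyone in the `3204*` group shares the same sensitive value, so group membership alone reveals it.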

Installation

We recommend using Python 3 with virtualenv:

> virtualenv .venv -p python3
> source .venv/bin/activate

Then run the following command to install the library and all its requirements:

> pip install python-anonymity

Documentation

The python-anonymity documentation is hosted on Read the Docs.

Getting started

Example using the synthetic crime dataset:

> import pandas as pd
> import pycanon
> from anonymity import tools
> from anonymity.tools.utils_k_anon import utils_k_anonymity as utils
>
> # Toy dataset: one identifier, three quasi-identifiers, one sensitive attribute
> d = {
>     "name": ["Joe", "Jill", "Sue", "Abe", "Bob", "Amy"],
>     "marital stat": [
>         "Separated",
>         "Single",
>         "Widowed",
>         "Separated",
>         "Widowed",
>         "Single",
>     ],
>     "age": [29, 20, 24, 28, 25, 23],
>     "ZIP code": ["32042", "32021", "32024", "32046", "32045", "32027"],
>     "crime": ["Murder", "Theft", "Traffic", "Assault", "Piracy", "Indecency"],
> }
> data = pd.DataFrame(data=d)
>
> ID = ["name"]                              # identifiers
> QI = ["marital stat", "age", "ZIP code"]   # quasi-identifiers
> SA = ["crime"]                             # sensitive attributes
>
> # Numeric attribute: create_ranges builds its hierarchy from these values
> age_hierarchy = {"age": [0, 2, 5, 10]}
>
> # Categorical attributes: each row lists a value followed by its generalizations
> hierarchy = {
>     "marital stat": [
>         ["Single", "Not married", "*"],
>         ["Separated", "Not married", "*"],
>         ["Divorce", "Not married", "*"],
>         ["Widowed", "Not married", "*"],
>         ["Married", "Married", "*"],
>         ["Re-married", "Married", "*"],
>     ],
>     "ZIP code": [
>         ["32042", "3204*", "*"],
>         ["32021", "3202*", "*"],
>         ["32024", "3202*", "*"],
>         ["32046", "3204*", "*"],
>         ["32045", "3204*", "*"],
>         ["32027", "3202*", "*"],
>     ],
> }
>
> # Merge the categorical hierarchies with the generated age ranges
> mix_hierarchy = dict(hierarchy, **utils.create_ranges(data, age_hierarchy))
>
> k = 2               # target k for k-anonymity
> supp_threshold = 0  # suppression threshold
> new_data = tools.data_fly(data, ID, QI, k, supp_threshold, mix_hierarchy)
> 
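
To illustrate what the hierarchies in the example encode, the sketch below applies one generalization level by hand and measures the resulting k using only the standard library. It is not the library's algorithm and does not reproduce the output of `tools.data_fly`; the helpers `generalize` and `smallest_class` are hypothetical names introduced for this illustration, and the age column is left out to keep it short.

```python
from collections import Counter

# Hierarchies in the same shape as the example: [original, level 1, level 2].
hierarchy = {
    "marital stat": [
        ["Single", "Not married", "*"],
        ["Separated", "Not married", "*"],
        ["Widowed", "Not married", "*"],
    ],
    "ZIP code": [
        ["32042", "3204*", "*"],
        ["32021", "3202*", "*"],
        ["32024", "3202*", "*"],
        ["32046", "3204*", "*"],
        ["32045", "3204*", "*"],
        ["32027", "3202*", "*"],
    ],
}

# The marital status and ZIP code columns of the example dataset.
marital = ["Separated", "Single", "Widowed", "Separated", "Widowed", "Single"]
zips = ["32042", "32021", "32024", "32046", "32045", "32027"]
rows = [{"marital stat": m, "ZIP code": z} for m, z in zip(marital, zips)]


def generalize(rows, hierarchy, levels):
    """Replace each value with its ancestor at the requested level
    (hypothetical helper; level 0 keeps the original value)."""
    lookup = {
        col: {path[0]: path for path in paths} for col, paths in hierarchy.items()
    }
    return [
        {
            col: lookup[col][value][levels.get(col, 0)] if col in lookup else value
            for col, value in row.items()
        }
        for row in rows
    ]


def smallest_class(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing all quasi-identifier values."""
    classes = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(classes.values())


qi = ["marital stat", "ZIP code"]
k_before = smallest_class(rows, qi)
generalized = generalize(rows, hierarchy, {"marital stat": 1, "ZIP code": 1})
k_after = smallest_class(generalized, qi)
print(k_before, k_after)
```

Before generalization every record is unique on these two columns. After one level, the records fall into two groups of three (`3204*` and `3202*`, all "Not married"), which satisfies the k = 2 target used in the example.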

License: Apache 2.0.

Note: the library is under heavy development and is intended for testing purposes only.

