Anonymity library for Python
Project description
PYTHON LIBRARY FOR ANONYMIZATION
This library supports the application of three classical anonymization techniques for tabular data: k-anonymity, l-diversity and t-closeness.
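To make the three properties concrete, here is a minimal sketch in plain pandas that measures each one on a toy table. It is not the library's API: the function names, the toy data, and the use of total variation distance for t-closeness (a common choice for categorical attributes, where it coincides with the Earth Mover's Distance under equal ground distance) are assumptions made for illustration.

```python
import pandas as pd


def k_anonymity_level(df, quasi_identifiers):
    """Size of the smallest equivalence class (rows sharing the same QI values)."""
    return int(df.groupby(quasi_identifiers).size().min())


def l_diversity_level(df, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values found in any equivalence class."""
    return int(df.groupby(quasi_identifiers)[sensitive].nunique().min())


def t_closeness_level(df, quasi_identifiers, sensitive):
    """Largest total variation distance between the sensitive-value distribution
    of an equivalence class and the distribution over the whole table."""
    overall = df[sensitive].value_counts(normalize=True)
    worst = 0.0
    for _, group in df.groupby(quasi_identifiers):
        local = group[sensitive].value_counts(normalize=True)
        # Values absent from this class get probability 0.
        local = local.reindex(overall.index, fill_value=0.0)
        worst = max(worst, 0.5 * float((local - overall).abs().sum()))
    return worst


# Hypothetical, already-generalized toy table.
toy = pd.DataFrame(
    {
        "age": ["20-29", "20-29", "30-39", "30-39"],
        "zip": ["3204*", "3204*", "3202*", "3202*"],
        "crime": ["Theft", "Fraud", "Theft", "Theft"],
    }
)

k = k_anonymity_level(toy, ["age", "zip"])
l = l_diversity_level(toy, ["age", "zip"], "crime")
t = t_closeness_level(toy, ["age", "zip"], "crime")
print(k, l, t)
```

A table is k-anonymous when the first value is at least k, l-diverse (in the distinct-values sense) when the second is at least l, and t-close when the third is at most t.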
Installation
We recommend using Python 3 with virtualenv:
> virtualenv .venv -p python3
> source .venv/bin/activate
Then run the following command to install the library and all its requirements:
pip install python-anonymity
Documentation
The python-anonymity documentation is hosted on Read the Docs.
Getting started
Example using a synthetic crime dataset:
> import pandas as pd
> import pycanon
> from anonymity import tools
> from anonymity.tools.utils_k_anon import utils_k_anonymity as utils
>
> # Synthetic records: one row per person
> d = {
>     "name": ["Joe", "Jill", "Sue", "Abe", "Bob", "Amy"],
>     "marital stat": [
>         "Separated",
>         "Single",
>         "Widowed",
>         "Separated",
>         "Widowed",
>         "Single",
>     ],
>     "age": [29, 20, 24, 28, 25, 23],
>     "ZIP code": ["32042", "32021", "32024", "32046", "32045", "32027"],
>     "crime": ["Murder", "Theft", "Traffic", "Assault", "Piracy", "Indecency"],
> }
> data = pd.DataFrame(data=d)
>
> # Identifiers, quasi-identifiers and sensitive attributes
> ID = ["name"]
> QI = ["marital stat", "age", "ZIP code"]
> SA = ["crime"]
>
> # Generalization hierarchies: numeric ranges for age, explicit levels for the rest
> age_hierarchy = {"age": [0, 2, 5, 10]}
> hierarchy = {
>     "marital stat": [
>         ["Single", "Not married", "*"],
>         ["Separated", "Not married", "*"],
>         ["Divorce", "Not married", "*"],
>         ["Widowed", "Not married", "*"],
>         ["Married", "Married", "*"],
>         ["Re-married", "Married", "*"],
>     ],
>     "ZIP code": [
>         ["32042", "3204*", "*"],
>         ["32021", "3202*", "*"],
>         ["32024", "3202*", "*"],
>         ["32046", "3204*", "*"],
>         ["32045", "3204*", "*"],
>         ["32027", "3202*", "*"],
>     ],
> }
>
> # Merge the categorical hierarchies with the ranges generated for age
> mix_hierarchy = dict(hierarchy, **utils.create_ranges(data, age_hierarchy))
> k = 2
> supp_threshold = 0
> # Run data_fly with k = 2 and a suppression threshold of 0
> new_data = tools.data_fly(data, ID, QI, k, supp_threshold, mix_hierarchy)
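To illustrate how hierarchies of this shape can drive generalization, the following is a simplified sketch of a Datafly-style greedy loop in plain pandas. It is not the library's implementation of data_fly: the helper names generalize and greedy_generalize are hypothetical, suppression is omitted entirely, and only the two categorical quasi-identifiers are used so that no assumption is needed about how the age ranges are built.

```python
import pandas as pd


def generalize(column, hierarchy_rows, level):
    """Map each raw value to its ancestor at `level` (0 is the raw value).
    Values missing from the hierarchy would become NaN."""
    mapping = {row[0]: row[level] for row in hierarchy_rows}
    return column.map(mapping)


def greedy_generalize(df, qi, hierarchies, k):
    """Raise the level of the quasi-identifier with the most distinct values
    until every equivalence class has at least k rows (no suppression)."""
    raw = df[qi].copy()
    out = df.copy()
    levels = {col: 0 for col in qi}
    top = {col: len(hierarchies[col][0]) - 1 for col in qi}
    while out.groupby(qi).size().min() < k:
        candidates = [col for col in qi if levels[col] < top[col]]
        if not candidates:
            raise ValueError("hierarchies exhausted before reaching k-anonymity")
        target = max(candidates, key=lambda col: out[col].nunique())
        levels[target] += 1
        out[target] = generalize(raw[target], hierarchies[target], levels[target])
    return out, levels


records = pd.DataFrame(
    {
        "marital stat": ["Separated", "Single", "Widowed", "Separated", "Widowed", "Single"],
        "ZIP code": ["32042", "32021", "32024", "32046", "32045", "32027"],
        "crime": ["Murder", "Theft", "Traffic", "Assault", "Piracy", "Indecency"],
    }
)
hierarchies = {
    "marital stat": [
        ["Single", "Not married", "*"],
        ["Separated", "Not married", "*"],
        ["Widowed", "Not married", "*"],
    ],
    "ZIP code": [
        ["32042", "3204*", "*"],
        ["32021", "3202*", "*"],
        ["32024", "3202*", "*"],
        ["32046", "3204*", "*"],
        ["32045", "3204*", "*"],
        ["32027", "3202*", "*"],
    ],
}
qi = ["marital stat", "ZIP code"]

result, levels = greedy_generalize(records, qi, hierarchies, k=2)
smallest_class = int(result.groupby(qi).size().min())
print(result)
print(levels, smallest_class)
```

On this data the loop first generalizes the ZIP codes, then the marital status, and stops once both columns are one level up. A full Datafly implementation would also suppress outlier rows up to a threshold instead of always generalizing further.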
License: Apache 2.0.
Note: the library is under heavy development and is intended for testing purposes only.
Download files
Source Distribution
python_anonymity-0.0.1.tar.gz (24.2 MB)
Built Distribution
Hashes for python_anonymity-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | e62d6d47880d95292304fb4a10385ade935605bff4d195b537660cfa21c81fdd
MD5 | 8d9d0427b51cee88b98cf06c1a8a3610
BLAKE2b-256 | 01ad74fc255d138ad324b2f6488055c0af85b181937091e90eaed9b1347c8591