
Probabilistic type inference



Install requirements

pip install -r requirements.txt

Usage

Initialization

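# Import path assumed from the package layout (class Ptype in module ptype.Ptype).
from ptype.Ptype import Ptype

ptype = Ptype()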

By default, ptype considers the following data types: integer, string, float, boolean, gender and date. This set can be extended using the parameter named _types.

ptype = Ptype(_types={1:'integer', 2:'string', 3:'float', 4:'boolean', 5:'gender', 6:'date-iso-8601', 7:'date-eu', 8:'date-non-std-subtype', 9:'date-non-std', 10:'IPAddress', 11:'EmailAddress'})
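
The dictionary passed to _types above is keyed by consecutive integers starting at 1. As a minimal sketch, such a dictionary can be built from a plain list of type names so that the keys are not numbered by hand. Note that build_types is a hypothetical helper written for this example and is not part of ptype; the type names are the ones listed above.

```python
def build_types(names):
    """Map 1, 2, 3, ... to the given type names, in order."""
    return {index: name for index, name in enumerate(names, start=1)}


# Type names taken from the example above.
EXTENDED_TYPES = build_types([
    "integer", "string", "float", "boolean", "gender",
    "date-iso-8601", "date-eu", "date-non-std-subtype", "date-non-std",
    "IPAddress", "EmailAddress",
])

# The result could then be passed as Ptype(_types=EXTENDED_TYPES).
```

Keeping the names in a list makes it easier to add or remove a type without renumbering the remaining keys.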

Running

# df is the data frame to analyse (assumed to be a pandas DataFrame).
ptype.run_inference(_data_frame=df)

Summarizing the Results

We can generate a human-readable description of the predictions: the posterior distribution over the column types, the most likely column types, the missing or anomalous entries, and the fractions of normal, missing and anomalous entries in each column.

ptype.show_results()
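
To make the last item of that description concrete: the fractions of normal, missing and anomalous entries are plain proportions over per-entry labels. The sketch below is illustrative only. The function entry_fractions and the label strings are assumptions made for this example, not ptype's API or its internal representation.

```python
from collections import Counter

LABELS = ("normal", "missing", "anomalous")


def entry_fractions(labels):
    """Return the fraction of entries carrying each label."""
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        # An empty column has no entries of any kind.
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


# Hypothetical per-entry labels for one column with four rows.
column_labels = ["normal", "normal", "missing", "anomalous"]
fractions = entry_fractions(column_labels)
```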

By default, show_results() prints the descriptions for all of the columns. Alternatively, you can find the columns that contain missing data or anomalies and show the results for those columns only.

column_names = ptype.get_columns_with_missing()
ptype.show_results(column_names)
column_names = ptype.get_columns_with_anomalies()
ptype.show_results(column_names)

Another way of presenting the column type predictions is to embed them in the header:

ptype.show_results_df()
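
As a rough illustration of the idea, a header can carry the predicted type next to each column name. The helper annotate_header and the "name (type)" layout are assumptions made for this sketch; the actual layout produced by show_results_df() may differ.

```python
def annotate_header(column_names, predicted_types):
    """Append the predicted type to each column name."""
    return [
        f"{name} ({predicted_types.get(name, 'unknown')})"
        for name in column_names
    ]


# Hypothetical predictions for two columns.
header = annotate_header(
    ["age", "joined"],
    {"age": "integer", "joined": "date"},
)
```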

Interactions

In addition to printing these outputs, we can change the predictions when needed.

a. Updating the Column Type Predictions

ptype.change_column_types()

b. Updating the Missing Data Annotations

ptype.change_missing_data_annotations()

c. Updating the Anomaly Annotations

ptype.change_anomaly_annotations()

d. Merging Different Encodings of Missing Data

ptype.merge_missing_data()
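
Merging different encodings of missing data amounts to mapping several markers (for example NULL, - and n/a) to one canonical value. The sketch below shows that idea on a plain list. The function merge_missing, its parameters and the canonical marker "NA" are hypothetical and do not describe how merge_missing_data() is implemented.

```python
def merge_missing(values, encodings, canonical="NA"):
    """Replace every value found in `encodings` with `canonical`."""
    encodings = set(encodings)
    return [canonical if value in encodings else value for value in values]


# Hypothetical column in which missing data is encoded three different ways.
column = ["12", "NULL", "7", "-", "n/a"]
merged = merge_missing(column, encodings={"NULL", "-", "n/a"})
```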
