
Python utility to reconcile Pandas DataFrames


reconciler


reconciler is a Python package for reconciling tabular data against reconciliation services such as Wikidata. It works much like reconciliation in OpenRefine, but entirely within Python, using pandas.

Quickstart

You can install the latest version of reconciler from PyPI with:

pip install reconciler

Then to use it:

from reconciler import reconcile
import pandas as pd

# A DataFrame with a column you want to reconcile.
test_df = pd.DataFrame(
    {
        "City": ["Rio de Janeiro", "São Paulo", "São Paulo", "Natal"],
    }
)

# Reconcile against type city (Q515), getting the best match for each item.
reconciled = reconcile(test_df["City"], type_id="Q515")

The resulting DataFrame would look like this:

id       match  name            score  type                    type_id   input_value
Q8678    True   Rio de Janeiro  100    city                    Q515      Rio de Janeiro
Q174     True   São Paulo       100    city                    Q515      São Paulo
Q131620  True   Natal           100    municipality of Brazil  Q3184121  Natal
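
The result has one row per distinct input value, while the original column contains "São Paulo" twice. One way to attach the matched IDs back to every original row is a left merge on `input_value`. The sketch below is only an illustration: it does not call `reconcile` (which needs network access) and instead rebuilds the result by hand from the table above, so the `reconciled` frame here is an assumed stand-in for the real output.

```python
import pandas as pd

# The original data, including the duplicated "São Paulo" row.
test_df = pd.DataFrame(
    {
        "City": ["Rio de Janeiro", "São Paulo", "São Paulo", "Natal"],
    }
)

# Hand-built stand-in for the output of reconcile(), copied from the
# table above. In real use this frame would come from the package.
reconciled = pd.DataFrame(
    {
        "id": ["Q8678", "Q174", "Q131620"],
        "match": [True, True, True],
        "name": ["Rio de Janeiro", "São Paulo", "Natal"],
        "score": [100, 100, 100],
        "type": ["city", "city", "municipality of Brazil"],
        "type_id": ["Q515", "Q515", "Q3184121"],
        "input_value": ["Rio de Janeiro", "São Paulo", "Natal"],
    }
)

# Left merge keeps every original row, in order, and repeats the
# matched ID wherever the same input value occurs more than once.
merged = test_df.merge(
    reconciled[["input_value", "id", "match"]],
    left_on="City",
    right_on="input_value",
    how="left",
).drop(columns="input_value")

print(merged)
```

With `how="left"`, any city that had no row in the result would keep its place in `merged` with a missing `id`, which makes unmatched values easy to spot.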

Check out the documentation for more advanced usage and to learn how to contribute.

