
Diff-it: Spark Dataframe Differ


Diff-it: Data Differ

Overview

Diffit reports the differences between two data sets that share a similar schema.
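To make the idea of a data set diff concrete, here is a minimal sketch in plain Python. It is not Diffit's API and does not use Spark: the function name `diff_rows`, the `key` parameter and the sample rows are all hypothetical, and the sketch assumes each row is a dictionary with a unique, sortable key column.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def diff_rows(left: List[Row], right: List[Row], key: str) -> List[Tuple[str, Row]]:
    """Return the rows that differ between two data sets, tagged by origin.

    Hypothetical helper for illustration only; not part of Diffit.
    Assumes `key` is present and unique in every row.
    """
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}

    differences: List[Tuple[str, Row]] = []
    # Walk the union of keys so rows missing from either side are reported.
    for k in sorted(left_by_key.keys() | right_by_key.keys()):
        left_row = left_by_key.get(k)
        right_row = right_by_key.get(k)
        if left_row == right_row:
            continue  # Identical on both sides: nothing to report.
        if left_row is not None:
            differences.append(("left", left_row))
        if right_row is not None:
            differences.append(("right", right_row))
    return differences


left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
right = [{"id": 1, "name": "a"}, {"id": 2, "name": "B"}, {"id": 4, "name": "d"}]

for origin, row in diff_rows(left, right, key="id"):
    print(origin, row)
```

Rows that match on both sides are dropped. A changed row appears once per side, and a row present on only one side appears once.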

Refer to Diffit's documentation for detailed instructions.

Prerequisites

Getting Started

Makester is used as the Integrated Developer Platform.

(macOS Users only) Upgrading GNU Make

Follow these notes to get GNU make.

Creating the Local Environment

Get the code and change into the top level git project directory:

git clone git@github.com:loum/diffit.git && cd diffit

NOTE: Run all commands from the top-level directory of the git repository.

For first-time setup, get the Makester project:

git submodule update --init

Initialise the environment:

make init-dev

Local Environment Maintenance

Keep the Makester project up to date with:

git submodule update --remote --merge

Help

There should be a make target to get most things done. Check the help for more information:

make help

Running the Test Harness

We use pytest. To run the tests:

make tests

FAQs

Q. Why do I get WARNING: An illegal reflective access operation has occurred?

This appears to be related to the JVM version in use. Running under Java 8 suppresses the warning. To list the Java versions available on your Mac, run /usr/libexec/java_home -V. Then point JAVA_HOME at the version you want:

export JAVA_HOME=$(/usr/libexec/java_home -v <java_version>)
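If you want to confirm which major Java version a version string refers to, the following sketch may help. It is a hypothetical helper, not part of Diffit or Makester. It relies on the Java versioning convention that legacy strings such as 1.8.0_292 denote Java 8, while from Java 9 onwards the leading number is the major version.

```python
import re


def java_major_version(version: str) -> int:
    """Extract the major Java version from a version string.

    Hypothetical helper for illustration only.
    """
    match = re.match(r"\s*(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognised Java version string: {version!r}")

    first = int(match.group(1))
    # Legacy scheme: "1.8.0_292" means Java 8.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    # Modern scheme (Java 9+): the leading number is the major version.
    return first


for candidate in ("1.8.0_292", "11.0.2", "17"):
    print(candidate, "->", java_major_version(candidate))
```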


