Project description

A datapackage-pipelines processor to validate tabular resources using goodtables.

Install

# clone the repo and install it with pip

git clone https://github.com/frictionlessdata/datapackage-pipelines-goodtables.git
cd datapackage-pipelines-goodtables
pip install -e .

Usage

Add the following to the pipeline-spec.yml configuration to validate each resource in the datapackage. A report is written to the logger.

...
- run: goodtables.validate
  parameters:
      fail_on_error: True
      reports_path: 'path/to/datapackage/reports'  # where reports will be written
      datapackage_reports_path: 'reports'  # relative to datapackage.json
      write_report: True
      goodtables:
          <key>: <value>  # options passed to goodtables.validate()

  • fail_on_error: An optional boolean to determine whether the pipeline should fail on validation error (default is True).

  • reports_path: An optional string to define where Goodtables reports should be written (default is reports).

  • datapackage_reports_path: An optional string to define the path to the report, relative to the datapackage.json (see note below).

  • write_report: An optional boolean to determine whether a goodtables validation report should be written to reports_path (default is True).

  • goodtables: An optional object passed to goodtables.validate() to customise its behaviour. See the goodtables.validate() documentation (https://github.com/frictionlessdata/goodtables-py/#validatesource-options) for available options.
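
The defaults listed above can be summarised in a short sketch. This is not the processor's own code: the names DEFAULTS and resolve_parameters are hypothetical, and representing an omitted datapackage_reports_path as None and an omitted goodtables object as an empty dict is an assumption. row_limit appears only as an example of an option forwarded to goodtables.validate(); check the goodtables documentation for the options your version supports.

```python
from copy import deepcopy

# Documented defaults. None and {} for the last two keys are assumptions,
# chosen here to mean "not set" and "no extra goodtables options".
DEFAULTS = {
    "fail_on_error": True,
    "reports_path": "reports",
    "write_report": True,
    "datapackage_reports_path": None,
    "goodtables": {},
}


def resolve_parameters(parameters=None):
    """Return the step parameters with documented defaults filled in (hypothetical helper)."""
    resolved = deepcopy(DEFAULTS)      # copy so DEFAULTS is never mutated
    resolved.update(parameters or {})  # user-supplied values win
    return resolved


# A step that only overrides fail_on_error and forwards one goodtables option.
params = resolve_parameters({"fail_on_error": False, "goodtables": {"row_limit": 100}})
print(params)
```

With these assumptions, a step that omits reports_path and write_report still writes reports to the reports directory.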

If reports are written, and datapackage_reports_path is defined, a reports property will be added to the datapackage, detailing the path to the report for each resource:

...
"reports": [
    {
        "resource": "my-resource",
        "reportType": "goodtables",
        "path": "path/to/my-resource.json"
    }
]
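
A downstream consumer might use the reports property roughly as follows. This is a minimal sketch under stated assumptions: the function name report_paths and the directory name out are hypothetical, and each path is interpreted relative to the directory that holds datapackage.json.

```python
import json
import posixpath


def report_paths(descriptor, datapackage_dir):
    """Map each resource name to its goodtables report path (hypothetical helper).

    Paths in the "reports" property are treated as relative to the
    directory containing datapackage.json.
    """
    return {
        entry["resource"]: posixpath.join(datapackage_dir, entry["path"])
        for entry in descriptor.get("reports", [])
        if entry.get("reportType") == "goodtables"
    }


# Descriptor fragment matching the example above.
descriptor = json.loads("""
{
    "reports": [
        {
            "resource": "my-resource",
            "reportType": "goodtables",
            "path": "path/to/my-resource.json"
        }
    ]
}
""")

paths = report_paths(descriptor, "out")
print(paths)
```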

It is recommended to set datapackage_reports_path to the path, relative to the datapackage.json file, where the reports were written. datapackage_reports_path does not define where the reports will be written (reports_path does that); it only ensures that a correct path is recorded in the reports property in datapackage.json. This is useful when the pipeline concludes with a dump_to.path processor.
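
The difference between the two path parameters can be illustrated with a small sketch. It is not the processor's implementation: the function name build_report_entries is hypothetical, and naming each report file after its resource (my-resource.json) is an assumption based on the example above.

```python
import posixpath


def build_report_entries(resource_names, reports_path, datapackage_reports_path):
    """Return (files written, "reports" entries) for the given resources.

    reports_path decides where files land on disk; datapackage_reports_path
    only decides what is recorded in datapackage.json.
    """
    written = []
    entries = []
    for name in resource_names:
        filename = name + ".json"  # assumed naming scheme
        written.append(posixpath.join(reports_path, filename))
        entries.append({
            "resource": name,
            "reportType": "goodtables",
            "path": posixpath.join(datapackage_reports_path, filename),
        })
    return written, entries


written, entries = build_report_entries(
    ["my-resource"],
    reports_path="path/to/datapackage/reports",
    datapackage_reports_path="reports",
)
print(written)
print(entries)
```

Here the file is written under path/to/datapackage/reports, while the entry recorded in datapackage.json uses the relative path reports/my-resource.json, which stays valid if datapackage.json is dumped to path/to/datapackage.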
