provit

A light, decentralized provenance tracking framework using the W3C PROV-O vocabulary

Project description

provit is a data provenance annotation and documentation tool. It provides various features for creating and retrieving provenance information for data stored in files. Tracking sources, modifications and merges lets the user keep a log of every modification a dataset has been subject to. This is especially useful for datasets that are accessed intermittently or are part of a long-running workflow (e.g. for a scientific thesis). Furthermore, provenance data stored next to the data in an archive can help others to assess the quality, value and timeliness of the data.

provit does not require any external infrastructure. All information is stored as a JSON-LD graph in .prov files right next to the data files. This makes it well suited to small teams or individual researchers.
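Because the provenance lives in a plain JSON-LD file next to the data, it can be inspected with nothing but the standard library. The following is a minimal sketch, not part of provit itself. It assumes that the sidecar file is named by appending ".prov" to the full data file name (e.g. data.csv.prov); the helper names prov_sidecar_path and load_prov_graph are hypothetical.

```python
import json
from pathlib import Path


def prov_sidecar_path(data_file):
    """Return the assumed sidecar path: the data file name plus '.prov'."""
    data_file = Path(data_file)
    # Assumption: provit appends '.prov' to the complete file name.
    return data_file.with_name(data_file.name + ".prov")


def load_prov_graph(data_file):
    """Load the JSON-LD document stored next to data_file, or None if absent."""
    sidecar = prov_sidecar_path(data_file)
    if not sidecar.exists():
        return None
    with sidecar.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Calling load_prov_graph("data.csv") returns the parsed JSON document if a sidecar file exists and None otherwise. If your provit version uses a different naming scheme, adjust prov_sidecar_path accordingly.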

To allow interoperability, a small subset of the W3C PROV-O vocabulary is implemented. The provenance information can therefore easily be merged into a linked data graph at a later stage of the project, if necessary.
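To illustrate what merging into a larger graph can mean, here is a rough sketch that combines two hand-written JSON-LD documents by node identifier. The example documents are made up for illustration and are not actual provit output; only the PROV-O terms (prov:Entity, prov:wasDerivedFrom, prov:wasAttributedTo) and the namespace come from the W3C vocabulary. The function merge_graphs is hypothetical and assumes that both documents share the same context and keep their nodes in a flat "@graph" list. For real data, an RDF library would be the more robust choice.

```python
PROV = "http://www.w3.org/ns/prov#"


def merge_graphs(*documents):
    """Naively merge the '@graph' node lists of JSON-LD documents by '@id'."""
    merged = {}
    for document in documents:
        for node in document.get("@graph", []):
            target = merged.setdefault(node["@id"], {"@id": node["@id"]})
            for key, value in node.items():
                if key == "@id":
                    continue
                # Normalise single values to lists so properties can accumulate.
                values = value if isinstance(value, list) else [value]
                existing = target.setdefault(key, [])
                for item in values:
                    if item not in existing:
                        existing.append(item)
    return {"@context": {"prov": PROV}, "@graph": list(merged.values())}


# Two hand-written example documents (illustrative, not provit output).
raw = {
    "@context": {"prov": PROV},
    "@graph": [{"@id": "http://example.org/raw.csv", "@type": "prov:Entity"}],
}
clean = {
    "@context": {"prov": PROV},
    "@graph": [
        {
            "@id": "http://example.org/clean.csv",
            "@type": "prov:Entity",
            "prov:wasDerivedFrom": {"@id": "http://example.org/raw.csv"},
        },
        {
            "@id": "http://example.org/raw.csv",
            "prov:wasAttributedTo": {"@id": "http://example.org/agents/alice"},
        },
    ],
}

combined = merge_graphs(raw, clean)
```

After the merge, the node for raw.csv carries both its type and its attribution, because statements about the same "@id" are collected into one node.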

provit aims to provide an easy-to-use interface for users who have never worked with provenance tracking before. You can operate the tool using the command line interface, the graphical user interface or the Python package.

If you feel limited by provit, have a look at more extensive implementations, e.g. prov.

Full documentation is available at provit.readthedocs.io.

Quick Installation

provit is available via the Python Package Index (PyPI) and can be installed using pip. Simply create a virtual environment with your preferred method and run the pip install command:

$ mkvirtualenv provit
$ pip install provit

Quickstart

provit provides three modes of interaction:

command line interface
graphical user interface
python package

All of them allow you to track provenance, but the provit browser additionally lets you explore tracked provenance.

provit browser

You can start provit browser directly from your terminal:

$ provit browser

provit cli

Simply cd to the directory where your data is located and create a provenance file (or append to an existing one).

$ provit add FILEPATH [OPTIONS]

The --help option shows you the full list of available options and arguments.

$ provit --help

provit package

Using provit in your ETL pipeline is easy. Simply import the Provenance class and start using it (e.g. as displayed below).

from provit import Provenance

# load prov data for a file, or create new prov for file
prov = Provenance("data.csv")  # "data.csv" is a placeholder for the path to your data file

# add provenance metadata
prov.add(agents=[ "agent" ], activity="activity", description="...")
prov.add_primary_source("primary_source")
prov.add_sources([ "filepath1", "filepath2" ])

# return provenance as json tree
prov_dict = prov.tree()

# save provenance metadata into "<filename>.prov" file
prov.save()
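In a pipeline with several steps, the calls above tend to repeat for every output file. One way to keep this manageable is a small wrapper, sketched below. The function record_step and its parameters are hypothetical and not part of provit; it only uses the methods shown above (add, add_sources, save). The Provenance class is passed in as an argument so that the helper does not depend on how provit is imported.

```python
def record_step(provenance_cls, output_file, source_files, agent, activity, description):
    """Record one pipeline step for output_file and save its provenance.

    provenance_cls is expected to behave like provit's Provenance class,
    i.e. to offer add(), add_sources() and save() as shown above.
    """
    prov = provenance_cls(output_file)
    prov.add(agents=[agent], activity=activity, description=description)
    prov.add_sources(list(source_files))
    prov.save()
    return prov
```

With provit installed, a step could then be recorded with record_step(Provenance, "clean.csv", ["raw.csv"], "alice", "cleaning", "dropped empty rows") after from provit import Provenance. The file names, agent and activity here are placeholders.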

Roadmap

We have a small roadmap, which we will make transparent below:

Increase test coverage (currently 81%)
Windows support (all devs are on Linux)
Agent management in PROVIT Browser

Overview

Authors: P. Mühleder <muehleder@ub.uni-leipzig.de>, F. Rämisch <raemisch@ub.uni-leipzig.de>
License: MIT
Copyright: 2018-2019, Peter Mühleder and Universitätsbibliothek Leipzig

Release history

Version	Release date
1.1.1 (this version)	Dec 3, 2019
1.1.0	Nov 27, 2019
1.0.2	Jun 17, 2019
1.0.1	Apr 25, 2019
0.2.2	Apr 25, 2018
0.2.1	Apr 25, 2018
0.2	Apr 25, 2018

Download files

Source Distribution: provit-1.1.1.tar.gz (398.5 kB), uploaded Dec 3, 2019
Built Distribution: provit-1.1.1-py3-none-any.whl (410.1 kB), uploaded Dec 3, 2019, Python 3

Hashes for provit-1.1.1.tar.gz
Algorithm	Hash digest
SHA256	`c133f2502bebb856b92dbafffe1c7cfa4af79bbad3f380049ee3513e9a3c5bb3`
MD5	`9a9cffe011f6d164e21c6f2ef2c6dd81`
BLAKE2b-256	`f69a6f93a98067243fdc8c57e83e94c27b10bb2658c54c35b2392251fed6a392`

Hashes for provit-1.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7a0cbdf8bfcf48c760d024eb621aae41ed1e21fc52ecd27b3192efd8c23e9714`
MD5	`d73e8b7af6874321afcce10215dd4711`
BLAKE2b-256	`95bf89a01e1a915d61ba9345806ef70ce2581dfc32a284f5ceeaa0a0373e17b7`
