ferenda 0.2.0

Transform unstructured document collections to structured Linked Data

Latest Version: 0.3.0

Ferenda is a Python library and framework for transforming unstructured document collections into structured Linked Data. It helps with downloading documents, parsing them to add explicit semantic structure and RDF-based metadata, finding relationships between documents, and publishing the results, including through a REST-based HTTP API.

Quick start

This example uses ferenda’s project framework to download the 50 latest RFCs and W3C standards, parse them into structured, RDF-enabled XHTML documents, load all RDF metadata into a triplestore, and generate a web site of static HTML5 files that can be used offline:

pip install ferenda
ferenda-setup myproject
cd myproject
./ferenda-build.py ferenda.sources.tech.RFC enable
./ferenda-build.py ferenda.sources.tech.W3Standards enable
./ferenda-build.py all all --downloadmax=50 --staticsite --fulltextindex=False
open data/index.html

The same functionality is also available through a Python API if you want to use ferenda as part of a larger system. You can also use only the parts of ferenda that you need (e.g. only the downloading and parsing features).
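
If you want to script the quick start from a larger Python program, one option that does not depend on the details of ferenda’s Python API (which this page does not show) is to run the same commands through the standard library’s subprocess module. The following is a minimal sketch under that assumption. The helper names quickstart_commands and run_quickstart are hypothetical, and the script assumes that ferenda is installed and that ferenda-setup myproject has already been run.

```python
import subprocess
import sys

# Repository classes enabled in the quick start above.
REPOS = ["ferenda.sources.tech.RFC", "ferenda.sources.tech.W3Standards"]


def quickstart_commands(downloadmax=50):
    """Return the quick-start ferenda-build.py invocations as argv lists.

    Hypothetical helper: it only assembles the commands shown above and
    does not call any ferenda Python API.
    """
    # Run the build script with the current interpreter instead of ./ferenda-build.py.
    build = [sys.executable, "ferenda-build.py"]
    commands = [build + [repo, "enable"] for repo in REPOS]
    commands.append(build + [
        "all", "all",
        "--downloadmax=%d" % downloadmax,
        "--staticsite",
        "--fulltextindex=False",
    ])
    return commands


def run_quickstart(project_dir="myproject", downloadmax=50):
    """Run each command inside project_dir, stopping at the first failure."""
    for argv in quickstart_commands(downloadmax):
        # check_call raises CalledProcessError on a non-zero exit status.
        subprocess.check_call(argv, cwd=project_dir)
```

Calling run_quickstart() from the directory that contains myproject would then perform the enable and build steps in order. Keeping the command list in its own function makes it possible to inspect or log the commands without running them.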

More information

See http://ferenda.readthedocs.org/ for in-depth documentation.


File                                Type          Py Version  Uploaded on  Size
ferenda-0.2.0-py2.py3-none-any.whl  Python Wheel  py2.py3     2014-07-23   741KB
ferenda-0.2.0.tar.gz                Source                    2014-07-23   740KB