jsontableschema-pandas 0.2.0

Generate Pandas data frames, and load and extract data, based on JSON Table Schema descriptors.

Version v0.2 contains breaking changes:
  • removed Storage(prefix=) argument (was a stub)
  • renamed Storage(tables=) to Storage(dataframes=)
  • renamed Storage.tables to Storage.buckets
  • changed Storage.read to read into memory
  • added Storage.iter to yield row by row
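The last two items describe two ways of getting rows back out: one call that returns everything at once, and Storage.iter, which yields row by row. The difference can be sketched with a toy class. ToyStorage below is a hypothetical stand-in written only for illustration, assuming the in-memory change refers to a read-style method. It is not the real Storage implementation.

```python
class ToyStorage:
    """Hypothetical stand-in for illustration; NOT jsontableschema_pandas.Storage."""

    def __init__(self):
        # Maps a bucket name to a list of rows.
        self._rows = {}

    def write(self, bucket, rows):
        # Append rows to the bucket, creating the bucket if needed.
        self._rows.setdefault(bucket, []).extend(list(row) for row in rows)

    def iter(self, bucket):
        # Generator: rows are produced one at a time, on demand.
        for row in self._rows[bucket]:
            yield row

    def read(self, bucket):
        # Materializes every row into a list held in memory.
        return list(self.iter(bucket))


toy = ToyStorage()
toy.write('data', [(1, 'a'), (2, 'b')])

rows_in_memory = toy.read('data')  # a full list, available immediately
row_iterator = toy.iter('data')    # nothing is consumed yet
first_row = next(row_iterator)     # pulls a single row
```

An in-memory read is convenient for small tables, while an iterator avoids holding every row at once when the caller only needs to stream through them.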

Getting Started


$ pip install datapackage
$ pip install jsontableschema-pandas


You can load resources from a data package as Pandas data frames using the datapackage.push_datapackage function:

>>> import datapackage

>>> data_url = ''
>>> storage = datapackage.push_datapackage(data_url, 'pandas')

>>> storage.buckets

>>> type(storage['data___data'])
<class 'pandas.core.frame.DataFrame'>

>>> storage['data___data'].head()
             Name Code
0     Afghanistan   AF
1   Åland Islands   AX
2         Albania   AL
3         Algeria   DZ
4  American Samoa   AS

It is also possible to pull an existing data frame into a data package:

>>> datapackage.pull_datapackage('/tmp/datapackage.json', 'country_list', 'pandas', dataframes={
...     'data': storage['data___data'],
... })


The package implements the Tabular Storage interface.

We can get storage this way:

>>> from jsontableschema_pandas import Storage

>>> storage = Storage()

Storage works as a container for Pandas data frames. You can define a new data frame inside the storage using the storage.create method:

>>> storage.create('data', {
...     'primaryKey': 'id',
...     'fields': [
...         {'name': 'id', 'type': 'integer'},
...         {'name': 'comment', 'type': 'string'},
...     ]
... })

>>> storage.buckets

>>> storage['data'].shape
(0, 0)

Use storage.write to populate the data frame with data:

>>> storage.write('data', [(1, 'a'), (2, 'b')])

>>> storage['data']
id comment
1        a
2        b
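The output above shows id in the index position rather than as an ordinary column. For comparison, here is a minimal sketch in plain pandas that builds an equivalent frame from the same descriptor and rows, assuming the primaryKey field becomes the index. It does not use Storage and is not a description of its internals.

```python
import pandas as pd

# Same descriptor and rows as in the example above.
descriptor = {
    'primaryKey': 'id',
    'fields': [
        {'name': 'id', 'type': 'integer'},
        {'name': 'comment', 'type': 'string'},
    ],
}
rows = [(1, 'a'), (2, 'b')]

# Column names come from the descriptor's field list, in order.
names = [field['name'] for field in descriptor['fields']]

# Assumption: the primary key is used as the frame's index.
frame = pd.DataFrame(rows, columns=names).set_index(descriptor['primaryKey'])
```

With this layout, a row can be looked up by key, for example frame.loc[2, 'comment'].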

You can also use tabulator to populate the data frame from an external data file:

>>> import tabulator

>>> with tabulator.Stream('data/comments.csv', headers=1) as stream:
...     storage.write('data', stream)

>>> storage['data']
id comment
1        a
2        b
1     good

As you can see, subsequent writes simply append new rows to the existing data.
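Because writes append, the same id can appear more than once, as id 1 does in the output above. The sketch below reproduces that situation in plain pandas and shows one way to spot repeated keys. It again assumes the primary key is the index, and it does not go through Storage.

```python
import pandas as pd

# Frames standing in for the result of the first and second write.
first = pd.DataFrame({'id': [1, 2], 'comment': ['a', 'b']}).set_index('id')
second = pd.DataFrame({'id': [1], 'comment': ['good']}).set_index('id')

# Appending keeps every row, including those whose key already exists.
combined = pd.concat([first, second])

# keep=False marks every occurrence of a repeated key, not only the later ones.
mask = combined.index.duplicated(keep=False)
duplicated_ids = combined.index[mask].unique().tolist()
```

If unique keys matter for your data, a check like this after each write is a cheap safeguard.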


Please read the contribution guideline:

How to Contribute


File                                               Type          Py Version  Uploaded on  Size
jsontableschema-pandas-0.2.0.tar.gz                Source                    2016-10-26   8KB
jsontableschema_pandas-0.2.0-py2.py3-none-any.whl  Python Wheel  py2.py3     2016-10-26   9KB