pandaSDMX

A client for SDMX - Statistical Data and Metadata eXchange

Project description

pandaSDMX is an Apache 2.0-licensed Python package that aims to be the most intuitive and versatile tool for retrieving statistical data and metadata disseminated in SDMX format. Out of the box, it supports the SDMX services of the European statistics office (Eurostat), the European Central Bank (ECB), and the French National Institute of Statistics and Economic Studies (INSEE). pandaSDMX can export data and metadata as pandas DataFrames, the gold standard of data analysis in Python. From pandas you can export data and metadata to Excel, R and friends. As of version 0.4, pandaSDMX can also export data to many other file formats and database backends via Odo.

Main features

intuitive API inspired by requests
support for many SDMX features including
- generic datasets
- data structure definitions, code lists and concept schemes
- dataflow definitions and content-constraints
- categorisations and category schemes
pythonic representation of the SDMX information model
validation of column selections against code lists and content-constraints, if available, when requesting datasets
export data and metadata as multi-indexed pandas DataFrames or Series, and many other formats and database backends via Odo
read and write SDMX messages to and from local files
configurable HTTP connections
support for requests-cache, which allows caching SDMX messages in memory, MongoDB, Redis or SQLite
extensible through custom readers and writers for alternative input and output formats of data and metadata
growing test suite

For further details, including extensive code examples, see the documentation.

Recent changes

v0.4 (2016-04-11)

New features

add new provider INSEE, the French statistics office (thanks to Stéphan Rault)
register ‘.sdmx’ files with Odo if available
logging of HTTP requests and file operations
new structure2pd writer to export codelists, dataflow-definitions and other structural metadata from structure messages as multi-indexed pandas DataFrames. Desired attributes can be specified and are represented by columns.

API changes

pandasdmx.api.Request constructor accepts a log_level keyword argument which can be set to a log-level for the pandasdmx logger and its children (currently only pandasdmx.api)
pandasdmx.api.Request now has a timeout property to set the timeout for http requests
extend api.Request._agencies configuration to specify agency- and resource-specific settings such as headers. Future versions may exploit this to provide reader selection information.
api.Request.get: specify http_headers per request. Defaults are set according to agency configuration
Response instances expose Message attributes to make application code more succinct
rename pandasdmx.api.Message attributes to singular form. Old names are deprecated and will be removed in the future.
pandasdmx.api.Request exposes resource names such as data, datastructure, dataflow etc. as descriptors calling ‘get’ without specifying the resource type as string. In interactive environments, this saves typing and enables code completion.
data2pd writer: return attributes as namedtuples rather than dict
use patched version of namedtuple that accepts non-identifier strings as field names and makes all fields accessible through dict syntax.
remove GenericDataSet and GenericDataMessage. Use DataSet and DataMessage instead
sdmxml reader: return strings or unicode strings instead of LXML smart strings
sdmxml reader: remove most of the specialized read methods. Adapt model to use generalized methods. This makes code more maintainable.
pandasdmx.model.Representation for DSD attributes and dimensions now supports text not just codelists.
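
The descriptor-based shortcuts mentioned in the API changes above (resource names such as data or dataflow calling ‘get’ with the resource type filled in) can be pictured with a small, self-contained sketch. This is not pandaSDMX's implementation: the names ResourceDescriptor and SketchRequest, and the dict returned by get, are hypothetical stand-ins chosen only to show the pattern.

```python
class ResourceDescriptor:
    """Descriptor that pre-fills the resource type when calling ``get``."""

    def __init__(self, resource_type):
        self.resource_type = resource_type

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not on an instance

        def bound_get(*args, **kwargs):
            # Forward to get() with the resource type as first argument.
            return instance.get(self.resource_type, *args, **kwargs)

        return bound_get


class SketchRequest:
    """Hypothetical stand-in for a request class (illustration only)."""

    data = ResourceDescriptor("data")
    dataflow = ResourceDescriptor("dataflow")
    datastructure = ResourceDescriptor("datastructure")

    def get(self, resource_type, resource_id=None, **kwargs):
        # A real client would build a URL and send an HTTP request here;
        # the sketch just echoes what it was asked for.
        request = {"resource_type": resource_type, "resource_id": resource_id}
        request.update(kwargs)
        return request


req = SketchRequest()
result = req.data("EXR", key="D.USD.EUR.SP00.A")
flows = req.dataflow()
```

Because data, dataflow and datastructure are ordinary class attributes, they show up in tab completion in interactive environments, which is the benefit the changelog entry describes.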

Other changes and enhancements

documentation has been overhauled. Code examples are now much simpler thanks to the new structure2pd writer
testing: switch from nose to py.test
improve packaging. Include tests in sdist only
numerous bug fixes

v0.3.1 (2015-10-04)

This release fixes a few bugs which caused crashes in some situations.

v0.3.0 (2015-09-22)

support for requests-cache, which allows caching SDMX messages in memory, MongoDB, Redis or SQLite
pythonic selection of series when requesting a dataset: Request.get allows the key keyword argument in a data request to be a dict mapping dimension names to values. In this case, the dataflow definition and datastructure definition, and content-constraint are downloaded on the fly, cached in memory and used to validate the keys. The dotted key string needed to construct the URL will be generated automatically.
The Response.write method takes a parse_time keyword argument. Set it to False to avoid parsing dates, times and time periods, as exotic formats may cause crashes.
The Request.get method takes a memcache keyword argument. If set to a string, the received Response instance will be stored in the dict Request.cache for later use. This is useful when, e.g., a DSD is needed multiple times to validate keys.
fixed base URL for Eurostat
major refactorings to enhance code maintainability
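
To illustrate the dict-based key selection described in the v0.3.0 notes above, here is a minimal sketch of turning a dict of dimension values into a dotted SDMX key string. The function build_key, the dimension order and the code lists are all made up for the example. The real library derives this information from the downloaded datastructure definition and content-constraints, whereas this sketch validates against plain Python sets only.

```python
def build_key(dimension_order, codelists, selection):
    """Build a dotted SDMX key such as 'D.USD+JPY.' from a dict.

    dimension_order: dimension names in the order the data structure defines
    codelists:       dict mapping each dimension name to its allowed codes
    selection:       dict mapping dimension names to a code or list of codes
    """
    unknown = set(selection) - set(dimension_order)
    if unknown:
        raise ValueError("unknown dimensions: %s" % sorted(unknown))

    parts = []
    for dim in dimension_order:
        values = selection.get(dim, [])
        if isinstance(values, str):
            values = [values]
        invalid = [v for v in values if v not in codelists[dim]]
        if invalid:
            raise ValueError("invalid codes for %s: %s" % (dim, invalid))
        # Several codes for one dimension are joined with '+';
        # an empty position acts as a wildcard for that dimension.
        parts.append("+".join(values))
    return ".".join(parts)


# Illustrative, invented structure (not fetched from any provider).
DIMENSIONS = ["FREQ", "CURRENCY", "CURRENCY_DENOM"]
CODELISTS = {
    "FREQ": {"A", "D", "M"},
    "CURRENCY": {"USD", "JPY", "GBP"},
    "CURRENCY_DENOM": {"EUR"},
}

key = build_key(DIMENSIONS, CODELISTS, {"FREQ": "D", "CURRENCY": ["USD", "JPY"]})
```

Validating before building the URL means a misspelled code fails locally with a clear error instead of producing an empty or rejected server response.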

v0.2.2 (2015-05-19)

Make HTTP connections configurable by exposing the requests.get API through the pandasdmx.api.Request constructor. Hence, proxy servers, authentication information and other HTTP-related parameters consumed by requests.get can be set for a Request instance and used in subsequent requests. The configuration is exposed as a dict through the Request.client.config attribute.
Responses now have an http_headers attribute containing the headers returned by the SDMX server
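
The configuration pattern described in the v0.2.2 notes above can be sketched as follows: keyword arguments given once at construction time are stored in a dict and passed along with every later request. SketchClient and fake_fetch are hypothetical names used only for this illustration. In real use the fetch callable would be something like requests.get, which accepts keyword arguments such as proxies, timeout, auth and headers.

```python
class SketchClient:
    """Hypothetical client that remembers HTTP settings between requests."""

    def __init__(self, fetch, **http_cfg):
        self.fetch = fetch            # callable with a requests.get-like signature
        self.config = dict(http_cfg)  # stored settings, exposed as a plain dict

    def get(self, url, **overrides):
        # Start from the stored configuration, then apply per-call overrides.
        kwargs = dict(self.config)
        kwargs.update(overrides)
        return self.fetch(url, **kwargs)


calls = []


def fake_fetch(url, **kwargs):
    """Stand-in for an HTTP function; records what it was called with."""
    calls.append((url, kwargs))
    return "ok"


client = SketchClient(
    fake_fetch,
    proxies={"http": "http://proxy.example:3128"},
    timeout=10,
)
client.get("http://sdmx.example/data/EXR")
client.get("http://sdmx.example/dataflow", timeout=60)
```

The second call overrides the timeout for that request only; the stored configuration is left unchanged.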

v0.2.1 (2015-04-22)

API: add support for zip archives received from an SDMX server. This is common for large datasets from Eurostat
automatically fetch a remote resource if the footer of a received message specifies a URL. This pattern is common for large datasets from Eurostat.
allow passing a file-like object to api.Request.get()
enhance documentation
make pandas writer parse more time period formats and increase its performance
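
As an illustration of the zip-archive handling mentioned in the v0.2.1 notes above, the following sketch shows one way to treat a received payload uniformly whether or not it arrives zipped. The helper open_sdmx_payload is a hypothetical name, and taking the first archive member is an assumption made for the example, not a statement about how pandaSDMX selects the file.

```python
import io
import zipfile


def open_sdmx_payload(raw_bytes):
    """Return a file-like object with the message body.

    If the payload is a zip archive, the first member is extracted
    (an assumption for this sketch); otherwise the bytes are wrapped as-is.
    """
    buffer = io.BytesIO(raw_bytes)
    if zipfile.is_zipfile(buffer):
        buffer.seek(0)
        with zipfile.ZipFile(buffer) as archive:
            first_member = archive.namelist()[0]
            return io.BytesIO(archive.read(first_member))
    buffer.seek(0)
    return buffer


# Build a small in-memory zip to exercise both code paths.
xml = b"<message>demo payload</message>"
zipped = io.BytesIO()
with zipfile.ZipFile(zipped, "w") as archive:
    archive.writestr("data.xml", xml)

from_zip = open_sdmx_payload(zipped.getvalue()).read()
from_plain = open_sdmx_payload(xml).read()
```

Returning a file-like object in both cases lets downstream parsing code stay the same regardless of how the server packaged the response.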

v0.2.0 (2015-04-13)

This version is a quantum leap. The whole project has been redesigned and rewritten from scratch to provide robust support for many SDMX features. The new architecture is centered around a pythonic representation of the SDMX information model. It is extensible through readers and writers for alternative input and output formats. Export to pandas has been dramatically improved. Sphinx documentation has been added.

v0.1 (2014-09)

Initial release

Project details

Release history

1.10.0

Feb 25, 2023

1.9.0

Feb 23, 2022

1.8.1

Feb 2, 2022

1.8.0

Jan 28, 2022

1.7.1

Jan 26, 2022

1.7.0

Jan 13, 2022

1.6.0

May 26, 2021

1.5.0

Apr 11, 2021

1.4.2

Mar 28, 2021

1.4.1

Feb 25, 2021

1.4.0

Feb 4, 2021

1.3.0

Jan 1, 2021

1.2.0

Oct 28, 2020

1.1.0

Aug 4, 2020

1.0.1

May 28, 2020

1.0.0

May 15, 2020

1.0.0rc2 pre-release

May 14, 2020

1.0.0rc1 pre-release

May 13, 2020

1.0.0b2 pre-release

Jan 21, 2020

1.0b1 pre-release

Aug 21, 2019

0.9

Jul 12, 2018

0.8.2

Dec 23, 2017

0.8.1

Dec 20, 2017

0.8

Dec 12, 2017

0.7.0

Jun 10, 2017

0.6.1

Feb 3, 2017

0.6

Jan 7, 2017

0.5.2

Oct 30, 2016

0.5.1

Oct 23, 2016

0.5

Oct 23, 2016

This version

0.4.1

Apr 12, 2016

0.3.1

Oct 5, 2015

0.3.0

Sep 22, 2015

0.2.2

May 19, 2015

0.2.1

Apr 26, 2015

0.2.0

Apr 13, 2015

0.1.2

Sep 17, 2014

0.1.1

Sep 16, 2014

0.1

Sep 7, 2014

Download files

Source Distribution

pandaSDMX-0.4.1.tar.gz (276.8 kB)

Uploaded Apr 12, 2016 Source

Built Distribution

pandaSDMX-0.4.1-py2.py3-none-any.whl (32.8 kB)

Uploaded Apr 12, 2016 Python 2 Python 3

Hashes for pandaSDMX-0.4.1.tar.gz
Algorithm	Hash digest
SHA256	`546e2a1de37c29c1bc349460cbe798c9cea07accdaae0e3a9ce1fc60fe4dc62d`
MD5	`b7fffea041a24b76b4aa108927bd400a`
BLAKE2b-256	`675d05665ac53ec7d40c8d6210f922356fbcc4d662303671446b469ff371bda4`

Hashes for pandaSDMX-0.4.1-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`aa91ee3eb05a2dfb85ebe7369961bc2418292510366af62a5fe8a3a4ba7fa259`
MD5	`c85f82435cc94f2f1110b1e6541aa910`
BLAKE2b-256	`b72d2eb2353c2a6ed2235c6a291de8748982046f90eb7c04793bfa8c9e9a8e4f`
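
A downloaded file can be compared against the SHA256 digests listed above with Python's standard hashlib module. The sketch below is one possible way to do this. The helper names are made up for the example, and the demonstration hashes a small temporary file rather than the actual distribution.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Digest copied from the table above for pandaSDMX-0.4.1.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "546e2a1de37c29c1bc349460cbe798c9cea07accdaae0e3a9ce1fc60fe4dc62d"
)

# Demonstration on a temporary file (not the real download).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
    demo_path = tmp.name

demo_digest = sha256_of_file(demo_path)
demo_matches_sdist = matches_published_hash(demo_path, EXPECTED_SDIST_SHA256)
os.remove(demo_path)
```

For a real check, pass the path of the downloaded archive and the matching digest from the table to matches_published_hash.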