Skip to main content
PyCon US is happening May 14th-22nd in Pittsburgh, PA USA.  Learn more

Big dada: Download data from lots of open data platforms.

Project description

This goes through all the datasets I know about.

for dataset in pluplusch():
    print(dataset)

You can tune it a bit with the parameters.

pluplusch(catalogs = ['http://data.enseignementsup-recherche.gouv.fr'],
          cache_dir = '/lockers/tlevine_vol/dadawarehouse.thomaslevine.com/big/pluplusch')

If you want to save the data catalog metadata every so often, you might write a crontab that looks like this.

@weekly pluplusch --cache-dir ~/$(date --rfc-3339=date)

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page