A namespace package for data files

DataRepos - A Namespace Package for Data Files

This package provides a namespace intended for data files.

It lets you read data files through one common package while the data itself is distributed across several separately installed packages.

For example, if your data files are very large (hundreds of megabytes or more), you can split them across different packages and still access them from the same namespace in your Python code.
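The mechanism can be illustrated without installing anything. The following is a minimal sketch, not DataRepos' own code: it builds two temporary directories that each hold a portion of the same namespace and shows that Python merges them on import. The package name demo_data_ns, the directory names, and the placeholder files are all hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name, chosen so it cannot clash with an installed package.
NAMESPACE = "demo_data_ns"

# Two separate "distributions", each shipping its own portion of the namespace.
# Neither portion contains an __init__.py; that is what makes it a namespace package.
base = Path(tempfile.mkdtemp())
for dist, filename in [("dist_core", "countries.csv"), ("dist_cars", "cars.csv")]:
    portion = base / dist / NAMESPACE
    portion.mkdir(parents=True)
    (portion / filename).write_text("placeholder\n")
    sys.path.append(str(base / dist))

importlib.invalidate_caches()
package = importlib.import_module(NAMESPACE)

# The package's __path__ lists every portion found on the import path.
portions = [Path(p) for p in package.__path__]
files = sorted(f.name for p in portions for f in p.iterdir())
print(len(portions))
print(files)
```

Both directories end up in the package's __path__, so files from either one are reachable through the single demo_data_ns name.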

To use DataRepos, install it into your virtual environment:

(venv) $ python -m pip install data-repos

You can now import DataRepos as data_repos.

Read Data Files

DataRepos provides a read module whose data() function reads a data file by name. The data are returned as a pandas DataFrame:

>>> from data_repos import read
>>> read.data("countries")
              country   population
0             Austria      8840521
1              Canada     37057765
2                Cuba     11338138
3  Dominican Republic     10627165
4             Germany     82905782
5              Norway      5311916
...
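How read.data() locates a file is not described here. As a rough sketch of one possible approach, the code below searches every portion of a namespace package for a file with a matching name. It assumes each dataset is stored as <name>.csv directly inside the package folder, which is an assumption and not something this page states. The names demo_reader_ns, find_data_file, and read_rows are hypothetical, and the standard csv module is used to keep the sketch dependency-free (pandas.read_csv on the resolved path would give a DataFrame instead).

```python
import csv
import importlib
import sys
import tempfile
from pathlib import Path

NAMESPACE = "demo_reader_ns"  # hypothetical stand-in for data_repos

# Set up one namespace portion holding a small CSV (two rows from the table above).
root = Path(tempfile.mkdtemp())
(root / NAMESPACE).mkdir()
(root / NAMESPACE / "countries.csv").write_text(
    "country,population\nAustria,8840521\nCanada,37057765\n"
)
sys.path.append(str(root))
importlib.invalidate_caches()


def find_data_file(name):
    """Return the first <name>.csv found in any portion of the namespace."""
    package = importlib.import_module(NAMESPACE)
    for portion in package.__path__:
        candidate = Path(portion) / f"{name}.csv"
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no data file named {name!r}")


def read_rows(name):
    """Parse the CSV into a list of dicts, one per row."""
    with find_data_file(name).open(newline="") as handle:
        return list(csv.DictReader(handle))


rows = read_rows("countries")
print(rows[0]["country"], rows[0]["population"])
```

Because the lookup walks __path__ each time it is called, a portion installed later is picked up without changing the reader.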

Install Data Files

You can install additional data files from PyPI with pip. Cooperating DataRepos packages add their files to the same namespace, so they integrate smoothly:

(venv) $ python -m pip install data-repos-cars

You can then read the cars dataset with the exact same syntax:

>>> from data_repos import read
>>> read.data("cars")
...

Because DataRepos is a namespace package, it can be extended on the fly.

Available Data Files

Two datasets are included as examples:

  • iris: The classical Iris dataset, originally published by Ronald Fisher in 1936
  • countries: Countries and their population, collected by Samayo

You can read these files with read.data("iris") and read.data("countries"), respectively.

Add Your Own Data Files

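You can also add your own data files by storing them in a folder named data_repos whose parent directory is on Python's import path (sys.path). The folder must not contain an __init__.py file, or it will no longer act as part of the namespace.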
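The folder layout can be sketched as follows. A temporary directory stands in for your project directory, and the file name my_data.csv, its .csv extension, and its columns are hypothetical. The check at the end only confirms that Python sees the local folder as part of the data_repos namespace; it does not call DataRepos itself.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# A temporary directory stands in for your project directory.
project_dir = Path(tempfile.mkdtemp())

# Create the data_repos folder WITHOUT an __init__.py, so it stays a namespace portion.
local_portion = project_dir / "data_repos"
local_portion.mkdir()

# Hypothetical dataset; the .csv extension and the columns are assumptions.
(local_portion / "my_data.csv").write_text("name,value\nexample,1\n")

# The parent of data_repos (not data_repos itself) must be on the import path.
sys.path.append(str(project_dir))
importlib.invalidate_caches()

# Confirm that the local folder is one of the locations backing the namespace.
spec = importlib.util.find_spec("data_repos")
locations = [Path(p).resolve() for p in spec.submodule_search_locations]
print(local_portion.resolve() in locations)
```

If DataRepos resolves datasets by file name, as the read.data("countries") example suggests, the new file would presumably be readable as read.data("my_data"). That naming convention is an assumption here.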

See examples of how to do this and learn more about namespace packages in What's a Python Namespace Package, and What's It For?

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

data-repos-1.0.0.tar.gz (11.2 kB)

Uploaded Source

Built Distribution

data_repos-1.0.0-py3-none-any.whl (8.9 kB)

Uploaded Python 3
