cloud-native dataset library for accessing histopathological datasets

PADO: PAthological Data Obsession

Welcome to pado :wave:, a dataset library for accessing histopathological datasets in a standardized way from Python.

pado's goal is to provide a unified way to access data from diverse datasets. Its scope is very small and the design tries to keep everything simple.

As always: if pado is unpythonic, unintuitive, or slow, or if its documentation is confusing, it's a bug in pado. Feel free to report any issues or feature requests in the issue tracker!

Development happens on GitHub :octocat:

Quickstart

To quickly get a pado dataset for testing and for familiarizing yourself with the interface, you can create a fake dataset, the same one that's used in the internal tests.

>>> from pado.mock import mock_dataset
>>> ds = mock_dataset(None)
>>> ds
PadoDataset('memory://pado-f5869e41-5246-4378-9057-96fda1c40edf', mode='r+')

This creates a test dataset in memory with 3 images and some fake metadata:

>>> len(ds)
3
>>> ds.index
(ImageId('mock_image_0.svs', site='mock'),
 ImageId('mock_image_1.svs', site='mock'),
 ImageId('mock_image_2.svs', site='mock'))
>>> ds[0].image
Image(...)
>>> ds[0].metadata
                                          A  B  C  D
ImageId('mock_image_0.svs', site='mock')  a  2  c  4
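
The session above uses only len(ds), ds.index and integer indexing. A minimal sketch of looping over a dataset with just those operations could look like the following. It assumes that ds.index is iterable in the same order as the integer positions, which the output above suggests but which is not confirmed here. FakeDataset and iter_items are hypothetical names used for illustration only, so that the sketch runs without pado installed; they are not part of pado.

```python
from types import SimpleNamespace


def iter_items(ds):
    """Yield (image_id, item) pairs from a dataset-like object.

    Relies only on ds.index and ds[position], the operations shown in
    the quickstart. Assumes ds.index follows the positional order.
    """
    for position, image_id in enumerate(ds.index):
        yield image_id, ds[position]


class FakeDataset:
    """Hypothetical stand-in exposing len(), .index and integer indexing."""

    def __init__(self, ids):
        self.index = tuple(ids)

    def __len__(self):
        return len(self.index)

    def __getitem__(self, position):
        # Each item carries .image and .metadata, mirroring the quickstart.
        return SimpleNamespace(image=None, metadata={"position": position})


ds = FakeDataset(["mock_image_0.svs", "mock_image_1.svs", "mock_image_2.svs"])
pairs = list(iter_items(ds))
for image_id, item in pairs:
    print(image_id, item.metadata)
```

With pado installed, passing the object returned by mock_dataset(None) instead of the stand-in should behave the same way under the assumption stated above.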

Documentation

The documentation is currently provided in this repository and has to be built with Sphinx. It'll be available online soon.

To build it, in the repository root, run

python -m pip install -e ".[docs]"
cd docs
make html

Then open the documentation at docs/build/html/index.html

Development Installation

pado can be installed directly via pip:

pip install "git+https://github.com/Bayer-Group/pado@main#egg=pado[cli,create]"

or for development you can clone and install via:

git clone https://github.com/Bayer-Group/pado.git
cd pado
pip install -e ".[cli,create,dev]"

if you prefer conda environments:

git clone https://github.com/Bayer-Group/pado.git
cd pado
conda install conda-devenv
conda devenv
conda activate pado

Note that in this environment pado is already installed in development mode, so go ahead and hack.
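
After any of the installation routes above, one way to check that the package is visible to the current interpreter is to ask the standard library for its installed version. This is only a sketch: installed_version is a hypothetical helper, not part of pado, and it assumes the distribution is registered under the name "pado".

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Assumption: the distribution name is "pado".
pado_version = installed_version("pado")
if pado_version is None:
    print("pado is not installed in this environment")
else:
    print(f"pado {pado_version} is installed")
```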

Contributing Guidelines

  • Please use numpy docstrings.
  • When contributing code, please try to use Pull Requests.
  • Tests go hand in hand with modules, in tests packages at the same level. We use pytest.
  • Please install pre-commit and install the hooks by running pre-commit install in the project root folder.
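
As an illustration of the first and third guidelines, here is a small sketch of a function with a numpy-style docstring and a matching pytest test. The function tile_count is a hypothetical example and not part of pado; in a real contribution the test would live in the tests package next to the module.

```python
def tile_count(width, height, tile_size):
    """Count the square tiles needed to cover an image.

    Parameters
    ----------
    width : int
        Image width in pixels.
    height : int
        Image height in pixels.
    tile_size : int
        Edge length of a square tile in pixels. Must be positive.

    Returns
    -------
    int
        Number of tiles, counting partial tiles at the right and bottom edges.

    Raises
    ------
    ValueError
        If ``tile_size`` is not positive.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    columns = -(-width // tile_size)  # ceiling division
    rows = -(-height // tile_size)
    return columns * rows


def test_tile_count_counts_partial_tiles():
    # pytest collects functions named test_*; plain asserts are enough.
    assert tile_count(1000, 500, 256) == 8
```

Running pytest on the file that contains the test function picks it up automatically by its test_ prefix.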

You can set up your IDE to help you adhere to these guidelines.
(Santi is happy to help you set up PyCharm in 5 minutes)

Acknowledgements

Built with love by Santi Villalba and Andreas Poehlmann from the Machine Learning Research group at Bayer.

pado: copyright 2020-2022 Bayer AG
