💾 ir-datasets-clueweb22

Extension for accessing the ClueWeb22 dataset via ir_datasets.

Installation

Install the package from PyPI:

pip install ir-datasets-clueweb22

Usage

Register the additional datasets by calling register(). After that, you can load them with ir_datasets as usual:

from ir_datasets import load
from ir_datasets_clueweb22 import register

# Register the ClueWeb22 datasets.
register()
# Use ir_datasets as usual.
dataset = load("clueweb22/b")
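Once a dataset is loaded, a quick sanity check is to look at the first few documents instead of reading the whole corpus. The following is a minimal sketch, not part of this package: peek_doc_ids is a hypothetical helper, and it assumes that the ClueWeb22 document records expose a doc_id attribute, as documents in ir_datasets conventionally do.

```python
from itertools import islice
from typing import Any, List


def peek_doc_ids(dataset: Any, n: int = 3) -> List[str]:
    """Return the IDs of the first n documents of an ir_datasets dataset.

    Hypothetical helper for illustration; it is not provided by
    ir_datasets or ir_datasets_clueweb22.
    """
    # docs_iter() streams documents lazily, so islice stops after n
    # documents and the rest of the corpus is never read.
    # Assumption: each document record has a doc_id attribute.
    return [doc.doc_id for doc in islice(dataset.docs_iter(), n)]
```

With the datasets registered as shown above, calling peek_doc_ids(dataset) should return the first three document IDs, provided a local copy of ClueWeb22 is available where ir_datasets expects it.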

If you want to use the CLI, run the ir_datasets_clueweb22 command instead of ir_datasets. All CLI commands work as usual, e.g., to list the available datasets:

ir_datasets_clueweb22 list

Development

To build this package and contribute to its development, you need to install the build, setuptools, and wheel packages (pre-installed on most systems):

pip install build setuptools wheel

Create and activate a virtual environment:

python3.10 -m venv venv/
source venv/bin/activate

Dependencies

Install the package and test dependencies:

pip install -e ".[tests]"

Testing

Verify your changes against the test suite before submitting them:

ruff check .                   # Code format and lint
mypy .                         # Static typing
bandit -c pyproject.toml -r .  # Security
pytest .                       # Unit tests

Please also add tests for your newly developed code.

Build wheels

Wheels for this package can be built with:

python -m build

Support

If you have any problems using this package, please file an issue. We're happy to help!

License

This repository is released under the MIT license.
