Utilities for vision related tasks

Optical

A collection of utilities for ML vision related tasks.

What is optical?

Object detection is one of the mainstream computer vision tasks. However, training an object detection model means dealing with a variety of annotation formats that differ from model to model, e.g. COCO, PASCAL VOC, YOLO and so on. optical provides a simple interface to convert back and forth between these annotation formats and to perform exploratory data analysis (EDA) on these datasets regardless of their source format.
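
To make the format differences concrete, here is a minimal sketch of how a single bounding box maps between the three conventions. It is not optical's implementation; the function names are hypothetical, and it assumes COCO boxes are `[x_min, y_min, width, height]` in pixels, PASCAL VOC boxes are `[x_min, y_min, x_max, y_max]` in pixels, and YOLO boxes are `[x_center, y_center, width, height]` normalised by the image size.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def coco_to_voc(box: Box) -> Box:
    """COCO [x_min, y_min, width, height] -> VOC [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def voc_to_yolo(box: Box, img_w: int, img_h: int) -> Box:
    """VOC pixel corners -> YOLO normalised [x_center, y_center, width, height]."""
    x_min, y_min, x_max, y_max = box
    return (
        (x_min + x_max) / 2 / img_w,
        (y_min + y_max) / 2 / img_h,
        (x_max - x_min) / img_w,
        (y_max - y_min) / img_h,
    )


def yolo_to_coco(box: Box, img_w: int, img_h: int) -> Box:
    """YOLO normalised box -> COCO pixel [x_min, y_min, width, height]."""
    xc, yc, w, h = box
    return ((xc - w / 2) * img_w, (yc - h / 2) * img_h, w * img_w, h * img_h)
```

Chaining the three functions returns the original COCO box up to floating-point rounding, which is a handy sanity check when comparing converted datasets.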

:star2: At present we support the following formats:

Installation

optical can be installed with pip:

pip install optical

For conversion to (or from) TFRecord, please install the tensorflow extra:

pip install "optical[tensorflow]"

For visualisation of images in mediapy format, you need to have ffmpeg installed on your system.

Getting Started

Declare the imports:

from optical import Annotation

Read the annotations:

annotation = Annotation(root="/path/to/dataset", format="coco")

optical expects the data to be organised in either of the following layouts:

root
├── images
│ ├── train
│ ├── val
│ └── test
└── annotations
  ├── train.json
  ├── val.json
  └── test.json

Note that for annotation formats that require an individual annotation file for each image (e.g., PASCAL VOC or YOLO), the annotations directory should contain the same sub-directories as images. Splits that do not have an annotation will be ignored.
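
If you want to check a dataset against this layout before loading it, a short sketch like the following can help. The helper `find_splits` is hypothetical and not part of optical; it only assumes the `images` and `annotations` directory names shown above and treats a split as usable when it appears in both.

```python
from pathlib import Path
from typing import List


def find_splits(root: str) -> List[str]:
    """Return split names that have both an image directory and an annotation."""
    root_path = Path(root)
    image_dirs = {p.name for p in (root_path / "images").iterdir() if p.is_dir()}
    # Single-file formats store `<split>.json`; per-image formats store a
    # `<split>/` directory, so files are matched by stem and directories by name.
    annotated = {
        p.stem if p.is_file() else p.name
        for p in (root_path / "annotations").iterdir()
    }
    return sorted(image_dirs & annotated)
```

Splits that appear under `images` but not under `annotations` are left out of the result, mirroring the rule that splits without an annotation are ignored.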

If your data is not split to begin with, that is acceptable too. In that case, the directory layout should look like this:

root
├── images
│ ├── 1.jpg
│ ├── 2.jpg
│ ├── ...
│ │
│ └── 100.jpg
│
└── annotations
  └── label.json

The name of the annotation file is not important in this case. However, if your format requires individual annotation files, each annotation file must have the same name as its image (apart from the file extension).
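
For per-image formats, a quick way to spot naming mismatches is to compare file stems. The sketch below is an illustration only: `images_without_annotation` is a hypothetical helper, and the set of image extensions is an assumption you may need to adjust.

```python
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def images_without_annotation(image_dir: str, annotation_dir: str) -> List[str]:
    """List image files that have no annotation file with the same stem."""
    annotation_stems = {
        p.stem for p in Path(annotation_dir).iterdir() if p.is_file()
    }
    return sorted(
        p.name
        for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES and p.stem not in annotation_stems
    )
```

An empty result means every image has a matching annotation file, for example `1.jpg` alongside `1.xml` or `1.txt`.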

EDA

Check data distribution

>>> annotation.describe()

| split | images | annotations | categories |
| ----- | ------ | ----------- | ---------- |
| train | 729    | 1121        | 3          |
| valid | 250    | 322         | 3          |
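
To show where such numbers can come from, here is a minimal sketch that derives the same three counts from a COCO-style dictionary. It is not how `describe()` is implemented; `summarise_coco` is a hypothetical name, and the sketch assumes the standard top-level COCO keys `images`, `annotations` and `categories`.

```python
from typing import Any, Dict


def summarise_coco(coco: Dict[str, Any]) -> Dict[str, int]:
    """Count images, annotations and categories in a COCO-style dictionary."""
    return {
        "images": len(coco.get("images", [])),
        "annotations": len(coco.get("annotations", [])),
        "categories": len(coco.get("categories", [])),
    }
```

Loading each split's JSON file with `json.load` and passing the result to this function yields one row of the table per split.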

Plot label distribution

>>> annotation.show_distribution()

Scatter bounding box width and height

>>> annotation.bbox_scatter()

Visualize images

>>> vis = annotation.visualizer(img_size=256)
>>> vis.show_batch()

Split the data if required

>>> splits = annotation.train_test_split(test_size=0.2, stratified=True)
>>> splits.save("/path/to/output/dir")
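
The idea behind a stratified split can be sketched in a few lines. This is an illustration under simplifying assumptions, not optical's algorithm: `stratified_split` is a hypothetical function, and it assumes each image has been reduced to a single label (for example its most frequent category), whereas real detection images often contain several categories.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def stratified_split(
    items: Sequence[str],
    labels: Sequence[Hashable],
    test_size: float = 0.2,
    seed: int = 42,
) -> Tuple[List[str], List[str]]:
    """Split items so each label keeps roughly the same share in both parts."""
    by_label: Dict[Hashable, List[str]] = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: List[str] = []
    test: List[str] = []
    for group in by_label.values():
        shuffled = group[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_size)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test
```

Splitting within each label group, rather than across the whole dataset, keeps rare categories represented in both the train and test parts.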

Export to other formats

>>> annotation.export(to="yolo")

Contributing

Work in a local environment:

Fork the repo.

Install poetry:

curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python -

Work in a virtual environment:
```
conda create -n optical python=3.8 pip
```
Install the dependencies and the project in editable mode:
```
poetry install
```
Make your changes as required. Please write appropriate docstrings (we follow the Google docstring style) and try to keep your code clean.
Raise a pull request.

Work inside the dev container:

If you are a Visual Studio Code user, you may choose to develop inside a container. The benefit is that the container comes with all necessary settings and dependencies configured. You will need Docker installed on your system and the Remote - Containers extension enabled.

Open the project in Visual Studio Code. In the status bar, select the option to open it in a remote container.

Building the container may take a few minutes the first time.

Release history

0.0.2 (Oct 14, 2021)

0.0.1 (Apr 13, 2021)
