General deep learning utility library

DeepZensols Deep Learning Framework

This deep learning library was designed to provide consistent and reproducible results.

See the full documentation.
Paper on arXiv.

Features:

An easy-to-configure framework that allows for programmatic debugging of neural networks.
Reproducibility of results
- All random seed state is persisted in the trained model files.
- Persisting of keys and key order across train, validation and test sets.
Analysis of results with complete metrics available.
A vectorization framework that allows for pickling tensors.
Additional layers:
- Full BiLSTM-CRF and stand-alone CRF implementation using easy to configure constituent layers.
- Easy-to-configure N deep convolution layers with automatic dimensionality calculation and configurable pooling and batch centering.
- Convolutional layer factory with dimensionality calculation.
- Recurrent layers that abstract RNN, GRU and LSTM.
- N deep linear layers.
- Each layer is configurable with activation, dropout and batch normalization.
Pandas integration to load data, easily manage vectorized features, and report results.
Multi-processing for time-consuming CPU feature vectorization, requiring little to no coding.
Resource and tensor deallocation with memory management.
Real-time performance and loss metrics with plotting while training.
Thorough unit test coverage.
Debugging of layers using the easy-to-configure Python logging module and control points.
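
The two reproducibility items in the list above (persisted random state, and persisted keys and key order across the data splits) can be illustrated with a minimal standard-library sketch. This is not this library's implementation or API: the names `split_keys`, `save_run_state` and `restore_run_state`, the 60/20/20 split and the dictionary layout are all assumptions made for illustration. In a PyTorch setting, `torch.get_rng_state()` and `torch.set_rng_state()` play the role that `random.getstate()` and `random.setstate()` play here.

```python
import pickle
import random
from typing import Dict, List, Tuple

Splits = Dict[str, Tuple[str, ...]]


def split_keys(keys: List[str], seed: int) -> Splits:
    """Shuffle keys with a private generator and split them 60/20/20."""
    rng = random.Random(seed)
    shuffled = list(keys)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * 0.6)
    n_val = int(len(shuffled) * 0.2)
    return {
        'train': tuple(shuffled[:n_train]),
        'validation': tuple(shuffled[n_train:n_train + n_val]),
        'test': tuple(shuffled[n_train + n_val:]),
    }


def save_run_state(splits: Splits) -> bytes:
    """Serialize the global random state together with the split keys."""
    return pickle.dumps({'rng_state': random.getstate(), 'splits': splits})


def restore_run_state(blob: bytes) -> Splits:
    """Restore the global random state and return the persisted splits."""
    state = pickle.loads(blob)
    random.setstate(state['rng_state'])
    return state['splits']


keys = [str(i) for i in range(10)]
splits = split_keys(keys, seed=0)

random.seed(0)
blob = save_run_state(splits)   # stands in for the saved model file
first_draw = [random.random() for _ in range(3)]

restored = restore_run_state(blob)
second_draw = [random.random() for _ in range(3)]

print(restored == splits, first_draw == second_draw)
```

Because the random state is captured before any numbers are drawn, restoring it replays the same sequence, and the restored splits contain the same keys in the same order.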

Much of the code provides convenience functionality to PyTorch. However, there is functionality that could be used for other deep learning APIs.

Documentation

See the full documentation.

Obtaining

The easiest way to install the command line program is via the pip installer:

pip3 install zensols.deeplearn

Binaries are also available on PyPI.
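
After installing, one quick way to confirm which version is present is to query the package metadata. The following is a small sketch using only the Python standard library; the helper name `installed_version` is hypothetical, and the distribution name is the one passed to pip above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = 'zensols.deeplearn') -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version()
print(f'zensols.deeplearn: {found or "not installed"}')
```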

Workflow

This package provides a workflow for processing features, training and then testing a model. A high level outline of this process follows:

Container objects are used to represent and access data as features.
Instances of data points wrap the container objects.
Vectorize the features of each data point into tensors.
Store the vectorized tensor features to disk so they can be retrieved quickly and frequently.
At train time, load the vectorized features into memory and train.
Test the model and store the results to disk.
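
The outline above can be sketched end to end in plain Python. This is a toy illustration under stated assumptions, not this library's classes or API: the names `Flower`, `DataPoint`, `vectorize`, `store`, `load`, `train` and `predict` are hypothetical, plain lists stand in for tensors so the sketch runs without PyTorch, a nearest-centroid rule stands in for a neural network, and the measurements are illustrative values. Storing the test results to disk is omitted for brevity.

```python
import pickle
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List, Tuple

Vector = Tuple[List[float], str]


@dataclass
class Flower:
    """Container object: the raw features of one observation."""
    sepal_length: float
    petal_length: float
    species: str


@dataclass
class DataPoint:
    """Data point: wraps a container and gives it a stable key."""
    key: str
    flower: Flower


def vectorize(point: DataPoint) -> Vector:
    """Turn a data point into a feature vector and a label."""
    flower = point.flower
    return [flower.sepal_length, flower.petal_length], flower.species


def store(points: List[DataPoint], directory: Path) -> None:
    """Vectorize each data point once and write the result to disk."""
    for point in points:
        with open(directory / f'{point.key}.dat', 'wb') as f:
            pickle.dump(vectorize(point), f)


def load(directory: Path) -> List[Vector]:
    """Read the stored vectors back in sorted file name order."""
    return [pickle.loads(p.read_bytes()) for p in sorted(directory.glob('*.dat'))]


def train(batch: List[Vector]) -> Dict[str, List[float]]:
    """Fit a toy nearest-centroid model: the mean vector per label."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for vec, label in batch:
        acc = sums.get(label, [0.0] * len(vec))
        sums[label] = [a + v for a, v in zip(acc, vec)]
        counts[label] = counts.get(label, 0) + 1
    return {lb: [s / counts[lb] for s in sums[lb]] for lb in sums}


def predict(model: Dict[str, List[float]], vec: List[float]) -> str:
    """Return the label whose centroid is closest to the vector."""
    def squared_distance(label: str) -> float:
        return sum((a - b) ** 2 for a, b in zip(model[label], vec))
    return min(model, key=squared_distance)


points = [
    DataPoint('0', Flower(5.1, 1.4, 'setosa')),
    DataPoint('1', Flower(4.9, 1.5, 'setosa')),
    DataPoint('2', Flower(6.4, 4.5, 'versicolor')),
    DataPoint('3', Flower(6.6, 4.6, 'versicolor')),
]
with tempfile.TemporaryDirectory() as tmp:
    store(points, Path(tmp))      # vectorize once, persist to disk
    batch = load(Path(tmp))       # at train time, load into memory
model = train(batch)
prediction = predict(model, [5.0, 1.3])
print(prediction)
```

The point of the separation is that the potentially expensive vectorization step runs once, while training can reload the stored vectors as often as needed.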

To jump right in, see the examples section. However, it is better to peruse the in-depth explanation that accompanies the Iris example code, which covers:

The initial data processing, which includes data representation to batch creation.
Creating and configuring the model.
Using a facade to train, validate and test the model.
Analysis of results, including training/validation loss graphs and performance metrics.

Examples

The Iris example (also see the Iris example configuration) is the most basic example of how to use this framework. This example is detailed in the workflow documentation.

There are also examples in the form of Jupyter notebooks, which include the:

Iris notebook data set, which is a small data set of flower dimensions as a three label classification,
MNIST notebook for the handwritten digit data set,
debugging notebook.

Attribution

This project, or example code, uses:

PyTorch as the underlying framework.
Branched code from Torch CRF for the CRF class.
pycuda for Python integration with CUDA.
scipy for scientific utility.
Pandas for prediction output.
matplotlib for plotting loss curves.

Corpora used include:

Torch CRF

The CRF class was taken and modified from Kemal Kurniawan's pytorch_crf GitHub repository. See the README.md module documentation for more information. This module was forked from pytorch_crf with modifications. However, the modifications were not merged and the project appears to be inactive.

Important: This project will change to use it as a dependency pending merging of the changes needed by this project. Until then, it will remain as a separate class in this project, which is easier to maintain as the only class/code is the CRF class.

The pytorch_crf repository uses the same license as this repository, which is the MIT License. For this reason, there are no software/package tainting issues.

Citation

If you use this project in your research, please use the following BibTeX entry:

@article{Landes_DiEugenio_Caragea_2021,
  title={DeepZensols: Deep Natural Language Processing Framework},
  url={http://arxiv.org/abs/2109.03383},
  note={arXiv: 2109.03383},
  journal={arXiv:2109.03383 [cs]},
  author={Landes, Paul and Di Eugenio, Barbara and Caragea, Cornelia},
  year={2021},
  month={Sep}
}

Community

Please star the project and let me know how and where you use this API. Contributions as pull requests, feedback and any input are welcome.

Changelog

An extensive changelog is available here.

License

MIT License

Release history

Version	Released
1.11.1	Mar 14, 2024
1.11.0	Mar 7, 2024
1.10.0	Feb 27, 2024
1.9.0	Dec 6, 2023
1.8.1	Aug 17, 2023
1.8.0	Aug 16, 2023
1.7.0	Jun 9, 2023
1.6.1	Jun 8, 2023
1.6.0	Feb 2, 2023
1.5.2	Nov 6, 2022
1.5.1	Oct 2, 2022
1.5.0	Oct 1, 2022
1.4.0	Aug 9, 2022
1.3.0	Jun 14, 2022
1.2.0	May 15, 2022
1.1.1	May 4, 2022
1.1.0	May 4, 2022 (this version)
1.0.0	Feb 12, 2022
0.1.8	Jan 26, 2022
0.1.7	Jan 25, 2022
0.1.6	Oct 22, 2021
0.1.5	Sep 21, 2021
0.1.4	Sep 8, 2021
0.1.3	Aug 7, 2021
0.1.2	Apr 29, 2021
0.1.1	Dec 29, 2020
0.1.0	Dec 10, 2020
0.0.6	May 11, 2020

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

zensols.deeplearn-1.1.0-py3.9.egg (369.9 kB), uploaded May 4, 2022, Source
zensols.deeplearn-1.1.0-py3-none-any.whl (155.5 kB), uploaded May 4, 2022, Python 3

Hashes for zensols.deeplearn-1.1.0-py3.9.egg
Algorithm	Hash digest
SHA256	`78cf1b5ca3be76048d37f024bde4ddd4f2a89d575e6d9064adabee852c6ddb42`
MD5	`00ce50a2a070fbb62a38385fa77dd854`
BLAKE2b-256	`ad24836e36cbe6cf2d6191d58ed2adf35cb3034665be5c5e4c784e7020a26db6`

Hashes for zensols.deeplearn-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2668e32ae60e53c7b3ca89961310c1c337552aaa85a5b28d97cbb04203aa6233`
MD5	`3582a489ed71f38e186118ff266ae2a3`
BLAKE2b-256	`fcebed0f6a0779595a5b795d565435b0ca9038e885b02cbbd49a44b547342587`
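
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal standard-library sketch; the helper name `sha256_matches` and the assumption that the wheel sits in the current directory are illustrative only.

```python
import hashlib
from pathlib import Path


def sha256_matches(path: Path, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()


# Expected digest copied from the wheel's table above.
WHEEL_SHA256 = '2668e32ae60e53c7b3ca89961310c1c337552aaa85a5b28d97cbb04203aa6233'
# Assumed download location; adjust to wherever the file was saved.
wheel = Path('zensols.deeplearn-1.1.0-py3-none-any.whl')
if wheel.exists():
    print('hash OK' if sha256_matches(wheel, WHEEL_SHA256) else 'hash MISMATCH')
```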