
A hierarchical divisive clustering toolbox


HiPart: Hierarchical divisive clustering toolbox

This repository presents the HiPart package, an open-source, native Python library that provides efficient and interpretable implementations of divisive hierarchical clustering algorithms. HiPart supports interactive visualizations that let users manipulate the execution steps and intervene directly in the clustering outcome. The package is well suited to Big Data applications, because its development has focused on the computational efficiency of the implemented clustering methods. Its dependencies are either Python built-in packages or stable, well-maintained external packages. The software is provided under the MIT license.

Installation

Installing the package requires only Python 3.8 or later and the execution of the following command.

pip install HiPart
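To confirm that the environment meets the requirement above, a minimal sketch such as the following can be used. It relies only on the standard library; the helper names `python_is_supported` and `installed_version` are hypothetical and are not part of HiPart.

```python
import sys
from importlib import metadata  # available in the standard library from Python 3.8


def python_is_supported(minimum=(3, 8)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


def installed_version(distribution="HiPart"):
    """Return the installed version string of `distribution`, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


print("Python supported:", python_is_supported())
print("HiPart version:", installed_version() or "not installed")
```

If the second line reports "not installed", the pip command was probably run for a different interpreter or virtual environment than the one executing the check.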

Simple Example Execution

The example below shows the simplest use of the package. It creates a synthetic clustering dataset containing 6 clusters, clusters it with the DePDDP algorithm, and returns only the cluster labels.

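from HiPart.clustering import DePDDP
from sklearn.datasets import make_blobs

# Synthetic dataset: 1500 samples drawn around 6 cluster centers.
X, y = make_blobs(n_samples=1500, centers=6, random_state=0)

# Cluster with DePDDP and keep only the predicted cluster labels.
clustered_class = DePDDP(max_clusters_number=6).fit_predict(X)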
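For intuition about what a divisive (top-down) method does, the sketch below bisects data along its first principal direction, in the spirit of the PDDP family to which DePDDP belongs. It is an illustration under simplifying assumptions and not HiPart's implementation: it splits at the mean of the projections and always picks the largest cluster to split next, whereas the package's algorithms use their own split points and selection criteria. The function names are hypothetical, and only NumPy is assumed.

```python
import numpy as np


def principal_direction_split(X):
    """Bisect the rows of X along the first principal direction.

    Returns a boolean mask that is True for points whose centered
    projection onto the first principal component is positive.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projections = centered @ vt[0]
    return projections > 0.0


def divisive_clustering(X, max_clusters):
    """Repeatedly bisect the largest cluster until max_clusters is reached."""
    X = np.asarray(X, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for new_label in range(1, max_clusters):
        # Simplifying assumption: split the cluster with the most members.
        target = np.bincount(labels).argmax()
        members = np.flatnonzero(labels == target)
        if len(members) < 2:
            break
        mask = principal_direction_split(X[members])
        if mask.all() or not mask.any():
            break  # degenerate split (e.g. identical points); stop early
        labels[members[mask]] = new_label
    return labels


rng = np.random.default_rng(0)
# Two well-separated synthetic blobs in 2-D.
X = np.vstack([
    rng.normal(loc=(-5.0, 0.0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(5.0, 0.0), scale=0.5, size=(50, 2)),
])
labels = divisive_clustering(X, max_clusters=2)
```

Because every split is a threshold on a one-dimensional projection, each step of the hierarchy can be inspected and visualized directly.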

Complete execution examples for all the algorithms of the HiPart package are in the clustering_example file of the repository. A usage example for the KernelPCA method is in the clustering_with_kpca_example file. Finally, the interactive_visualization_example file contains an example execution of the interactive visualization; instructions for the visualization GUI are displayed when it runs.

Documentation

The full documentation of the package can be found here.

Citation

@article{Anagnostou2023HiPart,
  title = {HiPart: Hierarchical Divisive Clustering Toolbox},
  author = {Panagiotis Anagnostou and Sotiris Tasoulis and Vassilis P. Plagianakos and Dimitris Tasoulis},
  year = {2023},
  journal = {Journal of Open Source Software},
  publisher = {The Open Journal},
  volume = {8},
  number = {84},
  pages = {5024},
  doi = {10.21105/joss.05024},
  url = {https://doi.org/10.21105/joss.05024}
} 

Acknowledgments

This project has received funding from the Hellenic Foundation for Research and Innovation (HFRI), under grant agreement No 1901.

Collaborators

Dimitris Tasoulis, Panagiotis Anagnostou, Sotiris Tasoulis, Vassilis Plagianakos

