pyclustering is a Python data mining library

Introduction

PyClustering 0.6 is a collection of cluster analysis, graph coloring, and travelling salesman problem algorithms, oscillatory and neural network models, containers, and tools for visualization and result analysis. High performance is provided by the CCORE library, a C/C++ part of pyclustering that implements almost the same algorithms, models, and tools. A special flag selects either the pure Python implementation or the CCORE (C/C++) implementation. CCORE does not use the python.h interface to communicate with Python code, so that the library, or parts of its C/C++ code, can still be used in other projects.
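The flag-based choice between a Python and a native implementation can be illustrated with a minimal sketch. It assumes a ctypes-style binding, which is one common way to call C code without python.h; whether CCORE is wired up exactly this way is not stated here. The names load_native_sqrt, euclidean_norm and use_native are hypothetical and are not part of pyclustering.

```python
import ctypes
import ctypes.util
import math


def load_native_sqrt():
    """Return the C sqrt function loaded through ctypes, or None if unavailable."""
    path = ctypes.util.find_library("m") or ctypes.util.find_library("c")
    if path is None:
        return None
    try:
        native = ctypes.CDLL(path).sqrt
    except (OSError, AttributeError):
        return None
    # Declare the C signature: double sqrt(double).
    native.argtypes = [ctypes.c_double]
    native.restype = ctypes.c_double
    return native


def euclidean_norm(vector, use_native=False):
    """Hypothetical helper: the flag picks the native or the pure Python path."""
    total = float(sum(x * x for x in vector))
    native_sqrt = load_native_sqrt() if use_native else None
    if native_sqrt is not None:
        return native_sqrt(total)
    # Fall back to pure Python when the flag is off or no C library was found.
    return math.sqrt(total)


python_result = euclidean_norm([3.0, 4.0])
native_result = euclidean_norm([3.0, 4.0], use_native=True)
```

Both calls are expected to agree; the flag only changes where the computation runs, not its result.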

Library content

PyClustering consists of six general modules that contain the algorithms, models, and tools.

Cluster analysis algorithms (module pyclustering.cluster):

  • Agglomerative (pyclustering.cluster.agglomerative);

  • BIRCH (pyclustering.cluster.birch);

  • CLARANS (pyclustering.cluster.clarans);

  • CURE (pyclustering.cluster.cure);

  • DBSCAN (pyclustering.cluster.dbscan);

  • HSyncNet (bio-inspired algorithm pyclustering.cluster.hsyncnet);

  • K-Means (pyclustering.cluster.kmeans);

  • K-Medians (pyclustering.cluster.kmedians);

  • K-Medoids (pyclustering.cluster.kmedoids);

  • OPTICS (pyclustering.cluster.optics);

  • ROCK (pyclustering.cluster.rock);

  • SyncNet (bio-inspired algorithm pyclustering.cluster.syncnet);

  • SyncSOM (bio-inspired algorithm pyclustering.cluster.syncsom);

  • X-Means (pyclustering.cluster.xmeans);

Oscillatory and neural network models (module pyclustering.nnet):

  • Oscillatory network based on Hodgkin-Huxley model (pyclustering.nnet.hhn);

  • Hysteresis Oscillatory Network (pyclustering.nnet.hysteresis);

  • LEGION: Local Excitatory Global Inhibitory Oscillatory Network (pyclustering.nnet.legion);

  • PCNN: Pulse-Coupled Neural Network (pyclustering.nnet.pcnn);

  • SOM: Self-Organized Map (pyclustering.nnet.som);

  • Sync: Oscillatory Network based on Kuramoto model (pyclustering.nnet.sync);

  • SyncPR: Oscillatory Network based on Kuramoto model for pattern recognition (pyclustering.nnet.syncpr);

  • SyncSegm: Oscillatory Network based on Kuramoto model for image segmentation (pyclustering.nnet.syncsegm);

Graph coloring algorithms (module pyclustering.gcolor):

  • DSATUR (pyclustering.gcolor.dsatur);

  • Hysteresis Oscillatory Network for graph coloring (pyclustering.gcolor.hysteresis);

  • SyncGColor: Oscillatory Network based on Kuramoto model for graph coloring (pyclustering.gcolor.sync);

Containers (module pyclustering.container):

  • CF-Tree (pyclustering.container.cftree);

  • KD-Tree (pyclustering.container.kdtree);

Travelling Salesman Problem Algorithms (module pyclustering.tsp):

  • AntColony (pyclustering.tsp.antcolony);

Utilities for analysis, visualization, etc. are placed in the module pyclustering.utils.

Installation

The simplest way to install the pyclustering library:

$ pip3 install pyclustering

The library can also be compiled and installed manually on a Linux machine, in any location:

# compile CCORE library (core of the pyclustering library).
$ cd pyclustering/ccore
$ make ccore

# return to the parent folder of the pyclustering library
$ cd ../

# add the current folder to the Python path
$ export PYTHONPATH=`pwd`

The CCORE library for 64-bit Windows is distributed with the pyclustering library, so there is no need to rebuild it. If you want to rebuild the CCORE library, open the CCORE Microsoft Visual Studio project located in the ccore/ folder and compile it.
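After a manual installation it can be useful to confirm that Python actually finds the package on its path. The following is a minimal sketch using only the standard library; the helper name is_importable is hypothetical.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# Check the package this page describes; the result depends on the local setup.
pyclustering_found = is_importable("pyclustering")
print("pyclustering importable:", pyclustering_found)
```

If the check reports False after the manual steps, the PYTHONPATH variable most likely does not point at the folder that contains the pyclustering package.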

Examples

The library provides an intuitive and friendly interface, so cluster analysis can be performed easily:

# an example of clustering by the BIRCH algorithm.
from pyclustering.cluster import cluster_visualizer;
from pyclustering.cluster.birch import birch;

from pyclustering.samples.definitions import FCPS_SAMPLES;

from pyclustering.utils import read_sample;

# load data from the FCPS set that is provided by the library.
sample = read_sample(FCPS_SAMPLES.SAMPLE_LSUN);

# create BIRCH algorithm instance to allocate three clusters.
birch_instance = birch(sample, 3);

# start processing - cluster analysis of the input data.
birch_instance.process();

# allocate clusters.
clusters = birch_instance.get_clusters();

# visualize obtained clusters.
visualizer = cluster_visualizer();
visualizer.append_clusters(clusters, sample);
visualizer.show();
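The visualizer call above receives both the clusters and the sample, which suggests that each cluster is a list of indexes into the sample rather than a list of points. Under that assumption, the sketch below shows how such a result can be turned into point groups or per-point labels. The data and the helper names cluster_points and labels_from_clusters are hypothetical.

```python
# Hypothetical two-dimensional sample and a hypothetical clustering result,
# where every cluster is a list of indexes into the sample.
sample = [[1.0, 1.2], [0.9, 1.0], [5.1, 4.8], [5.0, 5.2], [9.0, 0.5]]
clusters = [[0, 1], [2, 3], [4]]


def cluster_points(clusters, sample):
    """Replace each index by the point it refers to."""
    return [[sample[index] for index in cluster] for cluster in clusters]


def labels_from_clusters(clusters, size):
    """Build a flat list with one cluster label per point."""
    labels = [None] * size
    for label, cluster in enumerate(clusters):
        for index in cluster:
            labels[index] = label
    return labels


points = cluster_points(clusters, sample)
labels = labels_from_clusters(clusters, len(sample))
```

A flat label list is convenient when the result has to be compared with reference labels or passed to other tools.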

Clustering algorithms can be used for image processing:

# an example of image color segmentation.
from pyclustering.utils import draw_image_mask_segments, read_image;

from pyclustering.samples.definitions import IMAGE_SIMPLE_SAMPLES;

from pyclustering.cluster.kmeans import kmeans;

# load image from the pyclustering collection.
data = read_image(IMAGE_SIMPLE_SAMPLES.IMAGE_SIMPLE_BEACH);

# set initial centers for K-Means algorithm.
start_centers = [ [153, 217, 234, 128], [0, 162, 232, 128], [34, 177, 76, 128], [255, 242, 0, 128] ];

# create K-Means algorithm instance.
kmeans_instance = kmeans(data, start_centers);

# start processing.
kmeans_instance.process();

# obtain clusters that are considered as segments.
segments = kmeans_instance.get_clusters();

# show image segmentation results.
draw_image_mask_segments(IMAGE_SIMPLE_SAMPLES.IMAGE_SIMPLE_BEACH, segments);
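To make the idea behind color segmentation concrete, here is a minimal pure Python sketch of K-Means on RGBA pixels. It is not the library's implementation: the function name kmeans_segments, the tiny pixel list, and the exact-convergence stopping rule are assumptions made for illustration.

```python
def kmeans_segments(pixels, centers, max_iterations=100):
    """Group pixel indexes by nearest center (sketch of Lloyd's algorithm)."""
    centers = [list(center) for center in centers]
    segments = []
    for _ in range(max_iterations):
        # Assignment step: attach each pixel index to its nearest center.
        segments = [[] for _ in centers]
        for index, pixel in enumerate(pixels):
            distances = [sum((p - c) ** 2 for p, c in zip(pixel, center))
                         for center in centers]
            segments[distances.index(min(distances))].append(index)
        # Update step: move each center to the mean color of its pixels.
        updated = []
        for center, members in zip(centers, segments):
            if not members:
                updated.append(center)  # an empty segment keeps its center
                continue
            updated.append([sum(pixels[i][d] for i in members) / len(members)
                            for d in range(len(center))])
        if updated == centers:
            break  # centers stopped moving
        centers = updated
    return segments


# Hypothetical four-pixel "image": two yellowish and two bluish RGBA pixels.
pixels = [(250, 240, 10, 128), (0, 160, 230, 128),
          (255, 242, 0, 128), (5, 165, 235, 128)]
start_centers = [[0, 162, 232, 128], [255, 242, 0, 128]]
segments = kmeans_segments(pixels, start_centers)
```

Each resulting segment lists the indexes of pixels with similar color, which is the kind of grouping a segment mask is drawn from.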

Simulation of an oscillatory network based on the Hodgkin-Huxley neuron model, where three synchronous ensembles of oscillators are formed. This means that three features are allocated from the input data, each encoded by exactly one ensemble:

# an example of simulation of oscillatory network based on Hodgkin-Huxley model
from pyclustering.utils import draw_dynamics;

from pyclustering.nnet.hhn import hhn_network, hhn_parameters;

# set period of 400 time units when high strength value of synaptic connection exists from CN2 to PN.
params = hhn_parameters();
params.deltah = 400;

# prepare external stimulus that encodes three different features.
stimulus = [0, 0, 25, 25, 47, 47];

# create oscillatory network that has six oscillators.
net = hhn_network(len(stimulus), stimulus, params);

# perform simulation during 1200 steps in 600 time units.
(t, dyn) = net.simulate(1200, 600);

# visualize results of simulation (output dynamic of the network).
draw_dynamics(t, dyn, x_title = "Time", y_title = "V", separate = True);
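The relation between the stimulus and the features it encodes can be sketched without running the network. The snippet below does not simulate anything; it only groups oscillator indexes by stimulus value, under the assumption that oscillators receiving the same stimulus end up in the same ensemble. The helper name expected_ensembles is hypothetical.

```python
from collections import defaultdict


def expected_ensembles(stimulus):
    """Group oscillator indexes that receive the same stimulus value."""
    groups = defaultdict(list)
    for index, value in enumerate(stimulus):
        groups[value].append(index)
    # Dictionaries preserve insertion order, so groups follow first appearance.
    return list(groups.values())


stimulus = [0, 0, 25, 25, 47, 47]
ensembles = expected_ensembles(stimulus)
```

For the stimulus used in the example, six oscillators fall into three groups of two, one group per encoded feature.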

Proposals, questions, bugs:

For questions, proposals, or bug reports related to pyclustering, please contact pyclustering@yandex.ru or create an issue at https://github.com/annoviko/pyclustering/issues.
