dset · PyPI

Disjoint-Set implementation with heuristic optimizations (union-by-rank and path-compression)

These details have not been verified by PyPI

Project links

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

A Disjoint-Set implementation which makes use of the the “Union by Rank” and “Path Compression” heuristic optimizations while being as simple and effective as possible.

It support the following operations:

Find-Set in O(1) amortized running time
Union in O(1) amortized running time

Check end notes for additional information about running times

Why?

Sometimes, specially during algorithm competitions, a Union-Find data structure is needed to solve a certain problem (e.g. when implementing Kruskal’s MST algorithm) In those cases, we usually use whatever off-the-shelf implementation we happen to find in PyPI and, many times, we get the infamous submission fail Time Limit Exceed(TLE) because that specific Disjoint-Set implementation is not fast enough…

Worry not! A fast and simple enough (you can copy paste the code here) disjoint-set implementation is now available:

Installation

$ pip install dset

Usage

>>> import dset
>>> s1 = dset.Set()                 # To create new sets objects (it runs in O(1))
>>> s2 = dset.Set()
>>> s3 = dset.Set()
>>> dset.groups()                   # To check the number of different groups (it runs in O(1))
3
>>> dset.find(s1) == dset.find(s2)  # To check if two sets are in the same group
False
>>> dset.union(s1, s2)              # to group s1 and s2 sets together
>>> dset.groups()                   # Now there are only 2 groups: s1-s2 and s3
2
>>> dset.find(s1) == dset.find(s2)  # They are in the same group s1-s2
True
>>> dset.find(s2) == dset.find(s3)  # They are in distinct groups s1-s2 and s3
False
>>> dset.union(s2, s1)              # do nothing (no need to check if s1 and s2 belong to the same group)
>>> dset.groups()                   # Same as before only 2: s1-s2 and s3
2
>>> dset.union(s1, s3)              # group s1-s2 and s3 together.
>>> dset.groups()                   # Now there is only one group: s1-s2-s3
1
>>> dset.find(s1) == dset.find(s2)  # All the sets are in the same groups s1-s2-s3
... dset.find(s2) == dset.find(s3)
... dset.find(s1) == dset.find(s3)
True
True
True

Future improvements

Make official Sphinx Documentation. (minor releases)
Make test suite. (minor releases)
Change to a list-based implementation (major-minor release)
- Improve running time (constant terms)
- Improve memory space (constant terms)
Make the package C-based. (major-minor release)
- Improve running time (constant terms)

Notes on Complexity Running Times

Separately, either “Union by Rank” or “Path Compression” improves the running time of the operations on disjoint-sets. Alone, “Union by Rank” yields a running time of O(m log n), where m is the number of Union and Find-Set operations and n is the number of sets.

The improvement is even greater when we use the two heuristics together; the worst-case running time is O(m f(n)), where f(n) is a very slowly growing function, the Inverse Ackermann function (e.g. for n equal to the number atoms in the universe, f(n) is only 4) So for any conceivable application, we can consider the running time of m Union + Find-Set operations to be O(m) and therefore both Union and Find-Set operations have O(1) amortized running time in practice.

Project details

These details have not been verified by PyPI

Project links

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

This version

0.0.1

Aug 23, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dset-0.0.1.tar.gz (5.1 kB view hashes)

Uploaded Aug 23, 2019 Source

Built Distribution

dset-0.0.1-py3-none-any.whl (4.0 kB view hashes)

Uploaded Aug 23, 2019 Python 3

Hashes for dset-0.0.1.tar.gz

Hashes for dset-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`cd65b3260908a2218dd41252c361021aea7a5da8c72079b2b1672e54faf8e781`
MD5	`2b47fe2fec195c7b8e85d33cece7b7d3`
BLAKE2b-256	`5d645aeecb73379149f1a2e53e946d6083c1bed11f489597ca14aaafb3ba5242`

Hashes for dset-0.0.1-py3-none-any.whl

Hashes for dset-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fdca85c0ba033725006a9617383a26f1023230bb8bbca69f4ab25d824599b838`
MD5	`849984553bdca4cb095a9b41425c79a2`
BLAKE2b-256	`60a797c81486d66cd7ec47d413f128c66dd00dbc51bee32e6165e05310561703`