intervaltree 2.1.0
Editable interval tree data structure for Python 2 and 3
intervaltree
A mutable, selfbalancing interval tree for Python 2 and 3. Queries may be by point, by range overlap, or by range envelopment.
This library was designed to allow tagging text and time intervals, where the intervals include the lower bound but not the upper bound.
Installing
pip install intervaltree
Features
 Supports Python 2.6+ and Python 3.2+
 Initializing
 blank tree = IntervalTree()
 from an iterable of Interval objects (tree = IntervalTree(intervals))
 from an iterable of tuples (tree = IntervalTree.from_tuples(interval_tuples))
 Insertions
 tree[begin:end] = data
 tree.add(interval)
 tree.addi(begin, end, data)
 Deletions
 tree.remove(interval) (raises ValueError if not present)
 tree.discard(interval) (quiet if not present)
 tree.removei(begin, end, data) (short for tree.remove(Interval(begin, end, data)))
 tree.discardi(begin, end, data) (short for tree.discard(Interval(begin, end, data)))
 tree.remove_overlap(point)
 tree.remove_overlap(begin, end) (removes all overlapping the range)
 tree.remove_envelop(begin, end) (removes all enveloped in the range)
 Overlap queries
 tree[point]
 tree[begin:end]
 tree.search(point)
 tree.search(begin, end)
 Envelop queries
 tree.search(begin, end, strict=True)
 Membership queries
 interval_obj in tree (this is fastest, O(1))
 tree.containsi(begin, end, data)
 tree.overlaps(point)
 tree.overlaps(begin, end)
 Iterable
 for interval_obj in tree:
 tree.items()
 Sizing
 len(tree)
 tree.is_empty()
 not tree
 tree.begin() (the begin coordinate of the leftmost interval)
 tree.end() (the end coordinate of the rightmost interval)
 Setlike operations
 union
 result_tree = tree.union(iterable)
 result_tree = tree1  tree2
 tree.update(iterable)
 tree = other_tree
 difference
 result_tree = tree.difference(iterable)
 result_tree = tree1  tree2
 tree.difference_update(iterable)
 tree = other_tree
 intersection
 result_tree = tree.intersection(iterable)
 result_tree = tree1 & tree2
 tree.intersection_update(iterable)
 tree &= other_tree
 symmetric difference
 result_tree = tree.symmetric_difference(iterable)
 result_tree = tree1 ^ tree2
 tree.symmetric_difference_update(iterable)
 tree ^= other_tree
 comparison
 tree1.issubset(tree2) or tree1 <= tree2
 tree1 <= tree2
 tree1.issuperset(tree2) or tree1 > tree2
 tree1 >= tree2
 tree1 == tree2
 union
 Restructuring
 chop(begin, end) (slice intervals and remove everything between begin and end)
 slice(point) (slice intervals at point)
 split_overlaps() (slice at all interval boundaries)
 Copying and typecasting
 IntervalTree(tree) (Interval objects are same as those in tree)
 tree.copy() (Interval objects are shallow copies of those in tree)
 set(tree) (can later be fed into IntervalTree())
 list(tree) (ditto)
 Picklefriendly
 Automatic AVL balancing
Examples
Getting started
>>> from intervaltree import Interval, IntervalTree >>> t = IntervalTree() >>> t IntervalTree()
Adding intervals  any object works!
>>> t[1:2] = "12" >>> t[4:7] = (4, 7) >>> t[5:9] = {5: 9}
Query by point
The result of a query is a set object, so if ordering is important,you must sort it first.>>> sorted(t[6]) [Interval(4, 7, (4, 7)), Interval(5, 9, {5: 9})] >>> sorted(t[6])[0] Interval(4, 7, (4, 7))
Query by range
Note that ranges are inclusive of the lower limit, but noninclusive of the upper limit. So:
>>> sorted(t[2:4]) []
But:
>>> sorted(t[1:5]) [Interval(1, 2, '12'), Interval(4, 7, (4, 7))]
Accessing an Interval object
>>> iv = Interval(4, 7, (4, 7)) >>> iv.begin 4 >>> iv.end 7 >>> iv.data (4, 7) >>> begin, end, data = iv >>> begin 4 >>> end 7 >>> data (4, 7)
Constructing from lists of intervals
We could have made a similar tree this way:
>>> ivs = [(1, 2), (4, 7), (5, 9)] >>> t = IntervalTree( ... Interval(begin, end, "%d%d" % (begin, end)) for begin, end in ivs ... )
Or, if we don’t need the data fields:
>>> t2 = IntervalTree(Interval(*iv) for iv in ivs)
Removing intervals
>>> t.remove( Interval(1, 2, "12") ) >>> sorted(t) [Interval(4, 7, '47'), Interval(5, 9, '59')] >>> t.remove( Interval(500, 1000, "Doesn't exist")) # raises ValueError Traceback (most recent call last): ValueError >>> t.discard(Interval(500, 1000, "Doesn't exist")) # quietly does nothing >>> del t[5] # same as t.remove_overlap(5) >>> t IntervalTree()
We could also empty a tree entirely:
>>> t2.clear() >>> t2 IntervalTree()
Or remove intervals that overlap a range:
>>> t = IntervalTree([ ... Interval(0, 10), ... Interval(10, 20), ... Interval(20, 30), ... Interval(30, 40)]) >>> t.remove_overlap(25, 35) >>> sorted(t) [Interval(0, 10), Interval(10, 20)]
We can also remove only those intervals completely enveloped in a range:
>>> t.remove_envelop(5, 20) >>> sorted(t) [Interval(0, 10)]
Chopping
We could also chop out parts of the tree:
>>> t = IntervalTree([Interval(0, 10)]) >>> t.chop(3, 7) >>> sorted(t) [Interval(0, 3), Interval(7, 10)]
To modify the new intervals’ data fields based on which side of the interval is being chopped:
>>> def datafunc(iv, islower): ... oldlimit = iv[islower] ... return "oldlimit: {0}, islower: {1}".format(oldlimit, islower) >>> t = IntervalTree([Interval(0, 10)]) >>> t.chop(3, 7, datafunc) >>> sorted(t)[0] Interval(0, 3, 'oldlimit: 10, islower: True') >>> sorted(t)[1] Interval(7, 10, 'oldlimit: 0, islower: False')
Slicing
You can also slice intervals in the tree without removing them:
>>> t = IntervalTree([Interval(0, 10), Interval(5, 15)]) >>> t.slice(3) >>> sorted(t) [Interval(0, 3), Interval(3, 10), Interval(5, 15)]
You can also set the data fields, for example, reusing datafunc() from above:
>>> t = IntervalTree([Interval(5, 15)]) >>> t.slice(10, datafunc) >>> sorted(t)[0] Interval(5, 10, 'oldlimit: 15, islower: True') >>> sorted(t)[1] Interval(10, 15, 'oldlimit: 5, islower: False')
Future improvements
See the issue tracker on GitHub.
Based on
 Eternally Confuzzled’s AVL tree
 Wikipedia’s Interval Tree
 Heavily modified from Tyler Kahn’s Interval Tree implementation in Python (GitHub project)
 Incorporates contributions from:
 konstantint/Konstantin Tretyakov of the University of Tartu (Estonia)
 siniG/Avi Gabay
 lmcarril/Luis M. Carril of the Karlsruhe Institute for Technology (Germany)
Copyright
 ChaimLeib Halbert, 20132015
 Modifications, Konstantin Tretyakov, 2014
Licensed under the Apache License, version 2.0.
The source code for this project is at https://github.com/chaimleib/intervaltree
Change log
Version 2.1.0
 Added:
 merge_overlaps() method and tests
 merge_equals() method and tests
 range() method
 span() method, for returning the difference between end() and begin()
 Fixes:
 Development version numbering is changing to be compliant with PEP440. Version numbering now contains major, minor and micro release numbers, plus the number of builds following the stable release version, e.g. 2.0.4b34
 Speed improvement: begin() and end() methods used iterative min() and max() builtins instead of the more efficient iloc member available to SortedDict
 overlaps() method used to return True even if provided null test interval
 Maintainers:
 Added coverage test (make coverage) with html report (htmlcov/index.html)
 Tests run slightly faster
Version 2.0.4
 Fix: Issue #27: README incorrectly showed using a comma instead of a colon when querying the IntervalTree: it showed tree[begin, end] instead of tree[begin:end]
Version 2.0.3
 Fix: README showed using + operator for setlike union instead of the correct  operator
 Removed tests from release package to speed up installation; to get the tests, download from GitHub
Version 2.0.2
 Fix: Issue #20: performance enhancement for large trees. IntervalTree.search() made a copy of the entire boundary_table resulting in linear search time. The sortedcollections package is now the sole install dependency
Version 2.0.1
 Fix: Issue #26: failed to prune empty Node after a rotation promoted contents of s_center
Version 2.0.0
 IntervalTree now supports the full collections.MutableSet API
 Added:
 __delitem__ to IntervalTree
 Interval comparison methods lt(), gt(), le() and ge() to Interval, as an alternative to the comparison operators, which are designed for sorting
 IntervalTree.from_tuples(iterable)
 IntervalTree.clear()
 IntervalTree.difference(iterable)
 IntervalTree.difference_update(iterable)
 IntervalTree.union(iterable)
 IntervalTree.intersection(iterable)
 IntervalTree.intersection_update(iterable)
 IntervalTree.symmetric_difference(iterable)
 IntervalTree.symmetric_difference_update(iterable)
 IntervalTree.chop(a, b)
 IntervalTree.slice(point)
 Deprecated IntervalTree.extend() – use update() instead
 Internal improvements:
 More verbose tests with progress bars
 More tests for comparison and sorting behavior
 Code in the README is included in the unit tests
 Fixes
 BACKWARD INCOMPATIBLE: On ranged queries where begin >= end, the query operated on the overlaps of begin. This behavior was documented as expected in 1.x; it is now changed to be more consistent with the definition of Intervals, which are halfopen.
 Issue #25: pruning empty Nodes with staggered descendants could result in invalid trees
 Sorting Intervals and numbers in the same list gathered all the numbers at the beginning and the Intervals at the end
 IntervalTree.overlaps() and friends returned None instead of False
 Maintainers: make installtestpypi failed because the pip was missing a pre flag
Version 1.1.1
 Removed requirement for pyandoc in order to run functionality tests.
Version 1.1.0
 Added ability to use Interval.distance_to() with points, not just Intervals
 Added documentation on return types to IntervalTree and Interval
 Interval.__cmp__() works with points too
 Fix: IntervalTree.score() returned maximum score of 0.5 instead of 1.0. Now returns max of subscores instead of avg
 Internal improvements:
 Development version numbering scheme, based on git describe the “building towards” release is appended after a hyphen, eg. 1.0.237g2da2ef01.10. The previous tagged release is 1.0.2, and there have been 37 commits since then, current tag is g2da2ef0, and we are getting ready for a 1.1.0 release
 Optimality tests added
 Interval overlap tests for ranges, Intervals and points added
Version 1.0.2
Version 1.0.1
 Fix: pip install failure because of failure to generate README.rst
Version 1.0.0
 Renamed from PyIntervalTree to intervaltree
 Speed improvements for adding and removing Intervals (~70% faster than 0.4)
 Bug fixes:
 BACKWARD INCOMPATIBLE: len() of an Interval is always 3, reverting to default behavior for namedtuples. In Python 3, len returning a noninteger raises an exception. Instead, use Interval.length(), which returns 0 for null intervals and end  begin otherwise. Also, if the len() === 0, then not iv is True.
 When inserting an Interval via __setitem__ and improper parameters given, all errors were transformed to IndexError
 split_overlaps did not update the boundary_table counts
 Internal improvements:
 More robust local testing tools
 Long series of interdependent tests have been separated into sections
Version 0.4
 Faster balancing (~80% faster)
 Bug fixes:
 Double rotations were performed in place of a single rotation when presented an unbalanced Node with a balanced child.
 During single rotation, kept referencing an unrotated Node instead of the new, rotated one
Version 0.3.3
 Made IntervalTree crash if inited with a null Interval (end <= begin)
 IntervalTree raises ValueError instead of AssertionError when a null Interval is inserted
Version 0.3.2
 Support for Python 3.2+ and 2.6+
 Changed license from LGPL to more permissive Apache license
 Merged changes from https://github.com/konstantint/PyIntervalTree to
https://github.com/chaimleib/PyIntervalTree
 Interval now inherits from a namedtuple. Benefits: should be faster. Drawbacks: slight behavioural change (Intervals not mutable anymore).
 Added float tests
 Use setup.py for tests
 Automatic testing via travisci
 Removed dependency on six
 Interval improvements:
 Intervals without data have a cleaner string representation
 Intervals without data are pickled more compactly
 Better hashing
 Intervals are ordered by begin, then end, then by data. If data is not orderable, sorts by type(data)
 Bug fixes:
 Fixed crash when querying empty tree
 Fixed missing close parenthesis in examples
 Made IntervalTree crash earlier if a null Interval is added
 Internals:
 New test directory
 Nicer display of data structures for debugging, using custom test/pprint.py (Python 2.6, 2.7)
 More sensitive exception handling
 Local script to test in all supported versions of Python
 Added IntervalTree.score() to measure how optimally a tree is structured
Version 0.2.3
 Slight changes for inclusion in PyPI.
 Some documentation changes
 Added tests
 Bug fix: interval addition via [] was broken in Python 2.7 (see http://bugs.python.org/issue21785)
 Added intervaltree.bio subpackage, adding some utilities for use in bioinformatics
Version 0.2.2b
 Forked from https://github.com/MusashiAharon/PyIntervalTree
File  Type  Py Version  Uploaded on  Size  

intervaltree2.1.0.tar.gz (md5)  Source  20150507  37KB  
 Author: ChaimLeib Halbert, Konstantin Tretyakov
 Home Page: https://github.com/chaimleib/intervaltree
 Download URL: https://github.com/chaimleib/intervaltree/tarball/2.1.0
 Keywords: intervaltree datastructure intervals tree
 License: Apache License, Version 2.0

Categories
 Development Status :: 4  Beta
 Intended Audience :: Developers
 Intended Audience :: Information Technology
 Intended Audience :: Science/Research
 License :: OSI Approved :: Apache Software License
 Programming Language :: Python
 Programming Language :: Python :: 2
 Programming Language :: Python :: 2.6
 Programming Language :: Python :: 2.7
 Programming Language :: Python :: 3
 Programming Language :: Python :: 3.2
 Programming Language :: Python :: 3.3
 Programming Language :: Python :: 3.4
 Topic :: Scientific/Engineering :: Artificial Intelligence
 Topic :: Scientific/Engineering :: BioInformatics
 Topic :: Scientific/Engineering :: Information Analysis
 Topic :: Software Development :: Libraries
 Topic :: Text Processing :: General
 Topic :: Text Processing :: Linguistic
 Topic :: Text Processing :: Markup
 Package Index Owner: chaimleib
 DOAP record: intervaltree2.1.0.xml