DecisionTree

A Python module for constructing a decision tree from multidimensional training data and for using the decision tree for classifying new data

Project description

**Version 2.2.3** fixes a bug in which explicitly set zero
values for numerical features were misconstrued as false in
the conditional statements of some of the method
definitions.
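
The kind of bug described above can be illustrated with a minimal sketch. The two functions below are hypothetical and are not taken from the module's source; they only show why a truthiness test mishandles an explicit zero while a comparison against ``None`` does not.

```python
# Hypothetical helpers for illustration; not part of the DecisionTree module.

def describe_threshold_buggy(threshold=None):
    # Truthiness test: an explicit 0 (or 0.0) is treated as "not supplied".
    if threshold:
        return "threshold = %s" % threshold
    return "no threshold"


def describe_threshold_fixed(threshold=None):
    # Identity test against None: only a genuinely missing value is skipped.
    if threshold is not None:
        return "threshold = %s" % threshold
    return "no threshold"


print(describe_threshold_buggy(0.0))   # no threshold
print(describe_threshold_fixed(0.0))   # threshold = 0.0
```

Any numeric feature value or parameter that can legitimately be zero needs the second form of the test.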

**Version 2.2.2** includes scripts in the Examples directory
that demonstrate how to carry out bulk classification of all
your test data records placed in a CSV file in one fell
swoop. Also included are scripts that demonstrate the same
for the data records placed in the old-style .dat files.
The main module code remains unchanged.
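
The scripts in the Examples directory are not reproduced here. As a rough sketch of what bulk classification involves, the code below turns each row of a CSV file into a list of ``'feature = value'`` strings and hands every list to a single-record classifier. The helper names, the column names, and the toy classifier are all assumptions made for illustration.

```python
import csv
import io


def rows_to_samples(csv_text, feature_columns):
    """Turn each CSV data row into a list of 'feature = value' strings."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    samples = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        samples.append(["%s = %s" % (header[i].strip(), row[i].strip())
                        for i in feature_columns])
    return samples


def classify_all(samples, classify_one):
    """Apply a single-record classifier to every sample, in order."""
    return [classify_one(sample) for sample in samples]


# Hypothetical test data; the column layout is invented for illustration.
TEST_CSV = """id,age,ploidy
1,55.0,diploid
2,61.5,aneuploid
"""


def toy_classifier(sample):
    # Stand-in for a real classifier call: labels by the 'age' feature only.
    age = float(sample[0].split("=")[1])
    return "older" if age > 60 else "younger"


samples = rows_to_samples(TEST_CSV, feature_columns=[1, 2])
labels = classify_all(samples, toy_classifier)
```

With a constructed tree, ``classify_one`` could be a small wrapper around the ``dt.classify(root_node, test_sample)`` call shown in the usage example.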

**Version 2.0** was a major rewrite of the DecisionTree
module. This revision was prompted by a number of users
wanting to see numeric features incorporated in the
construction of decision trees. So here it is! This
version allows you to use either purely symbolic features,
or purely numeric features, or a mixture of the two. (A
feature is numeric if it can take any floating-point value
over an interval.)
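
To make the symbolic/numeric distinction concrete, here is one simple way to sort feature columns into the two kinds: a feature counts as numeric if every recorded value parses as a floating-point number. This is only an illustrative heuristic with hypothetical function names; the module's own rule for deciding how to treat a feature may differ.

```python
def is_numeric_feature(values):
    """Return True if every value can be read as a floating-point number."""
    try:
        for value in values:
            float(value)
    except (TypeError, ValueError):
        return False
    return True


def partition_features(columns):
    """Split a {feature_name: [values]} mapping into numeric and symbolic names."""
    numeric = sorted(name for name, values in columns.items()
                     if is_numeric_feature(values))
    symbolic = sorted(name for name in columns if name not in numeric)
    return numeric, symbolic


# Invented sample columns, using feature names from the usage example.
columns = {
    "age": ["55.0", "61.5"],
    "g2": ["4.2", "3.9"],
    "ploidy": ["diploid", "aneuploid"],
}
numeric, symbolic = partition_features(columns)
```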

As for the purpose of the module: assuming you have placed
your training data in a CSV file, all you have to do is
supply the name of the file to this module. It then does the
rest, so that classifying a new data sample takes little
effort on your part. A decision tree classifier
consists of feature tests that are arranged in the form of a
tree. The feature test associated with the root node is one
that can be expected to maximally disambiguate the different
possible class labels for a new data record. From the root
node hangs a child node for each possible outcome of the
feature test at the root. This maximal class-label
disambiguation rule is applied at the child nodes
recursively until you reach the leaf nodes. A leaf node may
correspond either to the maximum depth desired for the
decision tree or to the case when there is nothing further
to gain by a feature test at the node.
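
The recursive construction described above can be sketched for purely symbolic features as follows. This is a simplified illustration, not the module's implementation: it assumes that "maximal disambiguation" means choosing the feature with the lowest average class entropy after the split, and that "nothing further to gain" means the class entropy at a node has fallen below a threshold. The function names and the toy records are invented.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def conditional_entropy(records, labels, feature):
    """Average class entropy after splitting on one symbolic feature."""
    groups = {}
    for record, label in zip(records, labels):
        groups.setdefault(record[feature], []).append(label)
    total = len(labels)
    return sum(len(g) / total * entropy(g) for g in groups.values())


def best_feature(records, labels, features):
    """Feature whose test leaves the least class uncertainty."""
    return min(features, key=lambda f: conditional_entropy(records, labels, f))


def build_tree(records, labels, features, max_depth, entropy_threshold):
    """Return a leaf label or a (feature, {value: subtree}) pair."""
    # Leaf: depth budget spent, no features left, or labels nearly pure.
    if max_depth == 0 or not features or entropy(labels) < entropy_threshold:
        return Counter(labels).most_common(1)[0][0]
    feature = best_feature(records, labels, features)
    remaining = [f for f in features if f != feature]
    children = {}
    # One child node per observed outcome of the chosen feature test.
    for value in set(r[feature] for r in records):
        subset = [(r, l) for r, l in zip(records, labels) if r[feature] == value]
        children[value] = build_tree([r for r, _ in subset],
                                     [l for _, l in subset],
                                     remaining, max_depth - 1, entropy_threshold)
    return (feature, children)


# Invented toy data: 'ploidy' separates the classes perfectly, 'grade' does not.
records = [
    {"ploidy": "diploid", "grade": "low"},
    {"ploidy": "diploid", "grade": "high"},
    {"ploidy": "aneuploid", "grade": "low"},
    {"ploidy": "aneuploid", "grade": "high"},
]
labels = ["benign", "benign", "malignant", "malignant"]
tree = build_tree(records, labels, ["grade", "ploidy"],
                  max_depth=8, entropy_threshold=0.01)
```

Here the root test is ``ploidy``, and both of its children become leaves immediately because their class entropy is already zero.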

Typical usage syntax:

::

    import DecisionTree

    training_datafile = "stage3cancer.csv"

    # Configure the classifier from the CSV training file
    dt = DecisionTree.DecisionTree(
             training_datafile = training_datafile,
             csv_class_column_index = 2,
             csv_columns_for_features = [3,4,5,6,7,8],
             entropy_threshold = 0.01,
             max_depth_desired = 8,
             symbolic_to_numeric_cardinality_threshold = 10,
         )

    # Load the training data and compute the probabilities
    dt.get_training_data()
    dt.calculate_first_order_probabilities()
    dt.calculate_class_priors()
    dt.show_training_data()

    # Build the tree and display it
    root_node = dt.construct_decision_tree_classifier()
    root_node.display_decision_tree("   ")

    # A new data record, given as 'feature = value' strings
    test_sample = ['g2 = 4.2',
                   'grade = 2.3',
                   'gleason = 4',
                   'eet = 1.7',
                   'age = 55.0',
                   'ploidy = diploid']

    classification = dt.classify(root_node, test_sample)
    print("Classification: " + str(classification))

Release history

- 3.4.3: May 14, 2016
- 3.4.2: May 2, 2016
- 3.4.1: May 1, 2016
- 3.4.0: Apr 4, 2016
- 3.3.2: Feb 13, 2016
- 3.3.1: Jan 31, 2016
- 3.3.0: Jan 26, 2016
- 3.2.4: Nov 23, 2015
- 3.2.3: Oct 26, 2015
- 3.2.2: Oct 25, 2015
- 3.2.1: Jun 14, 2015
- 3.2.0: Jun 10, 2015
- 3.0.1: May 28, 2015
- 3.0: May 18, 2015
- 2.3.4: May 5, 2015
- 2.3.3: Mar 27, 2015
- 2.3.2: Mar 22, 2015
- 2.3.1: Mar 17, 2015
- 2.3: Mar 16, 2015
- 2.2.6: Mar 11, 2015
- 2.2.5: Nov 25, 2014
- 2.2.4: Jun 18, 2014
- 2.2.3: Jun 13, 2014 (this version)
- 2.2.2: May 3, 2014
- 2.2.1: Sep 5, 2013
- 2.2: Sep 2, 2013
- 2.1: Aug 17, 2013
- 2.0: Jun 19, 2013
- 1.7.1: May 22, 2013
- 1.7: Jul 29, 2012
- 1.6.1: Jun 22, 2012
- 1.6: Jun 20, 2012
- 1.5: May 16, 2011
- 1.0: May 16, 2011
