Skip to main content

TANOS: TAxon jackknife for NOdal Stability. In phylogenetic trees, calculate how resilient nodes are to the removal of taxa

Project description

TANOS: TAxon jackknife for NOdal Stability

Table of Contents

I. Introduction
II. Installation Instructions
III. Usage Instructions
IV. License
V. Contributors
VI. Citation
VII. Contact

I. Introduction

In phylogenetic trees, assessing the stability of a node is paramount to having confidence in the quality of the topology. Several strategies exist to evaluate a node, with bootstrap support being particularly popular. Most methods rely on how consistently characters from the input data matrix are in a given state to judge the node. We seek to evaluate the stability of the node by determining how resilient nodes are to the removal of taxa. This package can calculate the taxa resiliency for a provided tree. To determine this, one must remove a taxon from the alignment (jackknife sampling) and build a new tree, comparing it to the original tree. To avoid anomalies, a new tree must be built multiple times (say 50). This process must be repeated for each taxon. Thus, if 100 taxa are in a given tree and 50 replicates are performed for the removal of each taxon then 5,000 new trees are built. These new trees can be provided with the original tree to our package for comparison to determine how frequently different clades appear. The intent is to assign a value to each node, similar to bootstrapping. A value of 1 indicates 100% of sampled trees includes the grouping in that clade (sans the one taxon intentionally removed). Valid values fit in the range [0,1]. Note that this value has no meaning for leaf nodes, parents of leaf nodes, and the root node.

Please note that this package assumes the following has already been performed:

  1. The original alignment of all taxa
  2. A tree built with all taxa
  3. "jackknifing" the alignment (making one copy per taxon in the tree, minus that taxon)
  4. New trees built using the "jackknifed" alignments with a pre-determined level of replication

Assuming the interested user of this package wishes to generate nodal stability scores for a tree, steps #1-2 will already be completed. Step #3 is trivially accomplished with basic scripting. See the scripts directory of the repository for an example. Step #4 is conceptually simple, but more difficult to manage as the number of taxa and/or replicates increases. An example script to manage such a task on a computing cluster under SLURM control using IQ-TREE is also provided. Note that these are simply example scripts, not fully-tested programs; modifications will be required for your system and situation, especially if you intend to use an alternate directory structure, tree inference software, or grid control engine. Please note that if you installed tanos with pip, you will not see these scripts as they are not included in the PyPI manifest; you will need to download the repo from GitHub to have access to them. Please see the README file in the scripts directory for additional details.

This program must have access to all the relevant trees (i.e., the original/main tree and the "jackknifed" trees) and a mapping of which trees are associated with which taxa. A default directory structure and file naming scheme is assumed. These assumptions can be modified to some degree, or bypassed entirely, with certain command-line arguments. Please consult the section III. Usage Instructions for details.

We are in the process of publishing a paper describing this work. Once available, we will provide relevant details here:

See our paper in journal for further information:
        https://sub-domain.domain.tld/some/path/to/resource

II. Installation Instructions

This package is written in Python. You must have a version of Python (v3.6+) that supports f-strings. This package also depends on the following Python modules: sys, re, pkgutil, argparse, and pathlib, which are all included in the Python Standard Library. Installation may be accomplished using pip like this:

        pip install tanos

Alternately, you may install it manually using setuptools like this:

        python setup.py install

If running setup.py, you may wish to add --user or --prefix for an installation in non-default locations.

III. Usage Instructions

Please run tanos -h or tanos --help.

IV. License

Please see the LICENSE.

V. Contributors

Brandon Pickett (Github: @pickettbd) is the sole author of this software.

VI. Citation

Until a version of the paper is available, we ask that you cite the Github repository. Thereafter, you will be able to include a citation following this general form:

Powell GS, Pickett BD, Martin GJ, Whiting MF, Ridge PG, Bybee SM. Paper title. journal. Year. Volume(Issue):pages.

VII. Contact

For questions, comments, concerns, etc., please open an issue on Github.

Note: Before asking for help, please ensure you (a) are using the latest official release and (b) consult first the section III. Usage Instructions.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tanos-0.0.1b7.tar.gz (20.0 kB view hashes)

Uploaded Source

Built Distributions

tanos-0.0.1b7-py3.7.egg (15.9 kB view hashes)

Uploaded Source

tanos-0.0.1b7-py3-none-any.whl (17.1 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page