treeswift package

TreeSwift is a Python library for parsing, manipulating, and iterating over (rooted) tree structures. TreeSwift places an emphasis on speed.

Module contents

class treeswift.Node(label=None, edge_length=None)[source]

Bases: object

Node class

add_child(child)[source]

Add child to Node object

Args:

child (Node): The child Node to be added

child_nodes()[source]

Return a list containing this Node object’s children

Returns:

list: A list containing this Node object’s children

contract()[source]

Contract this Node by directly connecting its children to its parent

get_edge_length()[source]

Return the length of the edge incident to this Node

Returns:

float: The length of the edge incident to this Node

get_label()[source]

Return the label of this Node

Returns:

object: The label of this Node

get_parent()[source]

Return the parent of this Node

Returns:

Node: The parent of this Node

is_leaf()[source]

Returns True if this is a leaf

Returns:

bool: True if this is a leaf, otherwise False

is_root()[source]

Returns True if this is the root

Returns:

bool: True if this is the root, otherwise False

newick()[source]

Newick string conversion starting at this Node object

Returns:

str: Newick string conversion starting at this Node object

num_children()[source]

Returns the number of children of this Node

Returns:

int: The number of children of this Node

num_nodes(leaves=True, internal=True)[source]

Compute the total number of selected nodes in the subtree rooted by this Node (including itself)

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

Returns:

int: The total number of selected nodes in this Tree

remove_child(child)[source]

Remove child from Node object

Args:

child (Node): The child to remove

resolve_polytomies()[source]

Arbitrarily resolve polytomies below this Node with 0-lengthed edges.

set_edge_length(length)[source]

Set the length of the edge incident to this Node

Args:

length: The new length of the edge incident to this Node

set_label(label)[source]

Set the label of this Node object

Args:

label: The new label

set_parent(parent)[source]

Set the parent of this Node object. Use this carefully, otherwise you may damage the structure of this Tree object.

Args:

Node: The new parent of this Node

traverse_ancestors(include_self=True)[source]

Traverse over the ancestors of this Node

Args:

include_self (bool): True to include self in the traversal, otherwise False

traverse_bfs(include_self=True)[source]

Perform a Breadth-First Search (BFS) starting at this Node object’. Yields (Node, distance) tuples

Args:

include_self (bool): True to include self in the traversal, otherwise False

traverse_inorder(leaves=True, internal=True)[source]

Perform an inorder traversal starting at this Node object

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

traverse_internal()[source]

Traverse over the internal nodes below (and including) this Node object

traverse_leaves()[source]

Traverse over the leaves below this Node object

traverse_levelorder(leaves=True, internal=True)[source]

Perform a levelorder traversal starting at this Node object

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

traverse_postorder(leaves=True, internal=True)[source]

Perform a postorder traversal starting at this Node object

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

traverse_preorder(leaves=True, internal=True)[source]

Perform a preorder traversal starting at this Node object

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

traverse_rootdistorder(ascending=True, leaves=True, internal=True)[source]

Perform a traversal of the Node objects in the subtree rooted at this Node in either ascending (ascending=True) or descending (ascending=False) order of distance from this Node

Args:

ascending (bool): True to perform traversal in ascending distance from the root, otherwise False for descending

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

class treeswift.Tree(is_rooted=True)[source]

Bases: object

Tree class

avg_branch_length(terminal=True, internal=True)[source]

Compute the average length of the selected branches of this Tree. Edges with length None will be treated as 0-length

Args:

terminal (bool): True to include terminal branches, otherwise False

internal (bool): True to include internal branches, otherwise False

Returns:

The average length of the selected branches

branch_lengths(terminal=True, internal=True)[source]

Generator over the lengths of the selected branches of this Tree. Edges with length None will be output as 0-length

Args:

terminal (bool): True to include terminal branches, otherwise False

internal (bool): True to include internal branches, otherwise False

closest_leaf_to_root()[source]

Return the leaf that is closest to the root and the corresponding distance. Edges with no length will be considered to have a length of 0

Returns:

tuple: First value is the closest leaf to the root, and second value is the corresponding distance

coalescence_times(backward=True)[source]

Generator over the times of successive coalescence events

Args:

backward (bool): True to go backward in time (i.e., leaves to root), otherwise False

coalescence_waiting_times(backward=True)[source]

Generator over the waiting times of successive coalescence events

Args:

backward (bool): True to go backward in time (i.e., leaves to root), otherwise False

collapse_short_branches(threshold)[source]

Collapse internal branches (not terminal branches) with length less than or equal to threshold. A branch length of None is considered 0

Args:

threshold (float): The threshold to use when collapsing branches

colless(normalize='leaves')[source]

Compute the Colless balance index of this Tree. If the tree has polytomies, they will be randomly resolved

Args:

normalize (str): How to normalize the Colless index (if at all)

  • None to not normalize

  • "leaves" to normalize by the number of leaves

  • "yule" to normalize to the Yule model

  • "pda" to normalize to the Proportional to Distinguishable Arrangements model

Returns:

float: Colless index (either normalized or not)

condense()[source]

If siblings have the same label, merge them. If they have edge lengths, the resulting Node will have the larger of the lengths

contract_low_support(threshold, terminal=False, internal=True)[source]

Contract nodes labeled by a number (e.g. branch support) below threshold

Args:

threshold (float): The support threshold to use when contracting nodes

terminal (bool): True to include terminal branches, otherwise False

internal (bool): True to include internal branches, otherwise False

deroot()[source]

If tree bifurcates at the root, contract an edge incident with the root to create a trifurcation at the root.

diameter()[source]

Compute the diameter (maximum leaf pairwise distance) of this Tree

Returns:

float: The diameter of this Tree

distance_between(u, v)[source]

Return the distance between nodes u and v in this Tree

Args:

u (Node): Node u

v (Node): Node v

Returns:

float: The distance between nodes u and v

distance_matrix(leaf_labels=False)[source]

Return a distance matrix (2D dictionary) of the leaves of this Tree

Args:

leaf_labels (bool): True to have keys be labels of leaf Node objects, otherwise False to have keys be Node objects

Returns:

dict: Distance matrix (2D dictionary) of the leaves of this Tree, where keys are labels of leaves; M[u][v] = distance from u to v

distances_from_parent(leaves=True, internal=True, unlabeled=False)[source]

Generator over the node-to-parent distances of this Tree; (node,distance) tuples

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

unlabeled (bool): True to include unlabeled nodes, otherwise False

distances_from_root(leaves=True, internal=True, unlabeled=False, weighted=True)[source]

Generator over the root-to-node distances of this Tree; (node,distance) tuples

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

unlabeled (bool): True to include unlabeled nodes, otherwise False

weighted (bool): True to define distance as sum of edge lengths (i.e., weighted distance), or False to define distance as total number of edges (i.e., unweighted distance). If unweighted, edges with length None are counted in the height

draw(show_plot=True, export_filename=None, show_labels=False, align_labels=False, label_fontsize=8, start_time=0, default_color='#000000', xlabel=None, handles=None)[source]

Draw this Tree

Args:

show_plot (bool): True to show the plot, otherwise False

export_filename (str): File to which the tree figure will be exported (otherwise None to not save to file)

show_labels (bool): True to show the leaf labels, otherwise False

align_labels (bool): True to align the leaf labels (if shown), otherwise False to just put them by their tips

label_fontsize (int): Font size of the leaf labels (in points). 8pt = 1/9in –> 1in = 72pt

default_color (str): The default color to use if a node doesn’t have a color attribute

xlabel (str): The label of the horizontal axis in the resulting plot

handles (list): List of matplotlib Patch objects for a legend

drop_edge_length_at_root(label='OLDROOT')[source]

If the tree has a root edge, drop the edge to be a child of the root node

Args:

label (str): The desired label of the new child

edge_length_sum(terminal=True, internal=True)[source]

Compute the sum of all selected edge lengths in this Tree

Args:

terminal (bool): True to include terminal branches, otherwise False

internal (bool): True to include internal branches, otherwise False

Returns:

float: Sum of all selected edge lengths in this Tree

extract_subtree(node)[source]

Return a copy of the subtree rooted at node

Args:

node (Node): The root of the desired subtree

Returns:

Tree: A copy of the subtree rooted at node

extract_tree(labels, without, suppress_unifurcations=True)[source]

Helper function for extract_tree_* functions

extract_tree_with(labels, suppress_unifurcations=True)[source]

Extract a copy of this Tree with only the leaves labeled by the strings in labels

Args:

labels (set): Set of leaf labels to include.

suppress_unifurcations (bool): True to suppress unifurcations, otherwise False

Returns:

Tree: Copy of this Tree, including only the leaves labeled by the strings in labels

extract_tree_without(labels, suppress_unifurcations=True)[source]

Extract a copy of this Tree without the leaves labeled by the strings in labels

Args:

labels (set): Set of leaf labels to exclude

suppress_unifurcations (bool): True to suppress unifurcations, otherwise False

Returns:

Tree: Copy of this Tree, exluding the leaves labeled by the strings in labels

find_node(label, leaves=True, internal=False)[source]

Find and return the node(s) labeled by label, or None if none exist. Note that this function performs a linear-time search, so if you will be call

Args:

label (str): The label to search for

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

Returns:

The Node object labeled by label (or a list of Node objects if multiple are labeled by label), or None if none exist

furthest_from_root()[source]

Return the Node that is furthest from the root and the corresponding distance. Edges with no length will be considered to have a length of 0

Returns:

tuple: First value is the furthest Node from the root, and second value is the corresponding distance

gamma_statistic()[source]

Compute the Gamma statistic of Pybus and Harvey (2000)

Returns:

float: The Gamma statistic of Pybus and Harvey (2000)

height(weighted=True)[source]

Compute the height (i.e., maximum distance from root) of this Tree

Returns:

float: The height (i.e., maximum distance from root) of this Tree

indent(space=4)[source]

Return an indented Newick string, just like nw_indent in Newick Utilities

Args:

space (int): The number of spaces a tab should equal

Returns:

str: An indented Newick string

label_to_node(selection='leaves')[source]

Return a dictionary mapping labels (strings) to Node objects

  • If selection is "all", the dictionary will contain all nodes

  • If selection is "leaves", the dictionary will only contain leaves

  • If selection is "internal", the dictionary will only contain internal nodes

  • If selection is a set, the dictionary will contain all nodes labeled by a label in selection

  • If multiple nodes are labeled by a given label, only the last (preorder traversal) will be obtained

Args:

selection (str or set): The selection of nodes to get

  • "all" to select all nodes

  • "leaves" to select leaves

  • "internal" to select internal nodes

  • A set of labels to specify nodes to select

Returns:

dict: Dictionary mapping labels to the corresponding nodes

labels(leaves=True, internal=True)[source]

Generator over the (non-None) Node labels of this Tree

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

ladderize(ascending=True)[source]

Ladderize this Tree by sorting each Node’s children by total number of descendants

Args:

ascending (bool): True to sort in ascending order of mode, otherwise False

lineages_through_time(present_day=None, show_plot=True, export_filename=None, color='#000000', xmin=None, xmax=None, ymin=None, ymax=None, title=None, xlabel=None, ylabel=None)[source]

Compute the number of lineages through time. If seaborn is installed, a plot is shown as well

Args:

present_day (float): The time of the furthest node from the root. If None, the top of the tree will be placed at time 0

show_plot (bool): True to show the plot, otherwise False to only return the dictionary. To plot multiple LTTs on the same figure, set show_plot to False for all but the last plot

export_filename (str): File to which the LTT figure will be exported (otherwise None to not save to file)

color (str): The color of the resulting plot

title (str): The title of the resulting plot

xmin (float): The minimum value of the horizontal axis in the resulting plot

xmax (float): The maximum value of the horizontal axis in the resulting plot

xlabel (str): The label of the horizontal axis in the resulting plot

ymin (float): The minimum value of the vertical axis in the resulting plot

ymax (float): The maximum value of the vertical axis in the resulting plot

ylabel (str): The label of the vertical axis in the resulting plot

Returns:

dict: A dictionary in which each (t,n) pair denotes the number of lineages n that existed at time t

ltt(present_day=None, show_plot=True, export_filename=None, color='#000000', xmin=None, xmax=None, ymin=None, ymax=None, title=None, xlabel=None, ylabel=None)

Compute the number of lineages through time. If seaborn is installed, a plot is shown as well

Args:

present_day (float): The time of the furthest node from the root. If None, the top of the tree will be placed at time 0

show_plot (bool): True to show the plot, otherwise False to only return the dictionary. To plot multiple LTTs on the same figure, set show_plot to False for all but the last plot

export_filename (str): File to which the LTT figure will be exported (otherwise None to not save to file)

color (str): The color of the resulting plot

title (str): The title of the resulting plot

xmin (float): The minimum value of the horizontal axis in the resulting plot

xmax (float): The maximum value of the horizontal axis in the resulting plot

xlabel (str): The label of the horizontal axis in the resulting plot

ymin (float): The minimum value of the vertical axis in the resulting plot

ymax (float): The maximum value of the vertical axis in the resulting plot

ylabel (str): The label of the vertical axis in the resulting plot

Returns:

dict: A dictionary in which each (t,n) pair denotes the number of lineages n that existed at time t

mrca(labels)[source]

Return the Node that is the MRCA of the nodes labeled by a label in labels. If multiple nodes are labeled by a given label, only the last (preorder traversal) will be obtained

Args:

labels (set): Set of leaf labels

Returns:

Node: The MRCA of the Node objects labeled by a label in labels

mrca_matrix()[source]

Return a dictionary storing all pairwise MRCAs. M[u][v] = MRCA of nodes u and v. Excludes M[u][u] because MRCA of node and itself is itself

Returns:

dict: M[u][v] = MRCA of nodes u and v

newick()[source]

Output this Tree as a Newick string

Returns:

str: Newick string of this Tree

num_cherries()[source]

Returns the number of cherries (i.e., internal nodes that only have leaf children) in this Tree

Returns:

int: The number of cherries in this Tree

num_lineages_at(distance)[source]

Returns the number of lineages of this Tree that exist distance away from the root

Args:

distance (float): The distance away from the root

Returns:

int: The number of lineages that exist distance away from the root

num_nodes(leaves=True, internal=True)[source]

Compute the total number of selected nodes in this Tree

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

Returns:

int: The total number of selected nodes in this Tree

order(mode, ascending=True)[source]

Order the children of the nodes in this Tree based on mode

Args:

mode (str): How to order the children of the nodes of this Tree

  • "edge_length" = order by incident edge length

  • "edge_length_then_label" = order by incident edge length, then by node label

  • "edge_length_then_label_then_num_descendants" = order by incident edge length, then by node label, then by number of descendants

  • "edge_length_then_num_descendants" = order by incident edge length, then by number of descendants

  • "edge_length_then_num_descendants_then_label" = order by incident edge length, then by number of descendants, then by node label

  • "label" = order by node label

  • "label_then_edge_length" = order by node label, then by incident edge length

  • "label_then_edge_length_then_num_descendants" = order by node label, then by incident edge length, then by number of descendants

  • "label_then_num_descendants" = order by node label, then by number of descendants

  • "label_then_num_descendants_then_edge_length" = order by node label, then by number of descendants, then by incident edge length

  • "num_descendants" = order by number of descendants

  • "num_descendants_then_label" = order by number of descendants, then by node label

  • "num_descendants_then_label_then_edge_length" = order by number of descendants, then by node label, then by incident edge length

  • "num_descendants_then_edge_length" = order by number of descendants, then by incident edge length

  • "num_descendants_then_edge_length_then_label" = order by number of descendants, then by incident edge length, then by node label

ascending (bool): True to sort in ascending order of mode, otherwise False

rename_nodes(renaming_map)[source]

Rename nodes in this Tree

Args:

renaming_map (dict): A dictionary mapping old labels (keys) to new labels (values)

reroot(node, length=None, branch_support=False)[source]

Reroot this Tree at length up the incident edge of node. If 0 or None, reroot at the node (not on the incident edge)

Args:

node (Node): The Node on whose incident edge this Tree will be rerooted

length (float): The distance up the specified edge at which to reroot this Tree. If 0 or None, reroot at the node (not on the incident edge)

branch_support (bool): True if internal node labels represent branch support values, otherwise False

resolve_polytomies()[source]

Arbitrarily resolve polytomies with 0-lengthed edges.

sackin(normalize='leaves')[source]

Compute the Sackin balance index of this Tree

Args:

normalize (str): How to normalize the Sackin index (if at all)

  • None to not normalize

  • "leaves" to normalize by the number of leaves

  • "yule" to normalize to the Yule model

  • "pda" to normalize to the Proportional to Distinguishable Arrangements model

Returns:

float: Sackin index (either normalized or not)

scale_edges(multiplier)[source]

Multiply all edges in this Tree by multiplier

suppress_unifurcations()[source]

Remove all nodes with only one child and directly attach child to parent

traverse_inorder(leaves=True, internal=True)[source]

Perform an inorder traversal of the Node objects in this Tree

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

traverse_internal()[source]

Traverse over the internal nodes of this Tree

traverse_leaves()[source]

Traverse over the leaves of this Tree

traverse_levelorder(leaves=True, internal=True)[source]

Perform a levelorder traversal of the Node objects in this Tree

traverse_postorder(leaves=True, internal=True)[source]

Perform a postorder traversal of the Node objects in this Tree

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

traverse_preorder(leaves=True, internal=True)[source]

Perform a preorder traversal of the Node objects in this Tree

Args:

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

traverse_rootdistorder(ascending=True, leaves=True, internal=True)[source]

Perform a traversal of the Node objects in this Tree in either ascending (ascending=True) or descending (ascending=False) order of distance from the root

Args:

ascending (bool): True to perform traversal in ascending distance from the root, otherwise False for descending

leaves (bool): True to include leaves, otherwise False

internal (bool): True to include internal nodes, otherwise False

treeness()[source]

Compute the treeness (sum of internal branch lengths / sum of all branch lengths) of this Tree. Branch lengths of None are considered 0 length

Returns:

float: Treeness of this Tree (sum of internal branch lengths / sum of all branch lengths)

write_tree_newick(filename, hide_rooted_prefix=False)[source]

Write this Tree to a Newick file

Args:

filename (str): Path to desired output file (plain-text or gzipped)

hide_rooted_prefix (bool): Hide the rooted prefix [&R] if rooted tree

write_tree_nexus(filename)[source]

Write this Tree to a Nexus file

Args:

filename (str): Path to desired output file (plain-text or gzipped)

treeswift.read_tree(input, schema)[source]

Read a tree from a string or file

Args:

input (str): Either a tree string, a path to a tree file (plain-text or gzipped), or a DendroPy Tree object

schema (str): The schema of input (DendroPy, Newick, NeXML, Nexus, or linkage)

Returns:
  • If the input is Newick, either a Tree object if input contains a single tree, or a list of Tree objects if input contains multiple trees (one per line)

  • If the input is NeXML or Nexus, a dict of trees represented by input, where keys are tree names (str) and values are Tree objects

treeswift.read_tree_dendropy(tree)[source]

Create a TreeSwift tree from a DendroPy tree

Args:

tree (dendropy.datamodel.treemodel): A Dendropy Tree object

Returns:

Tree: A TreeSwift tree created from tree

treeswift.read_tree_newick(newick)[source]

Read a tree from a Newick string or file

Args:

newick (str): Either a Newick string or the path to a Newick file (plain-text or gzipped)

Returns:

Tree: The tree represented by newick. If the Newick file has multiple trees (one per line), a list of Tree objects will be returned

treeswift.read_tree_nexml(nexml)[source]

Read a tree from a NeXML string or file

Args:

nexml (str): Either a NeXML string or the path to a NeXML file (plain-text or gzipped)

Returns:

dict of Tree: A dictionary of the trees represented by nexml, where keys are tree names (str) and values are Tree objects

treeswift.read_tree_nexus(nexus, translate=True)[source]

Read a tree from a Nexus string or file

Args:

nexus (str): Either a Nexus string or the path to a Nexus file (plain-text or gzipped)

translate (bool): Translate the node labels on the trees (if the Nexus file has a “Translate” section)

Returns:

dict of Tree: A dictionary of the trees represented by nexus, where keys are tree names (str) and values are Tree objects

If the Nexus file had a “Taxlabels” section, the taxon labels will be stored in the output dictionary as a list associated with key "taxlabels"

If any trees in the Nexus file had information (e.g. the first ... in the tree STATE_0 [...] = [&R] (...); line), all information will be stored in the output dictionary: they will be associated with key "info" in the output dictionary, and they will be stored in a dictionary where keys are tree names (str)

If translate was True, if a node x was translated, x.label will be the translated label, and x.id will be the original label (the “ID”)