Build Phylogenetic Tree from MSA using FastTree2 - v2.1.9


By: dylan, psdehal


Build a phylogenetic reconstruction from a Multiple Sequence Alignment (MSA) using FastTree2.

This App reconstructs a phylogenetic tree from a Multiple Sequence Alignment (MSA) of either nucleotide or protein sequences using FastTree2. FastTree2 can be used to determine evolutionary relationships among aligned sequences: it calculates the distances between the sequences in the alignment and builds an approximately maximum-likelihood tree. The tree is displayed using ETE3 (v3.0.0b35).

We recommend that users review the Build Gene Tree Tutorial to understand the upstream processes required to use this App.

FastTree2 takes a precomputed MSA and, following an evolutionary model for the distance between aligned positions (e.g. the Jones-Taylor-Thornton (JTT) model), determines the distances between sequences and infers an approximately maximum-likelihood tree from them. FastTree2 is much faster than many methods of comparable quality. The output is a Newick-formatted tree, which KBase displays using the ETE3 toolkit. A KBase Tree object is generated and stored in the Narrative. The Newick file and tree images are available for download. Nucleotide or protein sequence MSAs may be used, and the method is agnostic to whether the result is a GeneTree or a SpeciesTree (but the tree type must be indicated so that it is set correctly on the output Tree object).
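To make the idea of "distances between sequences" concrete, the following is a minimal sketch that computes pairwise distances from a small nucleotide alignment. It is a simplification and not FastTree2's implementation: it uses the uncorrected fraction of mismatches (p-distance) followed by the standard Jukes-Cantor correction, and the function names and the toy alignment are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str, gap_chars: str = "-.") -> float:
    """Fraction of differing positions, skipping columns gapped in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # a gap carries no information about substitution
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions shared by the two sequences")
    return mismatches / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected nucleotide distance: d = -3/4 * ln(1 - 4p/3)."""
    if p >= 0.75:
        return math.inf  # saturated: the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy alignment (hypothetical data, for illustration only).
msa = {
    "seqA": "ACGTACGTAC",
    "seqB": "ACGTACGTTC",
    "seqC": "AC-TACGAAC",
}

names = sorted(msa)
for i, x in enumerate(names):
    for y in names[i + 1:]:
        p = p_distance(msa[x], msa[y])
        print(x, y, round(p, 4), round(jukes_cantor(p), 4))
```

The corrected distance is always at least as large as the raw mismatch fraction, because the correction accounts for multiple substitutions at the same site. Protein alignments need a substitution model such as JTT or WAG instead of Jukes-Cantor.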


Tree Description: This is used in the output figure and carried in the Tree object.

Input MSA: The MSA from which to generate the tree. You must pre-concatenate MSAs if you wish to make a SpeciesTree from concatenated phylogenetic marker MSAs.

Output Tree: The name of the output Tree object.

Species Tree?: Indicate whether or not a SpeciesTree is being computed, so that downstream Apps read the output Tree object correctly. A SpeciesTree can be built from a 16S nucleotide alignment, from concatenated protein markers, or from any other MSA that contains what you consider a phylogenetically informative set of sequences.

Starting Tree: You may initialize the tree building with a selected topology. This must be a KBase Tree object.

Fastest?: Use the -fastest option to reduce running time and memory use on very large alignments. By default, FastTree2 takes O(L*a*N + N^1.5) space and O(L*a*log(N)*N^1.5) time, where N is the number of unique sequences, L is the width of the alignment, and a is the size of the alphabet. With -fastest, the theoretical space reduces to O(L*a*N + N^1.25) and the time reduces to O(L*a*N^1.25).

Pseudo Count?: Use pseudocounts to estimate distances across regions of the alignment that are missing. Use this option if you have many fragmentary sequences.

GTR?: Use a generalized time-reversible model (for nucleotide alignments only).

WAG?: Use the Whelan-And-Goldman 2001 model (amino acid alignments only).

No ML?: Turn off the maximum-likelihood phase of tree building.

No ME?: Turn off minimum-evolution NNIs (nearest-neighbor interchanges) and SPRs (subtree-prune-regraft moves).

Num Rate Categories (CAT): Number of rate categories of sites (default is 20). This allows modeling of non-uniform evolutionary rates across sites (alignment positions).

No Cat?: Use constant rates (instead of the Num Rate Categories above). Choose this if you expect uniform evolutionary rates across sites.

Gamma: After optimizing the tree under the CAT approximation, rescale the lengths to optimize the Gamma20 likelihood.
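For readers who want to reproduce a run outside KBase, the parameters above correspond roughly to FastTree command-line flags. The sketch below assembles such a command line. The flag names (-nt, -gtr, -wag, -fastest, -pseudo, -noml, -nome, -cat, -nocat, -gamma, -intree, -out) come from the FastTree usage text, but the mapping from App parameters to flags, the function name, and the file names are assumptions; the KBase wrapper may invoke FastTree2 differently.

```python
from typing import List, Optional


def build_fasttree_command(
    msa_path: str,
    nucleotide: bool = False,
    fastest: bool = False,
    pseudo: bool = False,
    gtr: bool = False,
    wag: bool = False,
    noml: bool = False,
    nome: bool = False,
    cat: int = 20,
    nocat: bool = False,
    gamma: bool = False,
    intree_path: Optional[str] = None,
    out_path: Optional[str] = None,
    executable: str = "FastTree",
) -> List[str]:
    """Assemble a FastTree argument list (hypothetical helper, not KBase code)."""
    if gtr and not nucleotide:
        raise ValueError("GTR applies to nucleotide alignments only")
    if wag and nucleotide:
        raise ValueError("WAG applies to amino acid alignments only")
    if cat < 1:
        raise ValueError("number of rate categories must be at least 1")

    cmd = [executable]
    if nucleotide:
        cmd.append("-nt")            # input is a nucleotide alignment
    if gtr:
        cmd.append("-gtr")           # generalized time-reversible model
    if wag:
        cmd.append("-wag")           # Whelan-And-Goldman 2001 model
    if fastest:
        cmd.append("-fastest")
    if pseudo:
        cmd.append("-pseudo")        # pseudocounts for fragmentary sequences
    if noml:
        cmd.append("-noml")          # turn off maximum-likelihood
    if nome:
        cmd.append("-nome")          # turn off minimum-evolution NNIs and SPRs
    if nocat:
        cmd.append("-nocat")         # constant rates across sites
    elif cat != 20:
        cmd += ["-cat", str(cat)]    # 20 is the default, so only pass other values
    if gamma:
        cmd.append("-gamma")         # rescale lengths under Gamma20
    if intree_path:
        cmd += ["-intree", intree_path]
    if out_path:
        cmd += ["-out", out_path]
    cmd.append(msa_path)             # the alignment file goes last
    return cmd


# Example: a 16S nucleotide alignment with GTR and Gamma rescaling.
cmd = build_fasttree_command(
    "aligned_16S.fasta", nucleotide=True, gtr=True, gamma=True, out_path="tree.newick"
)
print(" ".join(cmd))
```

The function only builds the argument list; passing it to `subprocess.run(cmd, check=True)` would execute it, provided a FastTree binary is installed and on the PATH.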

App Output:

Output Object: A KBase Tree object is generated.

Output Tree Image: The Tree is rendered using the ETE3 Toolkit.

Downloadable files: The Newick formatted output tree, as well as rendered PNG and PDF formats, are available for download.
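Once downloaded, the Newick file can be inspected with a few lines of code. The sketch below lists the leaf (sequence) names in a Newick string using only the standard library. It assumes unquoted labels without spaces or special characters, and the example tree and function name are hypothetical; a full Newick parser is the safer choice for arbitrary files.

```python
import re
from typing import List


def newick_leaf_names(newick: str) -> List[str]:
    """Return leaf labels in order of appearance.

    A leaf label directly follows "(" or ",", whereas internal node labels
    (such as support values) follow ")" and are therefore not matched.
    """
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


# Hypothetical tree with branch lengths and internal support values.
example = (
    "((geneA:0.10,geneB:0.20)0.95:0.05,"
    "(geneC:0.15,geneD:0.12)0.88:0.07,geneE:0.30);"
)

leaves = newick_leaf_names(example)
print(len(leaves), leaves)
```

Comparing this list against the sequence IDs in the input MSA is a quick way to confirm that every sequence made it into the tree.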

FastTree2.1.9 source

Team members who implemented this App in KBase: Dylan Chivian. For questions, please contact us.


App Specification:

Module Commit: b967ee863c008d6b131ffb70569e536dc863f127