Build Phylogenetic Tree from MSA using FastTree2 - v2.1.11

Build a phylogenetic reconstruction from a Multiple Sequence Alignment (MSA) using FastTree2.

This App reconstructs a phylogenetic tree from a Multiple Sequence Alignment (MSA) of either nucleotide or protein sequences using FastTree2. FastTree2 can be used to determine evolutionary relationships among aligned sequences. It calculates the distances between the sequences in the alignment and builds an approximately maximum-likelihood tree. The tree is displayed using ETE3 (v3.1.2).

We recommend that users review the Build Gene Tree Tutorial to understand the upstream processes required to use this App.

FastTree2 takes a precomputed MSA and, following an evolutionary model for the distance between aligned positions (e.g. the Jones-Taylor-Thornton JTT model), determines the distances between sequences and infers an approximately maximum-likelihood tree for those distances. FastTree2 is much faster than many methods of comparable quality. The output is a Newick-formatted tree, which KBase displays using the ETE3 toolkit. A KBase Tree object is generated and stored in the Narrative. The Newick file and tree images are available for download. Nucleotide or protein sequence MSAs may be used, and the method is agnostic to whether the result is a GeneTree or a SpeciesTree (but the tree type must be indicated so that it can be set on the output Tree object).

Tool Source:

FastTree v2.1.11 is installed from http://www.microbesonline.org/fasttree/.

Configuration:

Tree Description: This is used in the output figure and carried in the Tree object.

Input MSA: The MSA from which to generate the tree. You must pre-concatenate MSAs if you wish to make a SpeciesTree from concatenated phylogenetic marker MSAs.
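As a rough illustration of the pre-concatenation step mentioned above, the following Python sketch joins several per-marker alignments into one alignment keyed by sequence ID. The function name concatenate_msas, the dict-based input layout, and the example sequences are all hypothetical and are not part of KBase or FastTree2; it assumes each marker alignment uses the same genome identifiers as sequence IDs, and it pads genomes that lack a marker with gap characters.

```python
def concatenate_msas(msas, gap="-"):
    """Concatenate per-marker alignments into a single alignment.

    msas: list of dicts mapping sequence ID -> aligned sequence string.
    A genome that is absent from a marker is padded with gap characters
    so that every row of the result has the same width.
    """
    # Collect IDs in first-seen order so the output order is stable.
    all_ids = []
    for msa in msas:
        for seq_id in msa:
            if seq_id not in all_ids:
                all_ids.append(seq_id)

    parts = {seq_id: [] for seq_id in all_ids}
    for msa in msas:
        widths = {len(seq) for seq in msa.values()}
        if len(widths) != 1:
            raise ValueError("every row of an alignment must have the same width")
        width = widths.pop()
        for seq_id in all_ids:
            parts[seq_id].append(msa.get(seq_id, gap * width))

    return {seq_id: "".join(chunks) for seq_id, chunks in parts.items()}


# Made-up example data: two tiny marker alignments.
marker_a = {"genome1": "MK-L", "genome2": "MKAL"}
marker_b = {"genome1": "GT", "genome3": "GS"}
combined = concatenate_msas([marker_a, marker_b])
for seq_id, seq in combined.items():
    print(seq_id, seq)
```

Padding with gaps keeps genomes that are missing a marker in the tree; dropping such genomes instead is an equally reasonable choice, depending on how much missing data you are willing to accept.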

Output Tree: The name of the output Tree object.

Species Tree?: Indicate whether or not a SpeciesTree is being computed, so that downstream Apps read the output Tree object correctly. A SpeciesTree can be built from a 16S nucleotide alignment, concatenated protein markers, or any other MSA that contains what you consider a phylogenetically informative set of sequences.

Starting Tree: You may initialize the tree building with a selected topology. This must be a KBase Tree object.

Fastest?: Run FastTree2 with the -fastest option to reduce memory use and run time. By default, FastTree2 takes O(L*a*N + N^1.5) space and O(L*a*log(N)*N^1.5) time, where N is the number of unique sequences, L is the width of the alignment, and a is the size of the alphabet. With -fastest, the theoretical space reduces to O(L*a*N + N^1.25) and the time reduces to O(L*a*N^1.25).
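To give a feel for the bounds quoted above, the short Python sketch below compares the N-dependent space terms, N^1.5 and N^1.25. The helper name space_term_ratio is made up for this illustration. Because these are asymptotic bounds with unknown constant factors, the numbers only indicate how the gap grows with N and do not predict actual memory use or run time.

```python
def space_term_ratio(n):
    """Ratio of the default N^1.5 term to the -fastest N^1.25 term.

    The shared L*a*N term is left out because it is the same in both bounds.
    Algebraically the ratio is N^0.25.
    """
    return (n ** 1.5) / (n ** 1.25)


for n in (1_000, 10_000, 100_000):
    print(f"N={n}: N^1.5 / N^1.25 = {space_term_ratio(n):.1f}")
```

The ratio grows as N^0.25, so at N = 10,000 the N^1.5 term is about 10 times the N^1.25 term. The time bounds differ by an additional log(N) factor, so the option matters most for very large alignments.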

Pseudo Count?: Use pseudocounts to infer missing regions of the alignment. Enable this option if you have many fragmentary sequences.

GTR?: Use a generalized time-reversible model (for nucleotide alignments only).

WAG?: Use the Whelan-And-Goldman 2001 model (amino acid alignments only).

No ML?: Turn off maximum-likelihood.

No ME?: Turn off minimum-evolution NNIs (nearest-neighbor interchanges) and SPRs (subtree-prune-regraft moves).

Num Rate Categories (CAT): Number of rate categories of sites (default is 20). This allows modeling of non-uniform evolutionary rates across sites in the alignment.

No Cat?: Use constant rates (instead of the Num Rate Categories above). Use this if you expect uniform evolutionary rates along sequences.

Gamma: After optimizing the tree under the CAT approximation, rescale the branch lengths to optimize the Gamma20 likelihood.
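For readers who also run FastTree2 outside KBase, the Python sketch below shows how the settings above might translate into a FastTree command line. The flags (-nt, -intree, -fastest, -pseudo, -gtr, -wag, -noml, -nome, -cat, -nocat, -gamma) come from the FastTree documentation. The function build_fasttree_args, its parameter names, the executable name "FastTree", and the file name are assumptions for illustration only, and the exact mapping this App uses internally is not shown here. A starting tree would need to be supplied as a Newick file on the command line rather than as a KBase Tree object.

```python
def build_fasttree_args(alignment_path, nucleotide=False, intree=None,
                        fastest=False, pseudo=False, gtr=False, wag=False,
                        noml=False, nome=False, cat=20, nocat=False,
                        gamma=False):
    """Build an argument list for a FastTree run (hypothetical helper)."""
    if gtr and not nucleotide:
        raise ValueError("GTR applies to nucleotide alignments only")
    if wag and nucleotide:
        raise ValueError("WAG applies to amino acid alignments only")

    args = ["FastTree"]  # assumed executable name; may differ per install
    if nucleotide:
        args.append("-nt")
    if intree is not None:
        args += ["-intree", intree]  # Newick file with the starting topology
    if fastest:
        args.append("-fastest")
    if pseudo:
        args.append("-pseudo")
    if gtr:
        args.append("-gtr")
    if wag:
        args.append("-wag")
    if noml:
        args.append("-noml")
    if nome:
        args.append("-nome")
    if nocat:
        args.append("-nocat")  # constant rates; overrides the category count
    elif cat != 20:
        args += ["-cat", str(cat)]  # only pass when it differs from the default
    if gamma:
        args.append("-gamma")
    args.append(alignment_path)
    return args


# Example: nucleotide alignment with GTR and Gamma20 rescaling.
cmd = build_fasttree_args("aln.fasta", nucleotide=True, gtr=True, gamma=True)
print(" ".join(cmd))
```

FastTree writes the Newick tree to standard output, so a caller would typically redirect the output to a file or capture it, for example with subprocess.run(cmd, capture_output=True, text=True).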

App Output:

Output Object: A KBase Tree object is generated.

Output Tree Image: The Tree is rendered using the ETE3 Toolkit.

Downloadable files: The Newick formatted output tree, as well as rendered PNG and PDF formats, are available for download.
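Once the Newick file is downloaded, a quick sanity check is to list the leaf names it contains. The Python sketch below does this with a regular expression. The function name newick_leaf_names and the example tree string are invented for illustration, and the approach assumes simple unquoted labels without bracketed comments; for anything beyond a quick check, a full Newick parser such as the one in the ETE3 toolkit is the safer choice.

```python
import re


def newick_leaf_names(newick):
    """Return leaf labels from a simple Newick string.

    A leaf label is a name that directly follows "(" or ",". Internal node
    labels and support values follow ")" and are therefore skipped.
    Quoted labels and bracketed comments are not handled.
    """
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


# Made-up example tree with branch lengths and one support value.
tree = "((geneA:0.12,geneB:0.08)0.95:0.30,geneC:0.41);"
leaves = newick_leaf_names(tree)
print(leaves)
```

Comparing the resulting names against the sequence IDs in the input MSA is a simple way to confirm that no sequences were dropped or renamed along the way.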

Team members who implemented this App in KBase: Dylan Chivian. For questions, please contact us.

Please cite:

Price MN, Dehal PS, Arkin AP. FastTree 2 - Approximately Maximum-Likelihood Trees for Large Alignments. PLOS ONE. 2010;5: e9490. doi:10.1371/journal.pone.0009490

Related Publications

Price MN, Dehal PS, Arkin AP. FastTree 2 - Approximately Maximum-Likelihood Trees for Large Alignments. PLOS ONE. 2010;5: e9490. doi:10.1371/journal.pone.0009490, https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0009490
Price MN, Dehal PS, Arkin AP. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol. 2009;26: 1641-1650. doi:10.1093/molbev/msp077, https://www.ncbi.nlm.nih.gov/pubmed/19377059
Huerta-Cepas J, Serra F, Bork P. ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data. Mol Biol Evol. 2016;33: 1635-1638. doi:10.1093/molbev/msw046, https://www.ncbi.nlm.nih.gov/pubmed/26921390
FastTree-2 source: http://www.microbesonline.org/fasttree/
ETE3 source: http://etetoolkit.org

App Specification:

https://github.com/kbaseapps/kb_fasttree/tree/7d655295bda5546436ce74ebcc7824b4abf369e5/ui/narrative/methods/run_FastTree

Module Commit: 7d655295bda5546436ce74ebcc7824b4abf369e5