KBase Narrative - Candidatus Roseilinea sp. NK

Draft genome of Candidatus Roseilinea sp. NK_OTU-006 recovered from metagenomic data of a hot spring microbial mat

Table of Contents

Data Import
Assess Quality of Assemblies
Assess Genome Quality Using CheckM
Annotate Assemblies and Genomes
Compute ANI
Construct a Genome Tree
Genome Visualization
Compute Pangenome
Build Metabolic Model
View Genome Function Profiles

[1] Data Import

v1 - KBaseGenomeAnnotations.Assembly-5.0

The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/59604

Ca.Roseil_genome

v1 - KBaseGenomes.Genome-11.0

JP3_7_PGTN01.1.fsa_nt_assembly

v1 - KBaseGenomeAnnotations.Assembly-5.0

JP3_7_genome

v1 - KBaseGenomes.Genome-11.0

Ca_Roseilinea_gracile_MS_genome_assembly

v1 - KBaseGenomeAnnotations.Assembly-5.0

Ca_Roseilinea_gracile_MS_genome

v1 - KBaseGenomes.Genome-11.0

[2] Assess Quality of Assemblies

Assess Quality of Assemblies with QUAST - v4.4

Run QUAST (QUality ASsessment Tool) on a set of Assemblies to assess their quality.

This app completed without errors in 1m 26s.

Report

Summary

All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

Assembly	RoseilNK_assembly
# contigs (>= 0 bp)	117
# contigs (>= 1000 bp)	117
# contigs (>= 10000 bp)	48
# contigs (>= 100000 bp)	9
# contigs (>= 1000000 bp)	0
Total length (>= 0 bp)	3642138
Total length (>= 1000 bp)	3642138
Total length (>= 10000 bp)	3396310
Total length (>= 100000 bp)	1677642
Total length (>= 1000000 bp)	0
# contigs	117
Largest contig	332664
Total length	3642138
GC (%)	63.39
N50	94936
N75	56899
L50	11
L75	24
# N's per 100 kbp	0.00
# predicted genes (unique)	3161
# predicted genes (>= 0 bp)	3126 + 39 part
# predicted genes (>= 300 bp)	2836 + 36 part
# predicted genes (>= 1500 bp)	506 + 6 part
# predicted genes (>= 3000 bp)	59 + 0 part
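The N50/L50 and N75/L75 values in the QUAST summary follow the usual definition: sort contigs from longest to shortest, accumulate their lengths, and report the contig length (Nx) and contig count (Lx) at which the running total first reaches the chosen fraction of the assembly length. The following is a minimal sketch of that calculation, not QUAST's own code. The function name `nx_lx` and the toy contig lengths are hypothetical and are not taken from the assemblies in this Narrative.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the contig at which the cumulative length of the
    contigs, sorted longest first, reaches `fraction` of the total length.
    Lx is the number of contigs needed to reach that point.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise AssertionError("unreachable for 0 < fraction <= 1")


# Hypothetical toy contig lengths (illustration only).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy_lengths, 0.5)
n75, l75 = nx_lx(toy_lengths, 0.75)
print("N50", n50, "L50", l50)
print("N75", n75, "L75", l75)
```

Applied to the real contig lengths of an assembly (filtered to contigs >= 500 bp, as the report states), the same procedure should give values comparable to the N50, N75, L50 and L75 rows above.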

Links

report.html

Assess Quality of Assemblies with QUAST - v4.4

Run QUAST (QUality ASsessment Tool) on a set of Assemblies to assess their quality.

This app completed without errors in 1m 49s.

Report

Summary

All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

Assembly	JP3_7_PGTN01.1.fsa_nt_assembly
# contigs (>= 0 bp)	708
# contigs (>= 1000 bp)	251
# contigs (>= 10000 bp)	91
# contigs (>= 100000 bp)	0
# contigs (>= 1000000 bp)	0
Total length (>= 0 bp)	3379197
Total length (>= 1000 bp)	3067753
Total length (>= 10000 bp)	2638098
Total length (>= 100000 bp)	0
Total length (>= 1000000 bp)	0
# contigs	708
Largest contig	94883
Total length	3379197
GC (%)	63.83
N50	28546
N75	13093
L50	40
L75	82
# N's per 100 kbp	0.00
# predicted genes (unique)	3300
# predicted genes (>= 0 bp)	3022 + 278 part
# predicted genes (>= 300 bp)	2716 + 253 part
# predicted genes (>= 1500 bp)	397 + 11 part
# predicted genes (>= 3000 bp)	33 + 0 part

Links

report.html

Assess Quality of Assemblies with QUAST - v4.4

Run QUAST (QUality ASsessment Tool) on a set of Assemblies to assess their quality.

This app completed without errors in 1m 34s.

Report

Summary

All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

Assembly	Ca_Roseilinea_gracile_MS_genome_assembly
# contigs (>= 0 bp)	439
# contigs (>= 1000 bp)	439
# contigs (>= 10000 bp)	62
# contigs (>= 100000 bp)	0
# contigs (>= 1000000 bp)	0
Total length (>= 0 bp)	2635638
Total length (>= 1000 bp)	2635638
Total length (>= 10000 bp)	849391
Total length (>= 100000 bp)	0
Total length (>= 1000000 bp)	0
# contigs	439
Largest contig	37322
Total length	2635638
GC (%)	62.95
N50	7110
N75	4420
L50	117
L75	237
# N's per 100 kbp	74.59
# predicted genes (unique)	2480
# predicted genes (>= 0 bp)	2316 + 165 part
# predicted genes (>= 300 bp)	2014 + 143 part
# predicted genes (>= 1500 bp)	305 + 31 part
# predicted genes (>= 3000 bp)	24 + 1 part
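Unlike the other two assemblies, this one reports a non-zero "# N's per 100 kbp" value. As a rough worked example, the rate can be converted back to an approximate count of uncalled (N) bases by scaling it with the total assembly length. The helper name `estimated_n_count` is hypothetical, and the result is only an estimate because the reported rate is rounded to two decimals.

```python
def estimated_n_count(ns_per_100kbp, total_length_bp):
    """Approximate number of N bases implied by a per-100-kbp rate."""
    return round(ns_per_100kbp * total_length_bp / 100_000)


# Values copied from the QUAST summary for this assembly.
approx_ns = estimated_n_count(74.59, 2635638)
print(approx_ns)
```

Under these assumptions the assembly contains on the order of two thousand N bases in about 2.6 Mbp.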

Links

report.html

[3] Assess Genome Quality Using CheckM

Assess Genome Quality with CheckM - v1.0.18

Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.

This app completed without errors in 8m 38s.

Report

Links

CheckM_Plot.html - Summarized report from CheckM

Files

These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604

CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
full_output.zip - Full output of CheckM
plots.zip - Output plots from CheckM

Assess Genome Quality with CheckM - v1.0.18

Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.

This app completed without errors in 6m 26s.

Report

Links

CheckM_Plot.html - Summarized report from CheckM

Files

These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604

CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
full_output.zip - Full output of CheckM
plots.zip - Output plots from CheckM

Assess Genome Quality with CheckM - v1.0.18

Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.

This app completed without errors in 10m 12s.

Report

Links

CheckM_Plot.html - Summarized report from CheckM

Files

These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604

CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
full_output.zip - Full output of CheckM
plots.zip - Output plots from CheckM

[4] Annotate Assemblies and Genomes

Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)

Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.

This app completed without errors in 2m 55s.

Input Objects

Assembly or Genome

RoseilNK_assembly

Parameters

Scientific name

Ca. Roseilinea sp. NK_OTU-006

Kingdom

Bacteria

Genus

Genetic code

Raw product

Fast

Min.contig size

E-value

Rfam

No rRNA

No tRNA

Output Objects

Output genome

Ca.Roseil_genome

Objects

Created Object Name	Type	Description
Ca.Roseil_genome	Genome	Annotated Genome

Summary

Annotated Genome saved to: joval:narrative_1586534178841/Ca.Roseil_genome
Number of genes predicted: 3137
Number of protein coding genes: 3087
Number of genes with non-hypothetical function: 1987
Number of genes with EC-number: 1258
Number of genes with Seed Subsystem Ontology: 955
Average protein length: 352 aa.
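The counts in the Prokka summary are easier to compare across genomes as fractions of the protein-coding genes. The sketch below parses the "Number of ...: N" items from the summary text and derives those fractions. The function name `parse_counts` is hypothetical, and using the protein-coding gene count as the denominator is an assumption rather than something the report states.

```python
import re

# Summary text copied from the Prokka report above.
summary = (
    "Number of genes predicted: 3137 "
    "Number of protein coding genes: 3087 "
    "Number of genes with non-hypothetical function: 1987 "
    "Number of genes with EC-number: 1258 "
    "Number of genes with Seed Subsystem Ontology: 955 "
    "Average protein length: 352 aa."
)


def parse_counts(text):
    """Map each 'Number of <label>: <int>' item to its integer value."""
    pattern = re.compile(r"Number of ([^:]+): (\d+)")
    return {label.strip(): int(value) for label, value in pattern.findall(text)}


counts = parse_counts(summary)
coding = counts["protein coding genes"]  # assumed denominator

for label in (
    "genes with non-hypothetical function",
    "genes with EC-number",
    "genes with Seed Subsystem Ontology",
):
    share = 100 * counts[label] / coding
    print(f"{label}: {counts[label]} of {coding} ({share:.1f}%)")
```

The same parser works whether the summary is kept on one line or split into one item per line, since the pattern only relies on the "Number of" prefix and the colon.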

Output from Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)

The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/59604

Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)

Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.

This app completed without errors in 4m 45s.

Input Objects

Assembly or Genome

JP3_7_PGTN01.1.fsa_nt_assembly

Parameters

Scientific name

JP3_7 C3 Thermofonsia

Kingdom

Bacteria

Genus

Genetic code

Raw product

Fast

Min.contig size

E-value

Rfam

No rRNA

No tRNA

Output Objects

Output genome

JP3_7_genome

Objects

Created Object Name	Type	Description
JP3_7_genome	Genome	Annotated Genome

Summary

Annotated Genome saved to: joval:narrative_1586534178841/JP3_7_genome
Number of genes predicted: 3021
Number of protein coding genes: 2977
Number of genes with non-hypothetical function: 1853
Number of genes with EC-number: 1192
Number of genes with Seed Subsystem Ontology: 918
Average protein length: 314 aa.

Output from Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)

The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/59604

Annotate Multiple Microbial Assemblies

Annotate bacterial or archaeal assemblies and/or assembly sets using RASTtk.

This app completed without errors in 10m 43s.

Input Objects

Assemblies/AssemblySets

RoseilNK_assembly

JP3_7_PGTN01.1.fsa_nt_assembly

Ca_Roseilinea_gracile_MS_genome_assembly

Parameters

Assembly list

Scientific Name

Ca_Roseilinea_spp.

Domain

Genetic Code

Call rRNAs

Call tRNA trnascan

Call selenoproteins

Call pyrrolysoproteins

Call SEED repeat region

Call strep suis repeats

Call strep pneumo repeats

Call crisprs

Call glimmer3

Call prodigal

Annotate proteins kmer v2

Annotate proteins Kmer v1

Annotate proteins similarity

Resolve overlapping features

Call features prophage phispy

Output Objects

Optional Output GenomeSet Name

Ca_Roseilinea_spp.Genomeset

Objects

Created Object Name	Type	Description
RoseilNK_assembly.RAST	Genome	Annotated genome
JP3_7_PGTN01.1.fsa_nt_assembly.RAST	Genome	Annotated genome
Ca_Roseilinea_gracile_MS_genome_assembly.RAST	Genome	Annotated genome
Ca_Roseilinea_spp.Genomeset	GenomeSet	Genome Set

Summary

The RAST algorithm was applied to annotating a genome sequence comprised of 117 contigs containing 3642138 nucleotides.
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3673 new features were called, of which 350 are non-coding.
Output genome has the following feature types:
Coding gene 3323
Non-coding crispr_array 2
Non-coding crispr_repeat 91
Non-coding crispr_spacer 89
Non-coding repeat 124
Non-coding rna 44
Overall, the genes have 1337 distinct functions.
The genes include 1779 genes with a SEED annotation ontology across 808 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
RoseilNK_assembly succeeded!
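Each RAST report lists the number of newly called features and then breaks the output genome down by feature type, so the two can be cross-checked. The sketch below sums the "Coding" and "Non-coding" rows of the RoseilNK_assembly report and compares them with the totals in the "new features were called" sentence. The function name `feature_totals` is hypothetical, and the check assumes, as in this report, that no original features were carried over.

```python
import re

# Lines copied from the RAST report for RoseilNK_assembly.
report = """\
In addition to the remaining original 0 coding features and 0 non-coding features, 3673 new features were called, of which 350 are non-coding.
Output genome has the following feature types:
Coding gene 3323
Non-coding crispr_array 2
Non-coding crispr_repeat 91
Non-coding crispr_spacer 89
Non-coding repeat 124
Non-coding rna 44
Overall, the genes have 1337 distinct functions.
"""


def feature_totals(text):
    """Sum the per-type counts, split into (coding, non-coding)."""
    coding = noncoding = 0
    for line in text.splitlines():
        match = re.fullmatch(r"(Coding|Non-coding) (\S+) (\d+)", line.strip())
        if match is None:
            continue  # not a feature-type row
        if match.group(1) == "Coding":
            coding += int(match.group(3))
        else:
            noncoding += int(match.group(3))
    return coding, noncoding


coding, noncoding = feature_totals(report)
called = re.search(
    r"(\d+) new features were called, of which (\d+) are non-coding", report
)
new_total, new_noncoding = int(called.group(1)), int(called.group(2))

print("coding + non-coding:", coding + noncoding, "reported:", new_total)
print("non-coding:", noncoding, "reported:", new_noncoding)
```

For this genome the per-type rows add up to the reported totals. For re-annotated genomes that retain original features, the comparison would need to include the retained counts as well.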

The RAST algorithm was applied to annotating a genome sequence comprised of 708 contigs containing 3379197 nucleotides.
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3627 new features were called, of which 105 are non-coding.
Output genome has the following feature types:
Coding gene 3522
Non-coding crispr_array 1
Non-coding crispr_repeat 10
Non-coding crispr_spacer 9
Non-coding repeat 44
Non-coding rna 41
Overall, the genes have 1360 distinct functions.
The genes include 1832 genes with a SEED annotation ontology across 821 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
JP3_7_PGTN01.1.fsa_nt_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 439 contigs containing 2635638 nucleotides.
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3027 new features were called, of which 275 are non-coding.
Output genome has the following feature types:
Coding gene 2752
Non-coding crispr_array 1
Non-coding crispr_repeat 12
Non-coding crispr_spacer 11
Non-coding repeat 209
Non-coding rna 42
Overall, the genes have 1099 distinct functions.
The genes include 1397 genes with a SEED annotation ontology across 696 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Ca_Roseilinea_gracile_MS_genome_assembly succeeded!

Files

These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604

annotation_report.Ca_Roseilinea_spp.Genomeset - Microbial Annotation Report

Annotate Multiple Microbial Genomes

Annotate or re-annotate bacterial or archaeal genomes and/or genome sets using RASTtk.

This app completed without errors in 33m 40s.

Input Objects

Genomes/GenomeSets

Ca_Roseilinea.genomeset

Parameters

Genome list

Call rRNAs

Call tRNA trnascan

Call selenoproteins

Call pyrrolysoproteins

Call SEED repeat region

Call strep suis repeats

Call strep pneumo repeats

Call crisprs

Call glimmer3

Call prodigal

Annotate proteins kmer v2

Annotate proteins Kmer v1

Annotate proteins similarity

Retain old annotations for hypotheticals

Resolve overlapping features

Call features prophage phispy

Output Objects

Optional Output GenomeSet Name

Ca_Roseilinea_neighbors_annotated.Genomeset

Objects

Created Object Name	Type	Description
GCF_000516515.1.RAST	Genome	Annotated genome
GCF_000018865.1.RAST	Genome	Annotated genome
GCF_000826145.1.RAST	Genome	Annotated genome
GCF_000183545.2.RAST	Genome	Annotated genome
GCF_000016665.1.RAST	Genome	Annotated genome
GCF_000024985.1.RAST	Genome	Annotated genome
GCF_000152145.1.RAST	Genome	Annotated genome
GCF_000383875.1.RAST	Genome	Annotated genome
GCF_000313915.1.RAST	Genome	Annotated genome
GCF_000281175.1.RAST	Genome	Annotated genome
GCF_001306135.1.RAST	Genome	Annotated genome
GCF_000526415.1.RAST	Genome	Annotated genome
GCF_001293545.1.RAST	Genome	Annotated genome
GCF_000017805.1.RAST	Genome	Annotated genome
GCF_001306145.1.RAST	Genome	Annotated genome
GCF_000745125.1.RAST	Genome	Annotated genome
GCF_000199675.1.RAST	Genome	Annotated genome
GCF_900187885.1.RAST	Genome	Annotated genome
GCF_001483965.1.RAST	Genome	Annotated genome
JP3_7_genome.RAST	Genome	Annotated genome
Ca_Roseilinea_gracile_MS_genome.RAST	Genome	Annotated genome
Chloroflexus_sp._MS-G.RAST	Genome	Annotated genome
GCF_000021945.1.RAST	Genome	Annotated genome
Ca.RoseilNK_genome.RAST	Genome	Annotated genome
Ca_Roseilinea_neighbors_annotated.Genomeset	GenomeSet	Genome Set

Summary

The RAST algorithm was applied to annotating an existing genome: Chloroflexus sp. Y-396-1.
The sequence for this genome is comprised of 1 contigs containing 4890986 nucleotides.
The input genome has 3710 existing coding features and 139 existing non-coding features.
Input genome has the following feature types:
Non-coding assembly_gap 3
Non-coding gene 60
Non-coding misc_binding 1
Non-coding ncRNA 1
Non-coding rRNA 9
Non-coding regulatory 11
Non-coding repeat_region 4
Non-coding tRNA 49
Non-coding tmRNA 1
gene 3710
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3710 coding features and 139 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 3710
Non-coding assembly_gap 3
Non-coding gene 60
Non-coding misc_binding 1
Non-coding ncRNA 1
Non-coding rRNA 9
Non-coding regulatory 11
Non-coding repeat_region 4
Non-coding tRNA 49
Non-coding tmRNA 1
Overall, the genes have 2555 distinct functions.
The genes include 1769 genes with a SEED annotation ontology across 1057 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000516515.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Chloroflexus aurantiacus J-10-fl.
The sequence for this genome is comprised of 1 contigs containing 5258541 nucleotides.
The input genome has 3853 existing coding features and 728 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 137
Non-coding misc_binding 6
Non-coding ncRNA 2
Non-coding rRNA 9
Non-coding repeat_region 5
Non-coding sig_peptide 519
Non-coding tRNA 49
Non-coding tmRNA 1
gene 3853
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3853 coding features and 728 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 3853
Non-coding gene 137
Non-coding misc_binding 6
Non-coding ncRNA 2
Non-coding rRNA 9
Non-coding repeat_region 5
Non-coding sig_peptide 519
Non-coding tRNA 49
Non-coding tmRNA 1
Overall, the genes have 2763 distinct functions.
The genes include 1653 genes with a SEED annotation ontology across 1089 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000018865.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Sporosarcina koreensis.
The sequence for this genome is comprised of 8 contigs containing 2912426 nucleotides.
The input genome has 2924 existing coding features and 0 existing non-coding features.
NOTE: Older input genomes did not properly separate coding and non-coding features.
Input genome has the following feature types:
Non-coding gene 93
gene 2792
pseudogene 39
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2792 coding features and 132 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 2792
Non-coding gene 93
Non-coding pseudogene 39
Overall, the genes have 1688 distinct functions.
The genes include 1954 genes with a SEED annotation ontology across 978 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000826145.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Thermaerobacter subterraneus DSM 13965.
The sequence for this genome is comprised of 2 contigs containing 2888741 nucleotides.
The input genome has 2369 existing coding features and 131 existing non-coding features.
Input genome has the following feature types:
Non-coding assembly_gap 3
Non-coding gene 57
Non-coding ncRNA 3
Non-coding rRNA 7
Non-coding regulatory 10
Non-coding repeat_region 4
Non-coding tRNA 46
Non-coding tmRNA 1
gene 2369
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2369 coding features and 131 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 2369
Non-coding assembly_gap 3
Non-coding gene 57
Non-coding ncRNA 3
Non-coding rRNA 7
Non-coding regulatory 10
Non-coding repeat_region 4
Non-coding tRNA 46
Non-coding tmRNA 1
Overall, the genes have 1429 distinct functions.
The genes include 1637 genes with a SEED annotation ontology across 847 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000183545.2 succeeded!

The RAST algorithm was applied to annotating an existing genome: Roseiflexus sp. RS-1.
The sequence for this genome is comprised of 1 contigs containing 5801598 nucleotides.
The input genome has 4765 existing coding features and 135 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 57
Non-coding ncRNA 2
Non-coding rRNA 6
Non-coding regulatory 12
Non-coding repeat_region 9
Non-coding tRNA 48
Non-coding tmRNA 1
gene 4765
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 4765 coding features and 135 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 4765
Non-coding gene 57
Non-coding ncRNA 2
Non-coding rRNA 6
Non-coding regulatory 12
Non-coding repeat_region 9
Non-coding tRNA 48
Non-coding tmRNA 1
Overall, the genes have 3013 distinct functions.
The genes include 2091 genes with a SEED annotation ontology across 1094 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000016665.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Sphaerobacter thermophilus DSM 20745.
The sequence for this genome is comprised of 2 contigs containing 3993764 nucleotides.
The input genome has 3468 existing coding features and 134 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 60
Non-coding ncRNA 3
Non-coding rRNA 6
Non-coding regulatory 13
Non-coding repeat_region 1
Non-coding tRNA 50
Non-coding tmRNA 1
gene 3468
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3468 coding features and 134 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 3468
Non-coding gene 60
Non-coding ncRNA 3
Non-coding rRNA 6
Non-coding regulatory 13
Non-coding repeat_region 1
Non-coding tRNA 50
Non-coding tmRNA 1
Overall, the genes have 1839 distinct functions.
The genes include 2176 genes with a SEED annotation ontology across 961 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000024985.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Oscillochloris trichoides DG-6.
The sequence for this genome is comprised of 7 contigs containing 4373075 nucleotides.
The input genome has 3486 existing coding features and 263 existing non-coding features.
Input genome has the following feature types:
Non-coding assembly_gap 140
Non-coding gene 52
Non-coding ncRNA 2
Non-coding rRNA 3
Non-coding regulatory 7
Non-coding repeat_region 12
Non-coding tRNA 46
Non-coding tmRNA 1
gene 3486
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3486 coding features and 263 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 3486
Non-coding assembly_gap 140
Non-coding gene 52
Non-coding ncRNA 2
Non-coding rRNA 3
Non-coding regulatory 7
Non-coding repeat_region 12
Non-coding tRNA 46
Non-coding tmRNA 1
Overall, the genes have 2023 distinct functions.
The genes include 2008 genes with a SEED annotation ontology across 937 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000152145.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Caldibacillus debilis DSM 16016.
The sequence for this genome is comprised of 40 contigs containing 3059517 nucleotides.
The input genome has 2687 existing coding features and 182 existing non-coding features.
Input genome has the following feature types:
Non-coding assembly_gap 2
Non-coding gene 74
Non-coding misc_binding 10
Non-coding misc_feature 3
Non-coding ncRNA 3
Non-coding rRNA 12
Non-coding regulatory 13
Non-coding repeat_region 6
Non-coding tRNA 58
Non-coding tmRNA 1
gene 2687
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2687 coding features and 182 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 2687
Non-coding assembly_gap 2
Non-coding gene 74
Non-coding misc_binding 10
Non-coding misc_feature 3
Non-coding ncRNA 3
Non-coding rRNA 12
Non-coding regulatory 13
Non-coding repeat_region 6
Non-coding tRNA 58
Non-coding tmRNA 1
Overall, the genes have 1742 distinct functions.
The genes include 1780 genes with a SEED annotation ontology across 970 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000383875.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Catellicoccus marimammalium M35/04/3.
The sequence for this genome is comprised of 25 contigs containing 1285866 nucleotides.
The input genome has 1203 existing coding features and 120 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 56
Non-coding misc_binding 2
Non-coding misc_feature 3
Non-coding ncRNA 3
Non-coding rRNA 4
Non-coding regulatory 2
Non-coding repeat_region 1
Non-coding tRNA 48
Non-coding tmRNA 1
gene 1203
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 1203 coding features and 120 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 1203
Non-coding gene 56
Non-coding misc_binding 2
Non-coding misc_feature 3
Non-coding ncRNA 3
Non-coding rRNA 4
Non-coding regulatory 2
Non-coding repeat_region 1
Non-coding tRNA 48
Non-coding tmRNA 1
Overall, the genes have 870 distinct functions.
The genes include 876 genes with a SEED annotation ontology across 550 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000313915.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Caldilinea aerophila DSM 14535 = NBRC 104270.
The sequence for this genome is comprised of 1 contigs containing 5144873 nucleotides.
The input genome has 4103 existing coding features and 126 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 57
Non-coding ncRNA 3
Non-coding rRNA 6
Non-coding regulatory 8
Non-coding repeat_region 4
Non-coding tRNA 47
Non-coding tmRNA 1
gene 4103
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 4103 coding features and 126 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 4103
Non-coding gene 57
Non-coding ncRNA 3
Non-coding rRNA 6
Non-coding regulatory 8
Non-coding repeat_region 4
Non-coding tRNA 47
Non-coding tmRNA 1
Overall, the genes have 1736 distinct functions.
The genes include 3039 genes with a SEED annotation ontology across 948 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000281175.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Herpetosiphon geysericola.
The sequence for this genome is comprised of 46 contigs containing 6140412 nucleotides.
The input genome has 5288 existing coding features and 126 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 52
Non-coding misc_feature 6
Non-coding ncRNA 2
Non-coding rRNA 2
Non-coding regulatory 10
Non-coding repeat_region 6
Non-coding tRNA 47
Non-coding tmRNA 1
gene 5288
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 5288 coding features and 126 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 5288
Non-coding gene 52
Non-coding misc_feature 6
Non-coding ncRNA 2
Non-coding rRNA 2
Non-coding regulatory 10
Non-coding repeat_region 6
Non-coding tRNA 47
Non-coding tmRNA 1
Overall, the genes have 2286 distinct functions.
The genes include 3362 genes with a SEED annotation ontology across 1030 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_001306135.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: bacterium JKG1 Bacteria..
The sequence for this genome is comprised of 4 contigs containing 4475263 nucleotides.
The input genome has 3924 existing coding features and 147 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 62
Non-coding misc_binding 1
Non-coding ncRNA 1
Non-coding rRNA 9
Non-coding regulatory 15
Non-coding repeat_region 7
Non-coding tRNA 51
Non-coding tmRNA 1
gene 3924
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3924 coding features and 147 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 3924
Non-coding gene 62
Non-coding misc_binding 1
Non-coding ncRNA 1
Non-coding rRNA 9
Non-coding regulatory 15
Non-coding repeat_region 7
Non-coding tRNA 51
Non-coding tmRNA 1
Overall, the genes have 1810 distinct functions.
The genes include 2882 genes with a SEED annotation ontology across 1000 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000526415.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Ardenticatena maritima.
The sequence for this genome is comprised of 308 contigs containing 3569367 nucleotides.
The input genome has 3215 existing coding features and 153 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 66
Non-coding ncRNA 2
Non-coding rRNA 16
Non-coding regulatory 10
Non-coding repeat_region 11
Non-coding tRNA 47
Non-coding tmRNA 1
gene 3215
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3215 coding features and 153 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 3215
Non-coding gene 66
Non-coding ncRNA 2
Non-coding rRNA 16
Non-coding regulatory 10
Non-coding repeat_region 11
Non-coding tRNA 47
Non-coding tmRNA 1
Overall, the genes have 1322 distinct functions.
The genes include 2595 genes with a SEED annotation ontology across 807 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_001293545.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Roseiflexus castenholzii DSM 13941.
The sequence for this genome is comprised of 1 contig containing 5723298 nucleotides.
The input genome has 4647 existing coding features and 129 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 57
Non-coding ncRNA 2
Non-coding rRNA 6
Non-coding regulatory 9
Non-coding repeat_region 6
Non-coding tRNA 48
Non-coding tmRNA 1
gene 4647
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 4647 coding features and 129 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 4647
Non-coding gene 57
Non-coding ncRNA 2
Non-coding rRNA 6
Non-coding regulatory 9
Non-coding repeat_region 6
Non-coding tRNA 48
Non-coding tmRNA 1
Overall, the genes have 2964 distinct functions.
The genes include 2058 genes with a SEED annotation ontology across 1082 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000017805.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Thermanaerothrix daxensis.
The sequence for this genome is comprised of 6 contigs containing 3012066 nucleotides.
The input genome has 2745 existing coding features and 122 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 54
Non-coding ncRNA 3
Non-coding rRNA 3
Non-coding regulatory 10
Non-coding repeat_region 4
Non-coding tRNA 47
Non-coding tmRNA 1
gene 2745
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2745 coding features and 122 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 2745
Non-coding gene 54
Non-coding ncRNA 3
Non-coding rRNA 3
Non-coding regulatory 10
Non-coding repeat_region 4
Non-coding tRNA 47
Non-coding tmRNA 1
Overall, the genes have 1338 distinct functions.
The genes include 2094 genes with a SEED annotation ontology across 788 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_001306145.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Carnobacterium jeotgali MS3.
The sequence for this genome is comprised of 12 contigs containing 2518244 nucleotides.
The input genome has 2348 existing coding features and 246 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 109
Non-coding misc_binding 14
Non-coding misc_feature 3
Non-coding ncRNA 3
Non-coding rRNA 30
Non-coding regulatory 11
Non-coding tRNA 75
Non-coding tmRNA 1
gene 2348
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2348 coding features and 246 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 2348
Non-coding gene 109
Non-coding misc_binding 14
Non-coding misc_feature 3
Non-coding ncRNA 3
Non-coding rRNA 30
Non-coding regulatory 11
Non-coding tRNA 75
Non-coding tmRNA 1
Overall, the genes have 1447 distinct functions.
The genes include 1583 genes with a SEED annotation ontology across 845 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000745125.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Anaerolinea thermophila UNI-1.
The sequence for this genome is comprised of 1 contig containing 3532378 nucleotides.
The input genome has 3125 existing coding features and 336 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 59
Non-coding ncRNA 3
Non-coding rRNA 6
Non-coding regulatory 10
Non-coding repeat_region 208
Non-coding tRNA 49
Non-coding tmRNA 1
gene 3125
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3125 coding features and 336 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 3125
Non-coding gene 59
Non-coding ncRNA 3
Non-coding rRNA 6
Non-coding regulatory 10
Non-coding repeat_region 208
Non-coding tRNA 49
Non-coding tmRNA 1
Overall, the genes have 1662 distinct functions.
The genes include 1928 genes with a SEED annotation ontology across 808 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000199675.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Thermoflexus hugenholtzii JAD2.
The sequence for this genome is comprised of 78 contigs containing 3216964 nucleotides.
The input genome has 2899 existing coding features and 123 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 54
Non-coding ncRNA 2
Non-coding rRNA 3
Non-coding regulatory 9
Non-coding repeat_region 6
Non-coding tRNA 48
Non-coding tmRNA 1
gene 2899
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2899 coding features and 123 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 2899
Non-coding gene 54
Non-coding ncRNA 2
Non-coding rRNA 3
Non-coding regulatory 9
Non-coding repeat_region 6
Non-coding tRNA 48
Non-coding tmRNA 1
Overall, the genes have 1252 distinct functions.
The genes include 2325 genes with a SEED annotation ontology across 779 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_900187885.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Carnobacterium sp. CP1.
The sequence for this genome is comprised of 2 contigs containing 2614401 nucleotides.
The input genome has 2378 existing coding features and 222 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 106
Non-coding ncRNA 3
Non-coding rRNA 25
Non-coding regulatory 10
Non-coding tRNA 77
Non-coding tmRNA 1
gene 2378
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2378 coding features and 222 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 2378
Non-coding gene 106
Non-coding ncRNA 3
Non-coding rRNA 25
Non-coding regulatory 10
Non-coding tRNA 77
Non-coding tmRNA 1
Overall, the genes have 1492 distinct functions.
The genes include 1625 genes with a SEED annotation ontology across 874 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_001483965.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: JP3_7 C3 Thermofonsia.
The sequence for this genome is comprised of 708 contigs containing 3379197 nucleotides.
The input genome has 2977 existing coding features and 44 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 44
gene 2977
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2977 coding features and 44 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 2977
Non-coding gene 44
Overall, the genes have 1245 distinct functions.
The genes include 2402 genes with a SEED annotation ontology across 772 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
JP3_7_genome succeeded!

Some RAST tools will not run unless the taxonomic domain is Archaea, Bacteria, or Virus.
These tools include: call selenoproteins, call pyrrolysoproteins, call crisprs, and call prophage phispy features.
You may not get the results you were expecting with your current domain of Unknown.
The RAST algorithm was applied to annotating an existing genome: 'Ca. Roseilinea gracile' YNP-MS-B-OTU-6, metagenome bin-6 (2.5kb). Bacteria.
The sequence for this genome is comprised of 439 contigs containing 2635638 nucleotides.
The input genome has 2329 existing coding features and 44 existing non-coding features.
Input genome has the following feature types:
Non-coding rRNA 3
Non-coding tRNA 41
gene 2329
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2329 coding features and 44 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 2329
Non-coding rRNA 3
Non-coding tRNA 41
Overall, the genes have 1031 distinct functions.
The genes include 1887 genes with a SEED annotation ontology across 657 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Ca_Roseilinea_gracile_MS_genome succeeded!

The RAST algorithm was applied to annotating an existing genome: Chloroflexus sp. MS-G.
The sequence for this genome is comprised of 251 contigs containing 4770266 nucleotides.
The input genome has 3915 existing coding features and 0 existing non-coding features.
NOTE: Older input genomes did not properly separate coding and non-coding features.
Input genome has the following feature types:
Non-coding gene 53
gene 3659
pseudogene 203
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3659 coding features and 256 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 3659
Non-coding gene 53
Non-coding pseudogene 203
Overall, the genes have 2567 distinct functions.
The genes include 1706 genes with a SEED annotation ontology across 1052 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Chloroflexus_sp._MS-G succeeded!

The RAST algorithm was applied to annotating an existing genome: Chloroflexus aggregans DSM 9485.
The sequence for this genome is comprised of 1 contig containing 4684931 nucleotides.
The input genome has 3811 existing coding features and 134 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 58
Non-coding ncRNA 1
Non-coding rRNA 9
Non-coding regulatory 9
Non-coding repeat_region 9
Non-coding tRNA 47
Non-coding tmRNA 1
gene 3811
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3811 coding features and 134 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 3811
Non-coding gene 58
Non-coding ncRNA 1
Non-coding rRNA 9
Non-coding regulatory 9
Non-coding repeat_region 9
Non-coding tRNA 47
Non-coding tmRNA 1
Overall, the genes have 2650 distinct functions.
The genes include 1751 genes with a SEED annotation ontology across 1061 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000021945.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Ca. Roseilinea sp. NK_OTU-006.
The sequence for this genome is comprised of 117 contigs containing 3642138 nucleotides.
The input genome has 3087 existing coding features and 50 existing non-coding features.
Input genome has the following feature types:
Non-coding gene 50
gene 3087
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3087 coding features and 50 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
Coding gene 3087
Non-coding gene 50
Overall, the genes have 1327 distinct functions.
The genes include 2451 genes with a SEED annotation ontology across 800 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Ca.RoseilNK_genome succeeded!
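
To compare the reports above at a glance, the share of coding genes that received a SEED annotation can be computed from the two counts each report prints. The sketch below copies the counts for four of the genomes from the reports; the helper name `seed_coverage` is illustrative and not part of KBase, and it assumes the "genes with a SEED annotation ontology" figure refers to coding genes.

```python
# (coding genes, genes with a SEED annotation), copied from the RAST reports above.
reports = {
    "Ca. Roseilinea sp. NK_OTU-006": (3087, 2451),
    "'Ca. Roseilinea gracile' MS": (2329, 1887),
    "JP3_7 C3 Thermofonsia": (2977, 2402),
    "Roseiflexus castenholzii DSM 13941": (4647, 2058),
}


def seed_coverage(coding_genes: int, seed_genes: int) -> float:
    """Return the percentage of coding genes that carry a SEED annotation."""
    if coding_genes <= 0:
        raise ValueError("coding_genes must be positive")
    return 100.0 * seed_genes / coding_genes


for name, (coding, seed) in reports.items():
    print(f"{name}: {seed_coverage(coding, seed):.1f}% of coding genes SEED-annotated")
```

Because the assumption about the denominator may not hold for every genome type, the percentages are best read as a rough comparison rather than an exact statistic.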

Files

These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604

annotation_report.Ca_Roseilinea_neighbors_annotated.Genomeset - Microbial Annotation Report

Annotate Domains in a Genome

Annotate a Genome object with protein domains from widely used domain libraries.

This app completed without errors in 2h 16m 54s.

Objects

Created Object Name	Type	Description
Ca.Roseil_annotation_domains	DomainAnnotation	Domain Annotations

Summary

Search Domains output:
Getting DomainModelSet from storage.
Getting Genome from storage.
Running domain search against library 2959/19/1
Running domain search against library 2959/18/1
Running domain search against library 2959/24/1
Running domain search against library 2959/25/1
Running domain search against library 2959/23/1
Running domain search against library 2959/7/7
Running domain search against library 2959/20/1
Running domain search against library 2959/17/1
Running domain search against library 2959/21/1
Running domain search against library 2959/22/1

Annotate Domains in a Genome

Annotate a Genome object with protein domains from widely used domain libraries.

This app completed without errors in 1h 52m 10s.

Objects

Created Object Name	Type	Description
JP3_7_annotation_domains	DomainAnnotation	Domain Annotations

Summary

Annotate Domains in a GenomeSet

Annotate domains in every Genome within a GenomeSet using protein domains from widely used domain libraries.

This app completed without errors in 1d 6h 17m 18s.

Summary

Search Domains output:
Getting DomainModelSet from storage.
Getting Genome from storage.
Running domain search against library 2959/1/7
Running domain search against library 2959/6/6
Running domain search against library 2959/7/6
Running domain search against library 2959/4/6
Running domain search against library 2959/5/7

Annotate Domains in a GenomeSet

Annotate domains in every Genome within a GenomeSet using protein domains from widely used domain libraries.

This app completed without errors in 14h 16m 43s.

Summary

[5] Compute ANI

Compute ANI with FastANI

Computes a fast estimate of whole-genome Average Nucleotide Identity (ANI) between genomes.

This app completed without errors in 5m 4s.

Report

Links

index.html - FastANI HTML report
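
The KBase app presents its results as an HTML report, but the underlying FastANI command-line tool writes tab-separated rows of query genome, reference genome, ANI (%), matched fragments, and total query fragments. The sketch below parses rows in that layout. The file names and values are invented for illustration, and the 95% cutoff is a commonly used species-level rule of thumb, not a setting taken from this Narrative.

```python
import csv
import io

# Hypothetical FastANI-style rows (tab-separated); values are invented.
EXAMPLE_OUTPUT = (
    "genomeA.fa\tgenomeB.fa\t96.52\t812\t1100\n"
    "genomeA.fa\tgenomeC.fa\t78.10\t95\t1100\n"
)

SPECIES_ANI_CUTOFF = 95.0  # assumed rule-of-thumb threshold


def parse_fastani(text):
    """Parse FastANI-style tab-separated text into a list of dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        query, reference, ani, matched, total = fields
        rows.append({
            "query": query,
            "reference": reference,
            "ani": float(ani),
            "matched_fragments": int(matched),
            "total_fragments": int(total),
        })
    return rows


def same_species(row, cutoff=SPECIES_ANI_CUTOFF):
    """Return True when the pair's ANI meets or exceeds the cutoff."""
    return row["ani"] >= cutoff


for row in parse_fastani(EXAMPLE_OUTPUT):
    label = "same species (by cutoff)" if same_species(row) else "different species"
    print(row["query"], row["reference"], row["ani"], label)
```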

[6] Construct a Genome Tree

Insert Genome Into Species Tree 2.1.10

Add one or more genomes to a KBase species tree.

This app completed without errors in 3m 14s.

Report

Links

Ca_Roseilinea_neighbors.html

Files

These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604

Ca_Roseilinea_neighbors.newick
Ca_Roseilinea_neighbors-labels.newick
Ca_Roseilinea_neighbors.png
Ca_Roseilinea_neighbors.pdf
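
The `.newick` files listed above store the tree as nested parentheses. As a minimal sketch of how such a file could be inspected outside KBase, the function below extracts leaf names from a Newick string with a regular expression. The example tree is invented, and the approach assumes unquoted labels and at least one pair of parentheses; a full Newick parser would be needed for quoted names or comments.

```python
import re

# A leaf label is any run of label characters that directly follows "(" or ",".
# Internal node labels follow ")" and are therefore not matched.
_LEAF_PATTERN = re.compile(r"[(,]\s*([^(),:;]+)")


def leaf_names(newick: str) -> list:
    """Return leaf labels of a Newick tree in left-to-right order."""
    return [name.strip() for name in _LEAF_PATTERN.findall(newick)]


# Hypothetical tree with three leaves and branch lengths.
example_tree = "((A:0.1,B:0.2):0.05,C:0.3);"
print(leaf_names(example_tree))
```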

[7] Genome Visualization

Circular Genome Visualization Tool

Generate a map and annotations of circular genomes using CGView.

This app completed without errors in 4m 10s.

Input Objects

Genome

Ca.RoseilNK_genome

Parameters

Linear

GC Content

GC Skew

AT Content

AT Skew

Average

Scale

Orfs

Combined Orfs ('Orfs' must be selected)

Orf Size ('Orfs' must be selected)

100

Tick Density

0.5

Details

Legend

Condensed

Feature Labels

Orf Labels ('Orfs' must be selected)

Show Sequence Features

Report

Links

KBase_derived_Ca.RoseilNK_genome Map

Files

These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604

KBase_derived_Ca.RoseilNK_genome.png
KBase_derived_Ca.RoseilNK_genome.jpg
KBase_derived_Ca.RoseilNK_genome.svg
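
The GC Content and GC Skew tracks in the map are window-based statistics of the sequence. The sketch below uses the standard definitions, GC content = (G + C) / window length and GC skew = (G - C) / (G + C), over non-overlapping windows. The function name and window size are illustrative, and CGView's own windowing and scaling may differ.

```python
def gc_stats(seq: str, window: int):
    """Yield (start, gc_content, gc_skew) for consecutive non-overlapping windows.

    A trailing partial window is ignored. Skew is reported as 0.0 when a
    window contains no G or C, to avoid dividing by zero.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / window
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        yield start, gc_content, gc_skew


# Toy sequence: one G-rich window followed by one AT-only window.
for start, content, skew in gc_stats("GGGCATAT", window=4):
    print(start, content, skew)
```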

[8] Construct Pangenome

Compute Pangenome

Allows users to compute a pangenome from a set of individual genomes.

This app completed without errors in 20m 22s.

Objects

Created Object Name	Type	Description
Ca_Roseilinea_24genomes.Pangenome	Pangenome	Pangenome

Summary

Pangenome saved to joval:narrative_1586534178841/Ca_Roseilinea_24genomes.Pangenome

Ca_Roseilinea_24genomes.Pangenome

v1 - KBaseGenomes.Pangenome-4.1

The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/59604
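
A pangenome groups genes from the input genomes into families, which are then commonly described as core (found in every genome), singleton (found in one genome), or accessory (everything in between). The sketch below shows that classification on invented data; the family and genome names are hypothetical, and the category labels are the conventional ones rather than fields of the KBase Pangenome object.

```python
def classify_families(families: dict, n_genomes: int) -> dict:
    """Sort gene families into core, accessory, and singleton lists.

    `families` maps a family ID to the genomes that contain a member gene.
    """
    result = {"core": [], "accessory": [], "singleton": []}
    for family_id, genomes in families.items():
        present_in = len(set(genomes))  # count each genome once
        if present_in == n_genomes:
            result["core"].append(family_id)
        elif present_in == 1:
            result["singleton"].append(family_id)
        else:
            result["accessory"].append(family_id)
    return result


# Hypothetical families across three genomes.
example = {
    "fam_1": ["genome_A", "genome_B", "genome_C"],
    "fam_2": ["genome_A", "genome_B"],
    "fam_3": ["genome_C"],
}
print(classify_families(example, n_genomes=3))
```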

[9] Build Metabolic Model

Build Metabolic Model

Generate a draft metabolic model based on an annotated genome.

This app completed without errors in 1m 36s.

Input Objects

Genome

Ca.RoseilNK_genome.RAST

Gapfilling Media (defaults to complete media)

Parameters

Template for reconstruction

core

Gapfill model?

Custom flux bounds

Media supplement

Minimum reaction flux

0.1

Output Objects

Output model

Ca.RoseilineaNK_metabolicmodel

Objects

Created Object Name	Type	Description
Ca.RoseilineaNK_metabolicmodel	FBAModel	FBAModel-12 Ca.RoseilineaNK_metabolicmodel
Ca.RoseilineaNK_metabolicmodel.gf.0	FBA	FBA-13 Ca.RoseilineaNK_metabolicmodel.gf.0

Report

Output from Build Metabolic Model

The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/59604
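
The FBA object listed above is the result of a flux balance analysis run on the gapfilled model. For reference, the textbook formulation is the linear program below. This is the standard statement of the problem, not a description of KBase's specific gapfilling procedure, and the symbols are the conventional ones rather than names taken from this Narrative.

```latex
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
\end{aligned}
```

Here $S$ is the stoichiometric matrix (metabolites by reactions), $v$ is the vector of reaction fluxes, $c$ selects the objective (typically the biomass reaction), and $v^{\mathrm{lb}}$, $v^{\mathrm{ub}}$ are the lower and upper flux bounds.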

[10] View Genome Function Profiles

View Function Profile for a Phylogenetic Tree - v1.4.0

Examine the distribution of functional gene families for organisms in a phylogenetic SpeciesTree.

This app completed without errors in 2m 26s.

Input Objects

Tree

Genome_RAST_annotated_Tree

Parameters

Domain Selection

SEED

Custom Domains

Custom Domain Groups (COG)

Custom Domain Groups (PFAM)

Custom Domain Groups (TIGRFAM)

SEED Functional Group

Display genome as

sci_name

View values as

raw_count

View table values as

Heatmap Log Base (optional)

E-value Upper Limit

1e-05

Count annotated genes using

Genes requiring COG annot

Genes requiring PFAM annot

Genes requiring TIGR annot

Genes with validated SEED annot

Count SEED hypothetical

Empty categories

Skip missing genomes

Enforce genome version match

Report

Links

domain_profile_report.html

View Function Profile for a Phylogenetic Tree - v1.4.0

Examine the distribution of functional gene families for organisms in a phylogenetic SpeciesTree.

This app completed without errors in 3m 22s.

Input Objects

Tree

Genome_RAST_annotated_Tree

Parameters

Domain Selection

TIGR

Custom Domains

Custom Domain Groups (COG)

Custom Domain Groups (PFAM)

Custom Domain Groups (TIGRFAM)

SEED Functional Group

Display genome as

sci_name

View values as

raw_count

View table values as

Heatmap Log Base (optional)

E-value Upper Limit

1e-05

Count annotated genes using

Genes requiring COG annot

Genes requiring PFAM annot

Genes requiring TIGR annot

Genes with validated SEED annot

Count SEED hypothetical

Empty categories

Skip missing genomes

Enforce genome version match

Report

Links

domain_profile_report.html

View Function Profile for a Phylogenetic Tree - v1.4.0

Examine the distribution of functional gene families for organisms in a phylogenetic SpeciesTree.

This app completed without errors in 2m 11s.

Input Objects

Tree

Genome_RAST_annotated_Tree

Parameters

Domain Selection

SEED

Custom Domains

Custom Domain Groups (COG)

Custom Domain Groups (PFAM)

Custom Domain Groups (TIGRFAM)

SEED Functional Group

Display genome as

sci_name

View values as

raw_count

View table values as

Heatmap Log Base (optional)

E-value Upper Limit

1e-05

Count annotated genes using

Genes requiring COG annot

Genes requiring PFAM annot

Genes requiring TIGR annot

Genes with validated SEED annot

Count SEED hypothetical

Empty categories

Skip missing genomes

Enforce genome version match

Report

Links

domain_profile_report.html

View Function Profile for a Phylogenetic Tree - v1.4.0

Examine the distribution of functional gene families for organisms in a phylogenetic SpeciesTree.

This app completed without errors in 3m 32s.

Input Objects

Tree

Genome_RAST_annotated_Tree

Parameters

Domain Selection

COG

Custom Domains

Custom Domain Groups (COG)

Custom Domain Groups (PFAM)

Custom Domain Groups (TIGRFAM)

SEED Functional Group

Display genome as

sci_name

View values as

perc_annot

View table values as

Heatmap Log Base (optional)

E-value Upper Limit

1e-05

Count annotated genes using

Genes requiring COG annot

Genes requiring PFAM annot

Genes requiring TIGR annot

Genes with validated SEED annot

Count SEED hypothetical

Empty categories

Skip missing genomes

Enforce genome version match

Report

Links

domain_profile_report.html

View Function Profile for a Phylogenetic Tree - v1.4.0

Examine the distribution of functional gene families for organisms in a phylogenetic SpeciesTree.

This app completed without errors in 3m 1s.

Input Objects

Tree

Genome_RAST_annotated_Tree

Parameters

Domain Selection

SEED

Custom Domains

Custom Domain Groups (COG)

Custom Domain Groups (PFAM)

Custom Domain Groups (TIGRFAM)

SEED Functional Group

Display genome as

obj_name_sci_name

View values as

perc_annot

View table values as

Heatmap Log Base (optional)

E-value Upper Limit

1e-05

Count annotated genes using

Genes requiring COG annot

Genes requiring PFAM annot

Genes requiring TIGR annot

Genes with validated SEED annot

Count SEED hypothetical

Empty categories

Skip missing genomes

Enforce genome version match

Report

Links

domain_profile_report.html
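
The runs above differ mainly in the namespace (SEED, TIGR, COG) and in whether values are shown as `raw_count` or `perc_annot`, with hits filtered at an E-value upper limit of 1e-05. The sketch below shows one way such a profile could be computed for a single genome. It assumes `perc_annot` means the percentage relative to genes that carry at least one annotation in the chosen namespace; the hit tuples and the function name are hypothetical, not the app's actual data model.

```python
from collections import defaultdict

EVALUE_MAX = 1e-05  # "E-value Upper Limit" shown in the parameters above


def function_profile(hits, evalue_max=EVALUE_MAX):
    """Return (raw_count, perc_annot) dicts keyed by functional category.

    `hits` is an iterable of (gene_id, category, evalue) tuples.
    """
    genes_by_category = defaultdict(set)
    for gene_id, category, evalue in hits:
        if evalue <= evalue_max:  # keep only sufficiently significant hits
            genes_by_category[category].add(gene_id)

    # Genes with at least one retained hit form the denominator for perc_annot.
    annotated = set()
    for genes in genes_by_category.values():
        annotated |= genes

    raw_count = {cat: len(genes) for cat, genes in genes_by_category.items()}
    perc_annot = {
        cat: 100.0 * count / len(annotated) for cat, count in raw_count.items()
    }
    return raw_count, perc_annot


# Hypothetical hits; the last one fails the E-value filter.
example_hits = [
    ("g1", "COG0001", 1e-30),
    ("g2", "COG0001", 1e-10),
    ("g2", "COG0002", 1e-06),
    ("g3", "COG0002", 1e-03),
]
raw, perc = function_profile(example_hits)
print(raw, perc)
```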

Apps

Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)
- Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30: 2068–2069. doi:10.1093/bioinformatics/btu153
Annotate Domains in a Genome
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389–3402. doi:10.1093/nar/25.17.3389
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
- Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195
- Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44: D279–D285. doi:10.1093/nar/gkv1344
- Haft DH, Selengut JD, Richter RA, Harkins D, Basu MK, Beck E. TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res. 2013;41: D387–D395. doi:10.1093/nar/gks1234
- Letunic I, Bork P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 2018;46: D493–D496. doi:10.1093/nar/gkx922
- Letunic I, Doerks T, Bork P. SMART: recent updates, new developments and status in 2015. Nucleic Acids Res. 2015;43: D257–260. doi:10.1093/nar/gku949
- Marchler-Bauer A, Bo Y, Han L, He J, Lanczycki CJ, Lu S, et al. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 2017;45: D200–D203. doi:10.1093/nar/gkw1129
- Selengut JD, Haft DH, Davidsen T, Ganapathy A, Gwinn-Giglio M, Nelson WC, et al. TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 2007;35: D260–264. doi:10.1093/nar/gkl1043
- Tatusov RL, Koonin EV, Lipman DJ. A Genomic Perspective on Protein Families. Science. 1997;278: 631–637. doi:10.1126/science.278.5338.631
Annotate Domains in a GenomeSet
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389–3402. doi:10.1093/nar/25.17.3389
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
- Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195
- Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44: D279-D285. doi:10.1093/nar/gkv1344
- Haft DH, Selengut JD, Richter RA, Harkins D, Basu MK, Beck E. TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res. 2013;41: D387-D395. doi:10.1093/nar/gks1234
- Letunic I, Bork P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 2018;46: D493-D496. doi:10.1093/nar/gkx922
- Letunic I, Doerks T, Bork P. SMART: recent updates, new developments and status in 2015. Nucleic Acids Res. 2015;43: D257-260. doi:10.1093/nar/gku949
- Marchler-Bauer A, Bo Y, Han L, He J, Lanczycki CJ, Lu S, et al. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 2017;45: D200-D203. doi:10.1093/nar/gkw1129
- Selengut JD, Haft DH, Davidsen T, Ganapathy A, Gwinn-Giglio M, Nelson WC, et al. TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 2007;35: D260-264. doi:10.1093/nar/gkl1043
- Tatusov RL, Koonin EV, Lipman DJ. A Genomic Perspective on Protein Families. Science. 1997;278: 631-637. doi:10.1126/science.278.5338.631
Annotate Multiple Microbial Assemblies
- [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
- [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226
- [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365
- [4] Kent WJ. BLAT - The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656-664. doi:10.1101/gr.229202
- [5] Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
- [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955-964.
- [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793-803. doi:10.1007/s00792-012-0482-8
- [8] Siguier P, Perochon J, Lestrade L, Mahillon J, Chandler M. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res. 2006;34: D32-D36. doi:10.1093/nar/gkj014
- [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176-1179.
- [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120
- [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
- [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673-679. doi:10.1093/bioinformatics/btm009
- [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406
Annotate Multiple Microbial Genomes
- [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
- [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226
- [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365
- [4] Kent WJ. BLAT - The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656-664. doi:10.1101/gr.229202
- [5] Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
- [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955-964.
- [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793-803. doi:10.1007/s00792-012-0482-8
- [8] Siguier P, Perochon J, Lestrade L, Mahillon J, Chandler M. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res. 2006;34: D32-D36. doi:10.1093/nar/gkj014
- [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176-1179.
- [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120
- [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
- [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673-679. doi:10.1093/bioinformatics/btm009
- [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406
Assess Genome Quality with CheckM - v1.0.18
- Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043-1055. doi:10.1101/gr.186072.114
- CheckM source:
- Additional info:
Assess Quality of Assemblies with QUAST - v4.4
- [1] Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013;29: 1072-1075. doi:10.1093/bioinformatics/btt086
- [2] Mikheenko A, Valin G, Prjibelski A, Saveliev V, Gurevich A. Icarus: visualizer for de novo assembly evaluation. Bioinformatics. 2016;32: 3321-3323. doi:10.1093/bioinformatics/btw379
Build Metabolic Model
- [1] Henry CS, DeJongh M, Best AA, Frybarger PM, Linsay B, Stevens RL. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat Biotechnol. 2010;28: 977-982. doi:10.1038/nbt.1672
- [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226
- [3] Latendresse M. Efficiently gap-filling reaction networks. BMC Bioinformatics. 2014;15: 225. doi:10.1186/1471-2105-15-225
- [4] Dreyfuss JM, Zucker JD, Hood HM, Ocasio LR, Sachs MS, Galagan JE. Reconstruction and Validation of a Genome-Scale Metabolic Model for the Filamentous Fungus Neurospora crassa Using FARM. PLOS Computational Biology. 2013;9: e1003126. doi:10.1371/journal.pcbi.1003126
- [5] Mahadevan R, Schilling CH. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab Eng. 2003;5: 264-276.
Circular Genome Visualization Tool

no citations
Compute ANI with FastANI
- [1] Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High-throughput ANI Analysis of 90K Prokaryotic Genomes Reveals Clear Species Boundaries. 2017; doi:10.1101/225342
- [2] Goris J, Konstantinidis KT, Klappenbach JA, Coenye T, Vandamme P, Tiedje JM. DNA-DNA hybridization values and their relationship to whole-genome sequence similarities. Int J Syst Evol Microbiol. 2007;57: 81-91. doi:10.1099/ijs.0.64483-0
- FastANI module and source code:
Compute Pangenome
- Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi:10.1038/nbt.4163
Insert Genome Into SpeciesTree - v2.2.0
- Price MN, Dehal PS, Arkin AP. FastTree 2 - Approximately Maximum-Likelihood Trees for Large Alignments. PLoS One. 2010;5. doi:10.1371/journal.pone.0009490
View Function Profile for a Phylogenetic Tree - v1.4.0
- Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi:10.1038/nbt.4163