Generated September 23, 2020

Draft genome of Candidatus Roseilinea sp. NK_OTU-006 recovered from metagenomic data of a hot spring microbial mat

Table of Contents

  1. Data Import
  2. Assess Quality of Assemblies
  3. Assess Genome Quality Using CheckM
  4. Annotate Assemblies and Genomes
  5. Compute ANI
  6. Construct a Genome Tree
  7. Genome Visualization
  8. Compute Pangenome
  9. Build Metabolic Model
  10. View Genome Function Profiles

[1] Data Import

v1 - KBaseGenomeAnnotations.Assembly-5.0
v1 - KBaseGenomes.Genome-11.0
v1 - KBaseGenomeAnnotations.Assembly-5.0
v1 - KBaseGenomes.Genome-11.0
v1 - KBaseGenomeAnnotations.Assembly-5.0
v1 - KBaseGenomes.Genome-11.0
The viewers for the data in these Cells are available at the original Narrative here: https://narrative.kbase.us/narrative/59604

[2] Assess Quality of Assemblies

Run QUAST (QUality ASsessment Tool) on a set of Assemblies to assess their quality.
This app completed without errors in 1m 26s.
Summary
All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

	Assembly                          RoseilNK_assembly
	# contigs (>= 0 bp)               117
	# contigs (>= 1000 bp)            117
	# contigs (>= 10000 bp)           48
	# contigs (>= 100000 bp)          9
	# contigs (>= 1000000 bp)         0
	Total length (>= 0 bp)            3642138
	Total length (>= 1000 bp)         3642138
	Total length (>= 10000 bp)        3396310
	Total length (>= 100000 bp)       1677642
	Total length (>= 1000000 bp)      0
	# contigs                         117
	Largest contig                    332664
	Total length                      3642138
	GC (%)                            63.39
	N50                               94936
	N75                               56899
	L50                               11
	L75                               24
	# N's per 100 kbp                 0.00
	# predicted genes (unique)        3161
	# predicted genes (>= 0 bp)       3126 + 39 part
	# predicted genes (>= 300 bp)     2836 + 36 part
	# predicted genes (>= 1500 bp)    506 + 6 part
	# predicted genes (>= 3000 bp)    59 + 0 part
Run QUAST (QUality ASsessment Tool) on a set of Assemblies to assess their quality.
This app completed without errors in 1m 49s.
Summary
All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

	Assembly                          JP3_7_PGTN01.1.fsa_nt_assembly
	# contigs (>= 0 bp)               708
	# contigs (>= 1000 bp)            251
	# contigs (>= 10000 bp)           91
	# contigs (>= 100000 bp)          0
	# contigs (>= 1000000 bp)         0
	Total length (>= 0 bp)            3379197
	Total length (>= 1000 bp)         3067753
	Total length (>= 10000 bp)        2638098
	Total length (>= 100000 bp)       0
	Total length (>= 1000000 bp)      0
	# contigs                         708
	Largest contig                    94883
	Total length                      3379197
	GC (%)                            63.83
	N50                               28546
	N75                               13093
	L50                               40
	L75                               82
	# N's per 100 kbp                 0.00
	# predicted genes (unique)        3300
	# predicted genes (>= 0 bp)       3022 + 278 part
	# predicted genes (>= 300 bp)     2716 + 253 part
	# predicted genes (>= 1500 bp)    397 + 11 part
	# predicted genes (>= 3000 bp)    33 + 0 part
Run QUAST (QUality ASsessment Tool) on a set of Assemblies to assess their quality.
This app completed without errors in 1m 34s.
Summary
All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

	Assembly                          Ca_Roseilinea_gracile_MS_genome_assembly
	# contigs (>= 0 bp)               439
	# contigs (>= 1000 bp)            439
	# contigs (>= 10000 bp)           62
	# contigs (>= 100000 bp)          0
	# contigs (>= 1000000 bp)         0
	Total length (>= 0 bp)            2635638
	Total length (>= 1000 bp)         2635638
	Total length (>= 10000 bp)        849391
	Total length (>= 100000 bp)       0
	Total length (>= 1000000 bp)      0
	# contigs                         439
	Largest contig                    37322
	Total length                      2635638
	GC (%)                            62.95
	N50                               7110
	N75                               4420
	L50                               117
	L75                               237
	# N's per 100 kbp                 74.59
	# predicted genes (unique)        2480
	# predicted genes (>= 0 bp)       2316 + 165 part
	# predicted genes (>= 300 bp)     2014 + 143 part
	# predicted genes (>= 1500 bp)    305 + 31 part
	# predicted genes (>= 3000 bp)    24 + 1 part
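
The N50, N75, L50 and L75 values reported by QUAST above summarize assembly contiguity. As a rough illustration of how such values are conventionally derived, the sketch below computes Nx and Lx from a list of contig lengths. The function name nx_lx and the toy lengths are hypothetical and are not taken from this Narrative; the actual contig lengths of the three assemblies are not listed in this report, so the numbers printed here do not correspond to any of them.

```python
def nx_lx(contig_lengths, x=50):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least x% of the assembly;
    Lx is the number of contigs in that set.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count


# Hypothetical toy assembly (total length 300), for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy_lengths, 50)
n75, l75 = nx_lx(toy_lengths, 75)
print("N50", n50, "L50", l50)
print("N75", n75, "L75", l75)
```

Read this way, a higher N50 together with a lower L50 (as for RoseilNK_assembly relative to the other two assemblies) indicates that fewer, longer contigs account for half of the assembled sequence.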

[3] Assess Genome Quality Using CheckM

Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 8m 38s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 6m 26s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 10m 12s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM

[4] Annotate Assemblies and Genomes

Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.
This app completed without errors in 2m 55s.
Objects
Created Object Name Type Description
Ca.Roseil_genome Genome Annotated Genome
Summary
Annotated Genome saved to: joval:narrative_1586534178841/Ca.Roseil_genome
Number of genes predicted: 3137
Number of protein coding genes: 3087
Number of genes with non-hypothetical function: 1987
Number of genes with EC-number: 1258
Number of genes with Seed Subsystem Ontology: 955
Average protein length: 352 aa.
Output from Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/59604
Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.
This app completed without errors in 4m 45s.
Objects
Created Object Name Type Description
JP3_7_genome Genome Annotated Genome
Summary
Annotated Genome saved to: joval:narrative_1586534178841/JP3_7_genome
Number of genes predicted: 3021
Number of protein coding genes: 2977
Number of genes with non-hypothetical function: 1853
Number of genes with EC-number: 1192
Number of genes with Seed Subsystem Ontology: 918
Average protein length: 314 aa.
Output from Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/59604
Annotate bacterial or archaeal assemblies and/or assembly sets using RASTtk.
This app completed without errors in 10m 43s.
Objects
Created Object Name Type Description
RoseilNK_assembly.RAST Genome Annotated genome
JP3_7_PGTN01.1.fsa_nt_assembly.RAST Genome Annotated genome
Ca_Roseilinea_gracile_MS_genome_assembly.RAST Genome Annotated genome
Ca_Roseilinea_spp.Genomeset GenomeSet Genome Set
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 117 contigs containing 3642138 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3673 new features were called, of which 350 are non-coding.
Output genome has the following feature types:
	Coding gene                     3323 
	Non-coding crispr_array            2 
	Non-coding crispr_repeat          91 
	Non-coding crispr_spacer          89 
	Non-coding repeat                124 
	Non-coding rna                    44 
Overall, the genes have 1337 distinct functions. 
The genes include 1779 genes with a SEED annotation ontology across 808 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
RoseilNK_assembly succeeded!
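
The feature-type table in the RoseilNK_assembly report above can be cross-checked against the stated totals (3673 new features, of which 350 are non-coding). The sketch below is one possible way to do that in Python; the helper name parse_feature_table and the regular expression are assumptions made for illustration and are not part of RASTtk or KBase. The table text is copied from the report above.

```python
import re


def parse_feature_table(text):
    """Parse 'feature type   count' lines into a dict of integer counts."""
    counts = {}
    for line in text.splitlines():
        # Feature name, then whitespace, then a trailing integer count.
        match = re.match(r"\s*(.+?)\s+(\d+)\s*$", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


# Feature table copied from the RoseilNK_assembly RAST report.
report = """
    Coding gene                     3323
    Non-coding crispr_array            2
    Non-coding crispr_repeat          91
    Non-coding crispr_spacer          89
    Non-coding repeat                124
    Non-coding rna                    44
"""

counts = parse_feature_table(report)
non_coding = sum(n for name, n in counts.items() if name.startswith("Non-coding"))
total = sum(counts.values())
print("non-coding features:", non_coding)
print("all features:", total)
```

The same check can be applied to the feature tables of the other assemblies by pasting in their table text.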

The RAST algorithm was applied to annotating a genome sequence comprised of 708 contigs containing 3379197 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3627 new features were called, of which 105 are non-coding.
Output genome has the following feature types:
	Coding gene                     3522 
	Non-coding crispr_array            1 
	Non-coding crispr_repeat          10 
	Non-coding crispr_spacer           9 
	Non-coding repeat                 44 
	Non-coding rna                    41 
Overall, the genes have 1360 distinct functions. 
The genes include 1832 genes with a SEED annotation ontology across 821 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
JP3_7_PGTN01.1.fsa_nt_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 439 contigs containing 2635638 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3027 new features were called, of which 275 are non-coding.
Output genome has the following feature types:
	Coding gene                     2752 
	Non-coding crispr_array            1 
	Non-coding crispr_repeat          12 
	Non-coding crispr_spacer          11 
	Non-coding repeat                209 
	Non-coding rna                    42 
Overall, the genes have 1099 distinct functions. 
The genes include 1397 genes with a SEED annotation ontology across 696 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Ca_Roseilinea_gracile_MS_genome_assembly succeeded!

Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604
  • annotation_report.Ca_Roseilinea_spp.Genomeset - Microbial Annotation Report
Annotate or re-annotate bacterial or archaeal genomes and/or genome sets using RASTtk.
This app completed without errors in 33m 40s.
Objects
Created Object Name Type Description
GCF_000516515.1.RAST Genome Annotated genome
GCF_000018865.1.RAST Genome Annotated genome
GCF_000826145.1.RAST Genome Annotated genome
GCF_000183545.2.RAST Genome Annotated genome
GCF_000016665.1.RAST Genome Annotated genome
GCF_000024985.1.RAST Genome Annotated genome
GCF_000152145.1.RAST Genome Annotated genome
GCF_000383875.1.RAST Genome Annotated genome
GCF_000313915.1.RAST Genome Annotated genome
GCF_000281175.1.RAST Genome Annotated genome
GCF_001306135.1.RAST Genome Annotated genome
GCF_000526415.1.RAST Genome Annotated genome
GCF_001293545.1.RAST Genome Annotated genome
GCF_000017805.1.RAST Genome Annotated genome
GCF_001306145.1.RAST Genome Annotated genome
GCF_000745125.1.RAST Genome Annotated genome
GCF_000199675.1.RAST Genome Annotated genome
GCF_900187885.1.RAST Genome Annotated genome
GCF_001483965.1.RAST Genome Annotated genome
JP3_7_genome.RAST Genome Annotated genome
Ca_Roseilinea_gracile_MS_genome.RAST Genome Annotated genome
Chloroflexus_sp._MS-G.RAST Genome Annotated genome
GCF_000021945.1.RAST Genome Annotated genome
Ca.RoseilNK_genome.RAST Genome Annotated genome
Ca_Roseilinea_neighbors_annotated.Genomeset GenomeSet Genome Set
Summary
The RAST algorithm was applied to annotating an existing genome: Chloroflexus sp. Y-396-1. 
The sequence for this genome is comprised of 1 contigs containing 4890986 nucleotides. 
The input genome has 3710 existing coding features and 139 existing non-coding features.
Input genome has the following feature types:
	Non-coding assembly_gap            3 
	Non-coding gene                   60 
	Non-coding misc_binding            1 
	Non-coding ncRNA                   1 
	Non-coding rRNA                    9 
	Non-coding regulatory             11 
	Non-coding repeat_region           4 
	Non-coding tRNA                   49 
	Non-coding tmRNA                   1 
	gene                            3710 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3710 coding features and 139 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     3710 
	Non-coding assembly_gap            3 
	Non-coding gene                   60 
	Non-coding misc_binding            1 
	Non-coding ncRNA                   1 
	Non-coding rRNA                    9 
	Non-coding regulatory             11 
	Non-coding repeat_region           4 
	Non-coding tRNA                   49 
	Non-coding tmRNA                   1 
Overall, the genes have 2555 distinct functions. 
The genes include 1769 genes with a SEED annotation ontology across 1057 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000516515.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Chloroflexus aurantiacus J-10-fl. 
The sequence for this genome is comprised of 1 contigs containing 5258541 nucleotides. 
The input genome has 3853 existing coding features and 728 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                  137 
	Non-coding misc_binding            6 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    9 
	Non-coding repeat_region           5 
	Non-coding sig_peptide           519 
	Non-coding tRNA                   49 
	Non-coding tmRNA                   1 
	gene                            3853 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3853 coding features and 728 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     3853 
	Non-coding gene                  137 
	Non-coding misc_binding            6 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    9 
	Non-coding repeat_region           5 
	Non-coding sig_peptide           519 
	Non-coding tRNA                   49 
	Non-coding tmRNA                   1 
Overall, the genes have 2763 distinct functions. 
The genes include 1653 genes with a SEED annotation ontology across 1089 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000018865.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Sporosarcina koreensis. 
The sequence for this genome is comprised of 8 contigs containing 2912426 nucleotides. 
The input genome has 2924 existing coding features and 0 existing non-coding features.
NOTE: Older input genomes did not properly separate coding and non-coding features.
Input genome has the following feature types:
	Non-coding gene                   93 
	gene                            2792 
	pseudogene                        39 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2792 coding features and 132 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     2792 
	Non-coding gene                   93 
	Non-coding pseudogene             39 
Overall, the genes have 1688 distinct functions. 
The genes include 1954 genes with a SEED annotation ontology across 978 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000826145.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Thermaerobacter subterraneus DSM 13965. 
The sequence for this genome is comprised of 2 contigs containing 2888741 nucleotides. 
The input genome has 2369 existing coding features and 131 existing non-coding features.
Input genome has the following feature types:
	Non-coding assembly_gap            3 
	Non-coding gene                   57 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    7 
	Non-coding regulatory             10 
	Non-coding repeat_region           4 
	Non-coding tRNA                   46 
	Non-coding tmRNA                   1 
	gene                            2369 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2369 coding features and 131 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     2369 
	Non-coding assembly_gap            3 
	Non-coding gene                   57 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    7 
	Non-coding regulatory             10 
	Non-coding repeat_region           4 
	Non-coding tRNA                   46 
	Non-coding tmRNA                   1 
Overall, the genes have 1429 distinct functions. 
The genes include 1637 genes with a SEED annotation ontology across 847 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000183545.2 succeeded!

The RAST algorithm was applied to annotating an existing genome: Roseiflexus sp. RS-1. 
The sequence for this genome is comprised of 1 contigs containing 5801598 nucleotides. 
The input genome has 4765 existing coding features and 135 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   57 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    6 
	Non-coding regulatory             12 
	Non-coding repeat_region           9 
	Non-coding tRNA                   48 
	Non-coding tmRNA                   1 
	gene                            4765 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 4765 coding features and 135 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     4765 
	Non-coding gene                   57 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    6 
	Non-coding regulatory             12 
	Non-coding repeat_region           9 
	Non-coding tRNA                   48 
	Non-coding tmRNA                   1 
Overall, the genes have 3013 distinct functions. 
The genes include 2091 genes with a SEED annotation ontology across 1094 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000016665.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Sphaerobacter thermophilus DSM 20745. 
The sequence for this genome is comprised of 2 contigs containing 3993764 nucleotides. 
The input genome has 3468 existing coding features and 134 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   60 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    6 
	Non-coding regulatory             13 
	Non-coding repeat_region           1 
	Non-coding tRNA                   50 
	Non-coding tmRNA                   1 
	gene                            3468 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3468 coding features and 134 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     3468 
	Non-coding gene                   60 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    6 
	Non-coding regulatory             13 
	Non-coding repeat_region           1 
	Non-coding tRNA                   50 
	Non-coding tmRNA                   1 
Overall, the genes have 1839 distinct functions. 
The genes include 2176 genes with a SEED annotation ontology across 961 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000024985.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Oscillochloris trichoides DG-6. 
The sequence for this genome is comprised of 7 contigs containing 4373075 nucleotides. 
The input genome has 3486 existing coding features and 263 existing non-coding features.
Input genome has the following feature types:
	Non-coding assembly_gap          140 
	Non-coding gene                   52 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    3 
	Non-coding regulatory              7 
	Non-coding repeat_region          12 
	Non-coding tRNA                   46 
	Non-coding tmRNA                   1 
	gene                            3486 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3486 coding features and 263 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     3486 
	Non-coding assembly_gap          140 
	Non-coding gene                   52 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    3 
	Non-coding regulatory              7 
	Non-coding repeat_region          12 
	Non-coding tRNA                   46 
	Non-coding tmRNA                   1 
Overall, the genes have 2023 distinct functions. 
The genes include 2008 genes with a SEED annotation ontology across 937 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000152145.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Caldibacillus debilis DSM 16016. 
The sequence for this genome is comprised of 40 contigs containing 3059517 nucleotides. 
The input genome has 2687 existing coding features and 182 existing non-coding features.
Input genome has the following feature types:
	Non-coding assembly_gap            2 
	Non-coding gene                   74 
	Non-coding misc_binding           10 
	Non-coding misc_feature            3 
	Non-coding ncRNA                   3 
	Non-coding rRNA                   12 
	Non-coding regulatory             13 
	Non-coding repeat_region           6 
	Non-coding tRNA                   58 
	Non-coding tmRNA                   1 
	gene                            2687 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2687 coding features and 182 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     2687 
	Non-coding assembly_gap            2 
	Non-coding gene                   74 
	Non-coding misc_binding           10 
	Non-coding misc_feature            3 
	Non-coding ncRNA                   3 
	Non-coding rRNA                   12 
	Non-coding regulatory             13 
	Non-coding repeat_region           6 
	Non-coding tRNA                   58 
	Non-coding tmRNA                   1 
Overall, the genes have 1742 distinct functions. 
The genes include 1780 genes with a SEED annotation ontology across 970 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000383875.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Catellicoccus marimammalium M35/04/3. 
The sequence for this genome is comprised of 25 contigs containing 1285866 nucleotides. 
The input genome has 1203 existing coding features and 120 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   56 
	Non-coding misc_binding            2 
	Non-coding misc_feature            3 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    4 
	Non-coding regulatory              2 
	Non-coding repeat_region           1 
	Non-coding tRNA                   48 
	Non-coding tmRNA                   1 
	gene                            1203 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 1203 coding features and 120 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     1203 
	Non-coding gene                   56 
	Non-coding misc_binding            2 
	Non-coding misc_feature            3 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    4 
	Non-coding regulatory              2 
	Non-coding repeat_region           1 
	Non-coding tRNA                   48 
	Non-coding tmRNA                   1 
Overall, the genes have 870 distinct functions. 
The genes include 876 genes with a SEED annotation ontology across 550 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000313915.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Caldilinea aerophila DSM 14535 = NBRC 104270. 
The sequence for this genome is comprised of 1 contigs containing 5144873 nucleotides. 
The input genome has 4103 existing coding features and 126 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   57 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    6 
	Non-coding regulatory              8 
	Non-coding repeat_region           4 
	Non-coding tRNA                   47 
	Non-coding tmRNA                   1 
	gene                            4103 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 4103 coding features and 126 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     4103 
	Non-coding gene                   57 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    6 
	Non-coding regulatory              8 
	Non-coding repeat_region           4 
	Non-coding tRNA                   47 
	Non-coding tmRNA                   1 
Overall, the genes have 1736 distinct functions. 
The genes include 3039 genes with a SEED annotation ontology across 948 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000281175.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Herpetosiphon geysericola. 
The sequence for this genome is comprised of 46 contigs containing 6140412 nucleotides. 
The input genome has 5288 existing coding features and 126 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   52 
	Non-coding misc_feature            6 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    2 
	Non-coding regulatory             10 
	Non-coding repeat_region           6 
	Non-coding tRNA                   47 
	Non-coding tmRNA                   1 
	gene                            5288 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 5288 coding features and 126 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     5288 
	Non-coding gene                   52 
	Non-coding misc_feature            6 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    2 
	Non-coding regulatory             10 
	Non-coding repeat_region           6 
	Non-coding tRNA                   47 
	Non-coding tmRNA                   1 
Overall, the genes have 2286 distinct functions. 
The genes include 3362 genes with a SEED annotation ontology across 1030 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_001306135.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: bacterium JKG1 Bacteria.. 
The sequence for this genome is comprised of 4 contigs containing 4475263 nucleotides. 
The input genome has 3924 existing coding features and 147 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   62 
	Non-coding misc_binding            1 
	Non-coding ncRNA                   1 
	Non-coding rRNA                    9 
	Non-coding regulatory             15 
	Non-coding repeat_region           7 
	Non-coding tRNA                   51 
	Non-coding tmRNA                   1 
	gene                            3924 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3924 coding features and 147 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     3924 
	Non-coding gene                   62 
	Non-coding misc_binding            1 
	Non-coding ncRNA                   1 
	Non-coding rRNA                    9 
	Non-coding regulatory             15 
	Non-coding repeat_region           7 
	Non-coding tRNA                   51 
	Non-coding tmRNA                   1 
Overall, the genes have 1810 distinct functions. 
The genes include 2882 genes with a SEED annotation ontology across 1000 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000526415.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Ardenticatena maritima. 
The sequence for this genome is comprised of 308 contigs containing 3569367 nucleotides. 
The input genome has 3215 existing coding features and 153 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   66 
	Non-coding ncRNA                   2 
	Non-coding rRNA                   16 
	Non-coding regulatory             10 
	Non-coding repeat_region          11 
	Non-coding tRNA                   47 
	Non-coding tmRNA                   1 
	gene                            3215 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3215 coding features and 153 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     3215 
	Non-coding gene                   66 
	Non-coding ncRNA                   2 
	Non-coding rRNA                   16 
	Non-coding regulatory             10 
	Non-coding repeat_region          11 
	Non-coding tRNA                   47 
	Non-coding tmRNA                   1 
Overall, the genes have 1322 distinct functions. 
The genes include 2595 genes with a SEED annotation ontology across 807 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_001293545.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Roseiflexus castenholzii DSM 13941. 
The sequence for this genome is comprised of 1 contig containing 5723298 nucleotides.
The input genome has 4647 existing coding features and 129 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   57 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    6 
	Non-coding regulatory              9 
	Non-coding repeat_region           6 
	Non-coding tRNA                   48 
	Non-coding tmRNA                   1 
	gene                            4647 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 4647 coding features and 129 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     4647 
	Non-coding gene                   57 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    6 
	Non-coding regulatory              9 
	Non-coding repeat_region           6 
	Non-coding tRNA                   48 
	Non-coding tmRNA                   1 
Overall, the genes have 2964 distinct functions. 
The genes include 2058 genes with a SEED annotation ontology across 1082 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000017805.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Thermanaerothrix daxensis. 
The sequence for this genome is comprised of 6 contigs containing 3012066 nucleotides. 
The input genome has 2745 existing coding features and 122 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   54 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    3 
	Non-coding regulatory             10 
	Non-coding repeat_region           4 
	Non-coding tRNA                   47 
	Non-coding tmRNA                   1 
	gene                            2745 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2745 coding features and 122 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     2745 
	Non-coding gene                   54 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    3 
	Non-coding regulatory             10 
	Non-coding repeat_region           4 
	Non-coding tRNA                   47 
	Non-coding tmRNA                   1 
Overall, the genes have 1338 distinct functions. 
The genes include 2094 genes with a SEED annotation ontology across 788 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_001306145.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Carnobacterium jeotgali MS3. 
The sequence for this genome is comprised of 12 contigs containing 2518244 nucleotides. 
The input genome has 2348 existing coding features and 246 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                  109 
	Non-coding misc_binding           14 
	Non-coding misc_feature            3 
	Non-coding ncRNA                   3 
	Non-coding rRNA                   30 
	Non-coding regulatory             11 
	Non-coding tRNA                   75 
	Non-coding tmRNA                   1 
	gene                            2348 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2348 coding features and 246 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     2348 
	Non-coding gene                  109 
	Non-coding misc_binding           14 
	Non-coding misc_feature            3 
	Non-coding ncRNA                   3 
	Non-coding rRNA                   30 
	Non-coding regulatory             11 
	Non-coding tRNA                   75 
	Non-coding tmRNA                   1 
Overall, the genes have 1447 distinct functions. 
The genes include 1583 genes with a SEED annotation ontology across 845 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000745125.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Anaerolinea thermophila UNI-1. 
The sequence for this genome is comprised of 1 contig containing 3532378 nucleotides.
The input genome has 3125 existing coding features and 336 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   59 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    6 
	Non-coding regulatory             10 
	Non-coding repeat_region         208 
	Non-coding tRNA                   49 
	Non-coding tmRNA                   1 
	gene                            3125 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3125 coding features and 336 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     3125 
	Non-coding gene                   59 
	Non-coding ncRNA                   3 
	Non-coding rRNA                    6 
	Non-coding regulatory             10 
	Non-coding repeat_region         208 
	Non-coding tRNA                   49 
	Non-coding tmRNA                   1 
Overall, the genes have 1662 distinct functions. 
The genes include 1928 genes with a SEED annotation ontology across 808 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000199675.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Thermoflexus hugenholtzii JAD2. 
The sequence for this genome is comprised of 78 contigs containing 3216964 nucleotides. 
The input genome has 2899 existing coding features and 123 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   54 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    3 
	Non-coding regulatory              9 
	Non-coding repeat_region           6 
	Non-coding tRNA                   48 
	Non-coding tmRNA                   1 
	gene                            2899 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2899 coding features and 123 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     2899 
	Non-coding gene                   54 
	Non-coding ncRNA                   2 
	Non-coding rRNA                    3 
	Non-coding regulatory              9 
	Non-coding repeat_region           6 
	Non-coding tRNA                   48 
	Non-coding tmRNA                   1 
Overall, the genes have 1252 distinct functions. 
The genes include 2325 genes with a SEED annotation ontology across 779 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_900187885.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Carnobacterium sp. CP1. 
The sequence for this genome is comprised of 2 contigs containing 2614401 nucleotides. 
The input genome has 2378 existing coding features and 222 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                  106 
	Non-coding ncRNA                   3 
	Non-coding rRNA                   25 
	Non-coding regulatory             10 
	Non-coding tRNA                   77 
	Non-coding tmRNA                   1 
	gene                            2378 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2378 coding features and 222 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     2378 
	Non-coding gene                  106 
	Non-coding ncRNA                   3 
	Non-coding rRNA                   25 
	Non-coding regulatory             10 
	Non-coding tRNA                   77 
	Non-coding tmRNA                   1 
Overall, the genes have 1492 distinct functions. 
The genes include 1625 genes with a SEED annotation ontology across 874 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_001483965.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: JP3_7 C3 Thermofonsia. 
The sequence for this genome is comprised of 708 contigs containing 3379197 nucleotides. 
The input genome has 2977 existing coding features and 44 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   44 
	gene                            2977 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2977 coding features and 44 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     2977 
	Non-coding gene                   44 
Overall, the genes have 1245 distinct functions. 
The genes include 2402 genes with a SEED annotation ontology across 772 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
JP3_7_genome succeeded!

Some RAST tools will not run unless the taxonomic domain is Archaea, Bacteria, or Virus. 
These tools include: call selenoproteins, call pyrrolysoproteins, call crisprs, and call prophage phispy features.
You may not get the results you were expecting with your current domain of Unknown.
The RAST algorithm was applied to annotating an existing genome: 'Ca. Roseilinea gracile' YNP-MS-B-OTU-6, metagenome bin-6 (2.5kb). Bacteria.
The sequence for this genome is comprised of 439 contigs containing 2635638 nucleotides. 
The input genome has 2329 existing coding features and 44 existing non-coding features.
Input genome has the following feature types:
	Non-coding rRNA                    3 
	Non-coding tRNA                   41 
	gene                            2329 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 2329 coding features and 44 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     2329 
	Non-coding rRNA                    3 
	Non-coding tRNA                   41 
Overall, the genes have 1031 distinct functions. 
The genes include 1887 genes with a SEED annotation ontology across 657 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Ca_Roseilinea_gracile_MS_genome succeeded!

The RAST algorithm was applied to annotating an existing genome: Chloroflexus sp. MS-G. 
The sequence for this genome is comprised of 251 contigs containing 4770266 nucleotides. 
The input genome has 3915 existing coding features and 0 existing non-coding features.
NOTE: Older input genomes did not properly separate coding and non-coding features.
Input genome has the following feature types:
	Non-coding gene                   53 
	gene                            3659 
	pseudogene                       203 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3659 coding features and 256 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     3659 
	Non-coding gene                   53 
	Non-coding pseudogene            203 
Overall, the genes have 2567 distinct functions. 
The genes include 1706 genes with a SEED annotation ontology across 1052 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Chloroflexus_sp._MS-G succeeded!

The RAST algorithm was applied to annotating an existing genome: Chloroflexus aggregans DSM 9485. 
The sequence for this genome is comprised of 1 contig containing 4684931 nucleotides.
The input genome has 3811 existing coding features and 134 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   58 
	Non-coding ncRNA                   1 
	Non-coding rRNA                    9 
	Non-coding regulatory              9 
	Non-coding repeat_region           9 
	Non-coding tRNA                   47 
	Non-coding tmRNA                   1 
	gene                            3811 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3811 coding features and 134 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     3811 
	Non-coding gene                   58 
	Non-coding ncRNA                   1 
	Non-coding rRNA                    9 
	Non-coding regulatory              9 
	Non-coding repeat_region           9 
	Non-coding tRNA                   47 
	Non-coding tmRNA                   1 
Overall, the genes have 2650 distinct functions. 
The genes include 1751 genes with a SEED annotation ontology across 1061 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
GCF_000021945.1 succeeded!

The RAST algorithm was applied to annotating an existing genome: Ca. Roseilinea sp. NK_OTU-006. 
The sequence for this genome is comprised of 117 contigs containing 3642138 nucleotides. 
The input genome has 3087 existing coding features and 50 existing non-coding features.
Input genome has the following feature types:
	Non-coding gene                   50 
	gene                            3087 
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 3087 coding features and 50 non-coding features, 0 new features were called, of which 0 are non-coding.
Output genome has the following feature types:
	Coding gene                     3087 
	Non-coding gene                   50 
Overall, the genes have 1327 distinct functions. 
The genes include 2451 genes with a SEED annotation ontology across 800 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Ca.RoseilNK_genome succeeded!
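
The per-genome summaries above can be compared more easily once their headline counts are pulled out of the text. The following is a minimal sketch in Python, not part of the KBase app: the function name summarize and the choice of the input coding-feature count as the denominator are assumptions made for illustration. The excerpt is copied from the Ca. Roseilinea sp. NK_OTU-006 block above.

```python
import re

# Excerpt copied from the Ca. Roseilinea sp. NK_OTU-006 block of this report.
REPORT = """\
The sequence for this genome is comprised of 117 contigs containing 3642138 nucleotides.
The input genome has 3087 existing coding features and 50 existing non-coding features.
Overall, the genes have 1327 distinct functions.
The genes include 2451 genes with a SEED annotation ontology across 800 distinct SEED functions.
"""

def summarize(report):
    """Extract headline counts from one genome's annotation report text.

    Hypothetical helper: the patterns only match the sentence wording
    seen in this report.
    """
    patterns = {
        "contigs": r"comprised of (\d+) contigs?",
        "nucleotides": r"containing (\d+) nucleotides",
        "coding": r"has (\d+) existing coding features",
        "seed_genes": r"include (\d+) genes with a SEED annotation",
        "seed_functions": r"across (\d+) distinct SEED functions",
    }
    stats = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, report)
        if match is None:
            raise ValueError(f"could not find {key!r} in report")
        stats[key] = int(match.group(1))
    # Assumption: use the input coding-feature count as the denominator.
    stats["seed_fraction"] = stats["seed_genes"] / stats["coding"]
    return stats

stats = summarize(REPORT)
print(f"{stats['seed_fraction']:.1%} of coding features carry a SEED annotation")
```

Running the same function over each block would give a quick table of SEED annotation coverage across the genome set. For genomes where the input and output coding counts differ (for example Chloroflexus sp. MS-G), the output "Coding gene" count may be the better denominator.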

Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604
  • annotation_report.Ca_Roseilinea_neighbors_annotated.Genomeset - Microbial Annotation Report
Annotate a Genome object with protein domains from widely used domain libraries.
This app completed without errors in 2h 16m 54s.
Objects
Created Object Name             Type               Description
Ca.Roseil_annotation_domains    DomainAnnotation   Domain Annotations
Summary
Search Domains output:
  Getting DomainModelSet from storage.
  Getting Genome from storage.
  Running domain search against library 2959/19/1
  Running domain search against library 2959/18/1
  Running domain search against library 2959/24/1
  Running domain search against library 2959/25/1
  Running domain search against library 2959/23/1
  Running domain search against library 2959/7/7
  Running domain search against library 2959/20/1
  Running domain search against library 2959/17/1
  Running domain search against library 2959/21/1
  Running domain search against library 2959/22/1
Annotate a Genome object with protein domains from widely used domain libraries.
This app completed without errors in 1h 52m 10s.
Objects
Created Object Name         Type               Description
JP3_7_annotation_domains    DomainAnnotation   Domain Annotations
Summary
Search Domains output:
  Getting DomainModelSet from storage.
  Getting Genome from storage.
  Running domain search against library 2959/19/1
  Running domain search against library 2959/18/1
  Running domain search against library 2959/24/1
  Running domain search against library 2959/25/1
  Running domain search against library 2959/23/1
  Running domain search against library 2959/7/7
  Running domain search against library 2959/20/1
  Running domain search against library 2959/17/1
  Running domain search against library 2959/21/1
  Running domain search against library 2959/22/1
Annotate domains in every Genome within a GenomeSet using protein domains from widely used domain libraries.
This app completed without errors in 1d 6h 17m 18s.
Summary
Search Domains output (identical text logged 24 times, once per genome in the set):
  Getting DomainModelSet from storage.
  Getting Genome from storage.
  Running domain search against library 2959/1/7
  Running domain search against library 2959/6/6
  Running domain search against library 2959/7/6
  Running domain search against library 2959/4/6
  Running domain search against library 2959/5/7
Annotate domains in every Genome within a GenomeSet using protein domains from widely used domain libraries.
This app completed without errors in 14h 16m 43s.
Summary
Search Domains output (identical text logged 10 times, once per genome in the set):
  Getting DomainModelSet from storage.
  Getting Genome from storage.
  Running domain search against library 2959/1/7
  Running domain search against library 2959/6/6
  Running domain search against library 2959/7/6
  Running domain search against library 2959/4/6
  Running domain search against library 2959/5/7

[5] Compute ANI

Allows users to compute fast whole-genome Average Nucleotide Identity (ANI) estimation.
This app completed without errors in 5m 4s.
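
To make the reported quantity concrete, the sketch below computes a toy ANI value in Python. It is an illustration of the idea only and does not reproduce the app's algorithm: it assumes the two sequences are already aligned position by position, cuts them into fixed-length fragments, and averages the per-fragment identity. The function names, the fragment length, and the example sequences are all made up for this illustration.

```python
def fragment_identity(a, b):
    """Fraction of matching positions between two equal-length fragments."""
    if len(a) != len(b) or not a:
        raise ValueError("fragments must be non-empty and equal in length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def toy_ani(query, reference, fragment_len=10):
    """Mean per-fragment identity in percent.

    Assumes the sequences are pre-aligned; real ANI tools must first find
    where each query fragment maps on the reference.
    """
    n = min(len(query), len(reference))
    identities = [
        fragment_identity(query[i:i + fragment_len], reference[i:i + fragment_len])
        for i in range(0, n - fragment_len + 1, fragment_len)
    ]
    if not identities:
        raise ValueError("sequences are shorter than one fragment")
    return 100.0 * sum(identities) / len(identities)

# Made-up sequences: the reference differs from the query at one position.
query = "ACGT" * 5
reference = query[:11] + "A" + query[12:]
print(f"toy ANI: {toy_ani(query, reference):.1f}%")
```

Averaging per fragment rather than per base means that every fragment contributes equally, regardless of where the mismatches cluster.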

[6] Construct a Genome Tree

Add one or more genomes to a KBase species tree.
This app completed without errors in 3m 14s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604
  • Ca_Roseilinea_neighbors.newick
  • Ca_Roseilinea_neighbors-labels.newick
  • Ca_Roseilinea_neighbors.png
  • Ca_Roseilinea_neighbors.pdf
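
Once a .newick file has been downloaded from the live Narrative, the genomes placed in the tree can be listed with a few lines of Python. This is a minimal sketch that assumes simple, unquoted leaf names; the example tree string, its names, and its branch lengths are invented and are not the contents of Ca_Roseilinea_neighbors.newick.

```python
import re

def leaf_names(newick):
    """Return leaf labels from a Newick string, in order of appearance.

    A leaf label directly follows "(" or ","; internal node labels follow
    ")" and are therefore skipped. Quoted labels are not handled.
    """
    return [name.strip() for name in re.findall(r"[(,]\s*([^(),:;]+)", newick)]

# Hypothetical tree used only to exercise the function.
example = "((genome_A:0.1,genome_B:0.2)0.95:0.05,genome_C:0.3);"
print(leaf_names(example))
```

For anything beyond listing leaves (rerooting, pruning, drawing), a dedicated tree library would be a better fit than a regular expression.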

[7] Genome Visualization

Generate a map and annotations of circular genomes using CGView.
This app completed without errors in 4m 10s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/59604
  • KBase_derived_Ca.RoseilNK_genome.png
  • KBase_derived_Ca.RoseilNK_genome.jpg
  • KBase_derived_Ca.RoseilNK_genome.svg

[8] Construct Pangenome

Allows users to compute a pangenome from a set of individual genomes.
This app completed without errors in 20m 22s.
Objects
Created Object Name                  Type        Description
Ca_Roseilinea_24genomes.Pangenome    Pangenome   Pangenome
Summary
Pangenome saved to joval:narrative_1586534178841/Ca_Roseilinea_24genomes.Pangenome
v1 - KBaseGenomes.Pangenome-4.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/59604
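
A pangenome groups genes into families and records which genomes contain each family. The Python sketch below shows one common way to summarize such a table. The labels core, accessory, and singleton follow general usage and may differ from the categories in the KBase viewer, and the family and genome identifiers are hypothetical.

```python
def classify_families(family_to_genomes, n_genomes):
    """Split gene families by how many genomes contain them.

    family_to_genomes maps a family id to the set of genome ids in which
    at least one member of the family occurs.
    """
    classes = {"core": [], "accessory": [], "singleton": []}
    for family in sorted(family_to_genomes):
        count = len(family_to_genomes[family])
        if count == n_genomes:
            classes["core"].append(family)       # present in every genome
        elif count == 1:
            classes["singleton"].append(family)  # present in exactly one genome
        else:
            classes["accessory"].append(family)  # present in some genomes
    return classes

# Hypothetical toy input with three genomes.
families = {
    "fam1": {"g1", "g2", "g3"},
    "fam2": {"g1", "g2"},
    "fam3": {"g3"},
    "fam4": {"g1", "g2", "g3"},
}
result = classify_families(families, n_genomes=3)
print(result)
```

With draft or metagenome-derived genomes, a family missing from one genome may reflect incomplete assembly rather than true absence, so a strict "present in all genomes" core definition can undercount.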

[9] Build Metabolic Model

Generate a draft metabolic model based on an annotated genome.
This app completed without errors in 1m 36s.
Objects
Created Object Name                    Type       Description
Ca.RoseilineaNK_metabolicmodel         FBAModel   FBAModel-12 Ca.RoseilineaNK_metabolicmodel
Ca.RoseilineaNK_metabolicmodel.gf.0    FBA        FBA-13 Ca.RoseilineaNK_metabolicmodel.gf.0
Report
Output from Build Metabolic Model
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/59604

[10] View Genome Function Profiles

Examine the distribution of functional gene families for organisms in a phylogenetic SpeciesTree.
This app was run five times, each completing without errors (2m 26s, 3m 22s, 2m 11s, 3m 32s, and 3m 1s).

Apps

  1. Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)
    • Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30: 2068-2069. doi:10.1093/bioinformatics/btu153
  2. Annotate Domains in a Genome
    • Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389-3402. doi:10.1093/nar/25.17.3389
    • Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
    • Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195
    • Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44: D279-D285. doi:10.1093/nar/gkv1344
    • Haft DH, Selengut JD, Richter RA, Harkins D, Basu MK, Beck E. TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res. 2013;41: D387-D395. doi:10.1093/nar/gks1234
    • Letunic I, Bork P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 2018;46: D493-D496. doi:10.1093/nar/gkx922
    • Letunic I, Doerks T, Bork P. SMART: recent updates, new developments and status in 2015. Nucleic Acids Res. 2015;43: D257-260. doi:10.1093/nar/gku949
    • Marchler-Bauer A, Bo Y, Han L, He J, Lanczycki CJ, Lu S, et al. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 2017;45: D200-D203. doi:10.1093/nar/gkw1129
    • Selengut JD, Haft DH, Davidsen T, Ganapathy A, Gwinn-Giglio M, Nelson WC, et al. TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 2007;35: D260-264. doi:10.1093/nar/gkl1043
    • Tatusov RL, Koonin EV, Lipman DJ. A Genomic Perspective on Protein Families. Science. 1997;278: 631-637. doi:10.1126/science.278.5338.631
  3. Annotate Domains in a GenomeSet
    • Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389-3402. doi:10.1093/nar/25.17.3389
    • Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
    • Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195
    • Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44: D279-D285. doi:10.1093/nar/gkv1344
    • Haft DH, Selengut JD, Richter RA, Harkins D, Basu MK, Beck E. TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res. 2013;41: D387-D395. doi:10.1093/nar/gks1234
    • Letunic I, Bork P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 2018;46: D493-D496. doi:10.1093/nar/gkx922
    • Letunic I, Doerks T, Bork P. SMART: recent updates, new developments and status in 2015. Nucleic Acids Res. 2015;43: D257-260. doi:10.1093/nar/gku949
    • Marchler-Bauer A, Bo Y, Han L, He J, Lanczycki CJ, Lu S, et al. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 2017;45: D200-D203. doi:10.1093/nar/gkw1129
    • Selengut JD, Haft DH, Davidsen T, Ganapathy A, Gwinn-Giglio M, Nelson WC, et al. TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 2007;35: D260-264. doi:10.1093/nar/gkl1043
    • Tatusov RL, Koonin EV, Lipman DJ. A Genomic Perspective on Protein Families. Science. 1997;278: 631-637. doi:10.1126/science.278.5338.631
  4. Annotate Multiple Microbial Assemblies
    • [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
    • [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226
    • [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365
    • [4] Kent WJ. BLAT - The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656-664. doi:10.1101/gr.229202
    • [5] Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
    • [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955-964.
    • [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793-803. doi:10.1007/s00792-012-0482-8
    • [8] Siguier P, Perochon J, Lestrade L, Mahillon J, Chandler M. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res. 2006;34: D32-D36. doi:10.1093/nar/gkj014
    • [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176-1179.
    • [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120
    • [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
    • [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673-679. doi:10.1093/bioinformatics/btm009
    • [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406
  5. Annotate Multiple Microbial Genomes
    • [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
    • [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226
    • [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365
    • [4] Kent WJ. BLAT - The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656-664. doi:10.1101/gr.229202
    • [5] Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
    • [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955-964.
    • [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793-803. doi:10.1007/s00792-012-0482-8
    • [8] Siguier P, Perochon J, Lestrade L, Mahillon J, Chandler M. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res. 2006;34: D32-D36. doi:10.1093/nar/gkj014
    • [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176-1179.
    • [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120
    • [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
    • [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673-679. doi:10.1093/bioinformatics/btm009
    • [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406
  6. Assess Genome Quality with CheckM - v1.0.18
    • Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043-1055. doi:10.1101/gr.186072.114
  7. Assess Quality of Assemblies with QUAST - v4.4
    • [1] Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013;29: 1072-1075. doi:10.1093/bioinformatics/btt086
    • [2] Mikheenko A, Valin G, Prjibelski A, Saveliev V, Gurevich A. Icarus: visualizer for de novo assembly evaluation. Bioinformatics. 2016;32: 3321-3323. doi:10.1093/bioinformatics/btw379
  8. Build Metabolic Model
    • [1] Henry CS, DeJongh M, Best AA, Frybarger PM, Linsay B, Stevens RL. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat Biotechnol. 2010;28: 977-982. doi:10.1038/nbt.1672
    • [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226
    • [3] Latendresse M. Efficiently gap-filling reaction networks. BMC Bioinformatics. 2014;15: 225. doi:10.1186/1471-2105-15-225
    • [4] Dreyfuss JM, Zucker JD, Hood HM, Ocasio LR, Sachs MS, Galagan JE. Reconstruction and Validation of a Genome-Scale Metabolic Model for the Filamentous Fungus Neurospora crassa Using FARM. PLOS Computational Biology. 2013;9: e1003126. doi:10.1371/journal.pcbi.1003126
    • [5] Mahadevan R, Schilling CH. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab Eng. 2003;5: 264-276.
  9. Circular Genome Visualization Tool
    no citations
  10. Compute ANI with FastANI
    • [1] Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High-throughput ANI Analysis of 90K Prokaryotic Genomes Reveals Clear Species Boundaries. 2017; doi:10.1101/225342
    • [2] Goris J, Konstantinidis KT, Klappenbach JA, Coenye T, Vandamme P, Tiedje JM. DNA-DNA hybridization values and their relationship to whole-genome sequence similarities. Int J Syst Evol Microbiol. 2007;57: 81-91. doi:10.1099/ijs.0.64483-0
  11. Compute Pangenome
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi:10.1038/nbt.4163
  12. Insert Genome Into SpeciesTree - v2.2.0
    • Price MN, Dehal PS, Arkin AP. FastTree 2 - Approximately Maximum-Likelihood Trees for Large Alignments. PLoS One. 2010;5. doi:10.1371/journal.pone.0009490
  13. View Function Profile for a Phylogenetic Tree - v1.4.0
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi:10.1038/nbt.4163