Generated March 31, 2022

Overview

KBase has powerful tools for metabolic modeling and comparative phylogenomics of microbial genomes that can be used for developing mechanistic understanding of functional interactions between species in microbial ecosystems. Essential to this process is obtaining high-quality genomes to annotate, either via cultivation or genome extraction from metagenome assembly. KBase has a suite of microbiome analysis Apps meant to be used in concert. After assembly and binning, high-quality bins are annotated and can then be used in Comparative Phylogenomics analyses (see Narrative here) and Metabolic Reconstruction and Community Interaction Modeling (see Narrative here).

Below we present the processing of a desert soil crust metagenome (4E) [Dynamic cyanobacterial response to hydration and dehydration in a desert biological soil crust; Isolation of a significant fraction of non-phototroph diversity from a desert biological soil crust; Linking soil biology and chemistry in biological soil crust using isolate exometabolomics].

Read Library from the SRA

Upload a data file (which may be compressed) from a web URL to your staging area.
This app completed without errors in 6m 48s.
Summary
Uploaded Files: 1 /SRR5855438.1
Import a FASTQ/SRA file into your Narrative as a Reads data object
This app completed without errors in 51m 13s.
Objects
Created Object Name  Type              Description
4E.Reads             PairedEndLibrary  Imported Reads
Trim paired- or single-end Illumina reads with Trimmomatic.
This app completed without errors in 1h 36m 9s.
Objects
Created Object Name         Type              Description
4E-trim.Reads_paired        PairedEndLibrary  Trimmed Reads
4E-trim.Reads_unpaired_fwd  SingleEndLibrary  Trimmed Unpaired Forward Reads
4E-trim.Reads_unpaired_rev  SingleEndLibrary  Trimmed Unpaired Reverse Reads
Assemble metagenomic reads using the SPAdes assembler.
This app completed without errors in 1d 1h 23m 31s.
Objects
Created Object Name    Type      Description
4E-metaSPAdes.contigs  Assembly  Assembled contigs
Summary
Assembly saved to: dylan:narrative_1589842716069/4E-metaSPAdes.contigs
Assembled into 17520 contigs.
Avg Length: 6806.597260273973 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  17431 -- 2000.0 to 127969.2 bp
     49 -- 127969.2 to 253938.4 bp
     17 -- 253938.4 to 379907.6 bp
      9 -- 379907.6 to 505876.8 bp
      5 -- 505876.8 to 631846.0 bp
      4 -- 631846.0 to 757815.2 bp
      2 -- 757815.2 to 883784.4 bp
      2 -- 883784.4 to 1009753.6 bp
      0 -- 1009753.6 to 1135722.8 bp
      1 -- 1135722.8 to 1261692.0 bp
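The ten length bins in the summary above look like equal-width intervals between the shortest and longest contig. The following is a minimal Python sketch under that assumption (the function name `equal_width_edges` is hypothetical and not part of any KBase App). It reproduces the metaSPAdes bin edges and checks that the bin counts and the average length are consistent with the reported contig count.

```python
def equal_width_edges(min_len, max_len, n_bins=10):
    """Return n_bins + 1 equally spaced bin edges from min_len to max_len."""
    width = (max_len - min_len) / n_bins
    return [min_len + i * width for i in range(n_bins + 1)]

# Shortest and longest contig reported for 4E-metaSPAdes.contigs.
edges = equal_width_edges(2000, 1261692)
for low, high in zip(edges, edges[1:]):
    print(f"{low:.1f} to {high:.1f} bp")

# Per-bin contig counts copied from the summary; they should sum to 17520.
bin_counts = [17431, 49, 17, 9, 5, 4, 2, 2, 0, 1]
n_contigs = sum(bin_counts)

# Average length times contig count recovers the total assembly length in bp.
total_bp = round(n_contigs * 6806.597260273973)
print(n_contigs, total_bp)
```

Because the bins are equal-width over a very skewed length distribution, nearly all contigs fall into the first bin; the counts in the upper bins are the more informative part of the summary.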
Assemble paired-end reads from single-cell or metagenomic sequencing technologies using the IDBA-UD assembler.
This app completed without errors in 17h 35m 54s.
Objects
Created Object Name  Type      Description
4E-IDBA.contigs      Assembly  Assembled contigs
Summary
Assembly saved to: dylan:narrative_1589842716069/4E-IDBA.contigs
Assembled into 9065 contigs.
Avg Length: 8660.55841147 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  8904 -- 2001.0 to 65523.3 bp
    92 -- 65523.3 to 129045.6 bp
    39 -- 129045.6 to 192567.9 bp
    21 -- 192567.9 to 256090.2 bp
     1 -- 256090.2 to 319612.5 bp
     2 -- 319612.5 to 383134.8 bp
     5 -- 383134.8 to 446657.1 bp
     0 -- 446657.1 to 510179.4 bp
     0 -- 510179.4 to 573701.7 bp
     1 -- 573701.7 to 637224.0 bp
Assemble metagenomic reads using the MEGAHIT assembler.
This app completed without errors in 19h 20m 24s.
Objects
Created Object Name        Type      Description
4E-MEGAHIT-large.assembly  Assembly  Assembled contigs
Summary
ContigSet saved to: dylan:narrative_1589842716069/4E-MEGAHIT-large.assembly
Assembled into 20513 contigs.
Avg Length: 6061.826841515137 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  20438 -- 2000.0 to 115455.4 bp
     37 -- 115455.4 to 228910.8 bp
     16 -- 228910.8 to 342366.19999999995 bp
      9 -- 342366.19999999995 to 455821.6 bp
      7 -- 455821.6 to 569277.0 bp
      2 -- 569277.0 to 682732.3999999999 bp
      3 -- 682732.3999999999 to 796187.7999999999 bp
      0 -- 796187.7999999999 to 909643.2 bp
      0 -- 909643.2 to 1023098.6 bp
      1 -- 1023098.6 to 1136554.0 bp
Assemble metagenomic reads using the MEGAHIT assembler.
This app completed without errors in 1d 1h 59m 3s.
Objects
Created Object Name            Type      Description
4E-MEGAHIT-sensitive.assembly  Assembly  Assembled contigs
Summary
ContigSet saved to: dylan:narrative_1589842716069/4E-MEGAHIT-sensitive.assembly
Assembled into 19947 contigs.
Avg Length: 6323.878578232316 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  19870 -- 2000.0 to 115507.4 bp
     39 -- 115507.4 to 229014.8 bp
     15 -- 229014.8 to 342522.19999999995 bp
     12 -- 342522.19999999995 to 456029.6 bp
      6 -- 456029.6 to 569537.0 bp
      2 -- 569537.0 to 683044.3999999999 bp
      2 -- 683044.3999999999 to 796551.7999999999 bp
      0 -- 796551.7999999999 to 910059.2 bp
      0 -- 910059.2 to 1023566.6 bp
      1 -- 1023566.6 to 1137074.0 bp
View distributions of contig characteristics for different assemblies.
This app completed without errors in 4m 41s.
Summary
ASSEMBLY STATS for 4E-metaSPAdes.contigs
  Len longest contig: 1261692 bp
  N50 (L50): 17370 (900)
  N75 (L75): 3505 (5786)
  N90 (L90): 2404 (12059)
  Num contigs >= 1000000 bp: 1
  Num contigs >= 100000 bp: 120
  Num contigs >= 10000 bp: 1463
  Num contigs >= 1000 bp: 17520
  Num contigs >= 500 bp: 17520
  Num contigs >= 1 bp: 17520
  Len contigs >= 1000000 bp: 1261692 bp
  Len contigs >= 100000 bp: 31736947 bp
  Len contigs >= 10000 bp: 66930628 bp
  Len contigs >= 1000 bp: 119251584 bp
  Len contigs >= 500 bp: 119251584 bp
  Len contigs >= 1 bp: 119251584 bp

ASSEMBLY STATS for 4E-IDBA.contigs
  Len longest contig: 637224 bp
  N50 (L50): 19385 (663)
  N75 (L75): 5365 (2782)
  N90 (L90): 2915 (5841)
  Num contigs >= 1000000 bp: 0
  Num contigs >= 100000 bp: 87
  Num contigs >= 10000 bp: 1344
  Num contigs >= 1000 bp: 9065
  Num contigs >= 500 bp: 9065
  Num contigs >= 1 bp: 9065
  Len contigs >= 1000000 bp: 0 bp
  Len contigs >= 100000 bp: 16577896 bp
  Len contigs >= 10000 bp: 48613915 bp
  Len contigs >= 1000 bp: 78507962 bp
  Len contigs >= 500 bp: 78507962 bp
  Len contigs >= 1 bp: 78507962 bp

ASSEMBLY STATS for 4E-MEGAHIT-sensitive.assembly
  Len longest contig: 1137074 bp
  N50 (L50): 11458 (1499)
  N75 (L75): 3389 (7349)
  N90 (L90): 2375 (14141)
  Num contigs >= 1000000 bp: 1
  Num contigs >= 100000 bp: 93
  Num contigs >= 10000 bp: 1756
  Num contigs >= 1000 bp: 19947
  Num contigs >= 500 bp: 19947
  Num contigs >= 1 bp: 19947
  Len contigs >= 1000000 bp: 1137074 bp
  Len contigs >= 100000 bp: 23922080 bp
  Len contigs >= 10000 bp: 65822607 bp
  Len contigs >= 1000 bp: 126142406 bp
  Len contigs >= 500 bp: 126142406 bp
  Len contigs >= 1 bp: 126142406 bp

ASSEMBLY STATS for 4E-MEGAHIT-large.assembly
  Len longest contig: 1136554 bp
  N50 (L50): 9746 (1725)
  N75 (L75): 3274 (7958)
  N90 (L90): 2357 (14766)
  Num contigs >= 1000000 bp: 1
  Num contigs >= 100000 bp: 92
  Num contigs >= 10000 bp: 1669
  Num contigs >= 1000 bp: 20513
  Num contigs >= 500 bp: 20513
  Num contigs >= 1 bp: 20513
  Len contigs >= 1000000 bp: 1136554 bp
  Len contigs >= 100000 bp: 23766975 bp
  Len contigs >= 10000 bp: 61630462 bp
  Len contigs >= 1000 bp: 124346254 bp
  Len contigs >= 500 bp: 124346254 bp
  Len contigs >= 1 bp: 124346254 bp
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • key_plot.png
  • key_plot.pdf
  • cumulative_len_plot.png
  • cumulative_len_plot.pdf
  • sorted_contig_lengths.png
  • sorted_contig_lengths.pdf
  • histogram_figures.zip
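For readers unfamiliar with the N50 (L50) notation in the assembly statistics above, the sketch below uses the conventional definition: sort contigs from longest to shortest and accumulate lengths until a given fraction of the total assembly length is reached. The length of the contig that crosses the threshold is Nx, and the number of contigs needed is Lx. This is an illustration only; the App's own implementation is not shown in this Narrative, `nx_stats` is a hypothetical name, and the contig lengths are made-up toy values.

```python
def nx_stats(lengths, fraction=0.5):
    """Return (Nx, Lx) for the given fraction of the total assembly length.

    Nx is the length of the contig at which the running sum of the longest
    contigs first reaches fraction * total; Lx is how many contigs that took.
    """
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    raise ValueError("lengths must be non-empty")

# Made-up contig lengths (total 300 bp), not taken from the 4E assemblies.
toy_lengths = [100, 80, 60, 40, 20]
n50, l50 = nx_stats(toy_lengths, 0.5)
n90, l90 = nx_stats(toy_lengths, 0.9)
print(f"N50 (L50): {n50} ({l50})")
print(f"N90 (L90): {n90} ({l90})")
```

A larger N50 with a smaller L50 means that fewer, longer contigs cover half of the assembly, which is why the two numbers are reported together.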
Group assembled metagenomic contigs into lineages (Bins) using depth-of-coverage, nucleotide composition, and marker genes.
This app completed without errors in 1h 32m 60s.
Objects
Created Object Name                           Type           Description
4E-metaSPAdes-MaxBin2-40markers.BinedContigs  BinnedContigs  BinnedContigs from MaxBin2
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • maxbin_result.zip - File(s) generated by MaxBin2 App
  • Bin.marker.pdf - Visualization of the marker by MaxBin2 App
Output from Bin Contigs using MaxBin2 - v2.2.4
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/62384
Bin metagenomic contigs
This app completed without errors in 1h 27m 44s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • metabat_result.zip - Files generated by MetaBAT2 App
Output from MetaBAT2 Contig Binning - v1.7
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/62384
Group assembled metagenomic contigs into lineages (Bins) using depth-of-coverage and nucleotide composition
This app completed without errors in 3h 6m 32s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • concoct_result.zip - Files generated by CONCOCT App
v1 - KBaseMetagenomes.BinnedContigs-1.0
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/62384
Optimize bacterial or archaeal genome bins using a dereplication, aggregation and scoring strategy
This app completed without errors in 13m 30s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • das_tool_result.zip - Files generated by kb_das_tool App
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/62384
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes. Creates a new BinnedContigs object with High Quality bins that pass user-defined thresholds for Completeness and Contamination.
This app completed without errors in 29m 32s.
Objects
Created Object Name                                                          Type           Description
4E-metaSPAdes-MaxBin2_MetaBAT2_CONCOCT-DASTool-CheckM_HQ_90-5.BinnedContigs  BinnedContigs  HQ BinnedContigs 4E-metaSPAdes-MaxBin2_MetaBAT2_CONCOCT-DASTool-CheckM_HQ_90-5.BinnedContigs
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
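The filtering step can be pictured as a threshold test on each bin. The sketch below assumes that the "HQ_90-5" suffix in the output object name means completeness of at least 90% and contamination of at most 5%, and that the comparisons are inclusive; neither is stated explicitly in this Narrative. The column names, the function name `filter_hq_bins`, and the three example rows are hypothetical and do not reflect the real layout of CheckM_summary_table.tsv.

```python
import csv
import io

def filter_hq_bins(tsv_text, min_completeness=90.0, max_contamination=5.0):
    """Return the names of bins that pass both quality thresholds."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row["bin"]
        for row in reader
        if float(row["completeness"]) >= min_completeness
        and float(row["contamination"]) <= max_contamination
    ]

# Made-up table: hypothetical bin names and scores, for illustration only.
example_tsv = (
    "bin\tcompleteness\tcontamination\n"
    "bin_a\t97.5\t1.2\n"   # passes both thresholds
    "bin_b\t88.0\t0.5\n"   # fails completeness
    "bin_c\t95.1\t7.3\n"   # fails contamination
)
hq_bins = filter_hq_bins(example_tsv)
print(hq_bins)
```

Only bins that pass both tests are carried forward, so stricter thresholds trade the number of recovered genomes for higher confidence in each one.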
Extract a bin as an Assembly from a BinnedContig dataset
This app completed without errors in 6m 13s.
Objects
Created Object Name                                                         Type         Description
4E-metaSPAdes-MaxBin2_MetaBAT2_CONCOCT_BowTie2_DASTool-HQ_90-5.AssemblySet  AssemblySet  Assembly set of extracted assemblies
Bin.002.fasta_assembly                                                      Assembly     Assembly object of extracted contigs
Bin.011.fasta_assembly                                                      Assembly     Assembly object of extracted contigs
Bin.008.fasta_assembly                                                      Assembly     Assembly object of extracted contigs
Bin.009.fasta_assembly                                                      Assembly     Assembly object of extracted contigs
Bin.015.fasta_assembly                                                      Assembly     Assembly object of extracted contigs
Bin.003.fasta_assembly                                                      Assembly     Assembly object of extracted contigs
Bin.013.fasta_assembly                                                      Assembly     Assembly object of extracted contigs
Bin.010.fasta_assembly                                                      Assembly     Assembly object of extracted contigs
Summary
Job Finished
Generated Assembly Reference: 62384/251/1, 62384/252/1, 62384/253/1, 62384/254/1, 62384/255/1, 62384/256/1, 62384/151/2, 62384/257/1
Generated Assembly Set: 62384/258/1
Annotate bacterial or archaeal assemblies and/or assembly sets using RASTtk (Rapid Annotations using Subsystems Technology toolkit).
This app completed without errors in 39m 12s.
Objects
Created Object Name                      Type       Description
Bin.010.fasta_assembly.RAST              Genome     Annotated genome
Bin.003.fasta_assembly.RAST              Genome     Annotated genome
Bin.015.fasta_assembly.RAST              Genome     Annotated genome
Bin.013.fasta_assembly.RAST              Genome     Annotated genome
Bin.009.fasta_assembly.RAST              Genome     Annotated genome
Bin.008.fasta_assembly.RAST              Genome     Annotated genome
Bin.002.fasta_assembly.RAST              Genome     Annotated genome
Bin.011.fasta_assembly.RAST              Genome     Annotated genome
4E-metaSPAdes-DASTool-HQ_90-5.GenomeSet  GenomeSet  Genome Set
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 94 contigs containing 5579351 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 6001 new features were called, of which 181 are non-coding.
Output genome has the following feature types:
	Coding gene                     5820 
	Non-coding repeat                153 
	Non-coding rna                    28 
Overall, the genes have 2632 distinct functions. 
The genes include 2321 genes with a SEED annotation ontology across 1292 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.010.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 15 contigs containing 3020098 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3278 new features were called, of which 63 are non-coding.
Output genome has the following feature types:
	Coding gene                     3215 
	Non-coding repeat                 39 
	Non-coding rna                    24 
Overall, the genes have 2204 distinct functions. 
The genes include 1374 genes with a SEED annotation ontology across 1135 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.003.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 12 contigs containing 4284298 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 4202 new features were called, of which 44 are non-coding.
Output genome has the following feature types:
	Coding gene                     4158 
	Non-coding repeat                  8 
	Non-coding rna                    36 
Overall, the genes have 1585 distinct functions. 
The genes include 1660 genes with a SEED annotation ontology across 857 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.015.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 26 contigs containing 5937821 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 5672 new features were called, of which 54 are non-coding.
Output genome has the following feature types:
	Coding gene                     5618 
	Non-coding repeat                 27 
	Non-coding rna                    27 
Overall, the genes have 2649 distinct functions. 
The genes include 1991 genes with a SEED annotation ontology across 1201 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.013.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 279 contigs containing 2884547 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3178 new features were called, of which 35 are non-coding.
Output genome has the following feature types:
	Coding gene                     3143 
	Non-coding repeat                  2 
	Non-coding rna                    33 
Overall, the genes have 1555 distinct functions. 
The genes include 1515 genes with a SEED annotation ontology across 880 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.009.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 184 contigs containing 3621338 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3655 new features were called, of which 54 are non-coding.
Output genome has the following feature types:
	Coding gene                     3601 
	Non-coding repeat                 32 
	Non-coding rna                    22 
Overall, the genes have 1577 distinct functions. 
The genes include 1528 genes with a SEED annotation ontology across 862 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.008.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 54 contigs containing 5926646 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 5937 new features were called, of which 195 are non-coding.
Output genome has the following feature types:
	Coding gene                     5742 
	Non-coding crispr_array            3 
	Non-coding crispr_repeat          33 
	Non-coding crispr_spacer          30 
	Non-coding repeat                 91 
	Non-coding rna                    38 
Overall, the genes have 2634 distinct functions. 
The genes include 2146 genes with a SEED annotation ontology across 1214 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.002.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 41 contigs containing 5226120 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 4911 new features were called, of which 52 are non-coding.
Output genome has the following feature types:
	Coding gene                     4859 
	Non-coding repeat                 27 
	Non-coding rna                    25 
Overall, the genes have 2114 distinct functions. 
The genes include 2043 genes with a SEED annotation ontology across 1069 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.011.fasta_assembly succeeded!
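Each annotation summary above reports a total number of newly called features, the number of those that are non-coding, and a per-type breakdown. The sketch below checks that these three pieces agree for one summary block. The regular expressions are written against the text as it appears in this Narrative and are an assumption about its layout, not an official RASTtk output format; `check_rast_summary` is a hypothetical helper name.

```python
import re

# Patterns matched against the summary text shown in this Narrative
# (an assumed layout, not a documented RASTtk schema).
CALLED_RE = re.compile(
    r"(\d+) new features were called, of which (\d+) are non-coding"
)
FEATURE_RE = re.compile(
    r"^[ \t]*(Coding|Non-coding)[ \t]+(\S+)[ \t]+(\d+)[ \t]*$", re.MULTILINE
)

def check_rast_summary(text):
    """Return True if the per-type counts add up to the reported totals."""
    total, noncoding = (int(n) for n in CALLED_RE.search(text).groups())
    rows = [(kind, int(count)) for kind, _name, count in FEATURE_RE.findall(text)]
    type_total = sum(count for _kind, count in rows)
    type_noncoding = sum(count for kind, count in rows if kind == "Non-coding")
    return type_total == total and type_noncoding == noncoding

# Excerpt of the Bin.010.fasta_assembly summary from this Narrative.
sample = """In addition to the remaining original 0 coding features and 0 non-coding features, 6001 new features were called, of which 181 are non-coding.
Output genome has the following feature types:
\tCoding gene                     5820
\tNon-coding repeat                153
\tNon-coding rna                    28
"""
print(check_rast_summary(sample))
```

For this excerpt, 5820 + 153 + 28 = 6001 features in total and 153 + 28 = 181 non-coding features, which matches the reported totals.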

Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • annotation_report.4E-metaSPAdes-DASTool-HQ_90-5.GenomeSet - Microbial Annotation Report
v1 - KBaseSearch.GenomeSet-2.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/62384
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB) ver R06-RS202
This app completed without errors in 2h 29m 8s.
Objects
Created Object Name                      Type       Description
Bin.013.fasta_assembly.RAST              Genome     Taxonomy and taxon_assignment updated with GTDB
Bin.010.fasta_assembly.RAST              Genome     Taxonomy and taxon_assignment updated with GTDB
Bin.003.fasta_assembly.RAST              Genome     Taxonomy and taxon_assignment updated with GTDB
Bin.015.fasta_assembly.RAST              Genome     Taxonomy and taxon_assignment updated with GTDB
Bin.009.fasta_assembly.RAST              Genome     Taxonomy and taxon_assignment updated with GTDB
Bin.008.fasta_assembly.RAST              Genome     Taxonomy and taxon_assignment updated with GTDB
Bin.002.fasta_assembly.RAST              Genome     Taxonomy and taxon_assignment updated with GTDB
Bin.011.fasta_assembly.RAST              Genome     Taxonomy and taxon_assignment updated with GTDB
4E-metaSPAdes-DASTool-HQ_90-5.GenomeSet  GenomeSet  Taxonomy and taxon_assignment updated with GTDB
Add one or more Genomes to a KBase SpeciesTree.
This app completed without errors in 44m 38s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • Bin.002.Five_RefSeq_prox.SpeciesTree.newick
  • Bin.002.Five_RefSeq_prox.SpeciesTree-labels.newick
  • Bin.002.Five_RefSeq_prox.SpeciesTree.png
  • Bin.002.Five_RefSeq_prox.SpeciesTree.pdf
Add one or more Genomes to a KBase SpeciesTree.
This app completed without errors in 34m 16s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • Bin.003.Five_RefSeq_prox.SpeciesTree.newick
  • Bin.003.Five_RefSeq_prox.SpeciesTree-labels.newick
  • Bin.003.Five_RefSeq_prox.SpeciesTree.png
  • Bin.003.Five_RefSeq_prox.SpeciesTree.pdf
Add one or more Genomes to a KBase SpeciesTree.
This app completed without errors in 42m 11s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • Bin.008.Five_RefSeq_prox.SpeciesTree.newick
  • Bin.008.Five_RefSeq_prox.SpeciesTree-labels.newick
  • Bin.008.Five_RefSeq_prox.SpeciesTree.png
  • Bin.008.Five_RefSeq_prox.SpeciesTree.pdf
Add one or more Genomes to a KBase SpeciesTree.
This app completed without errors in 43m 5s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • Bin.009.Five_RefSeq_prox.SpeciesTree.newick
  • Bin.009.Five_RefSeq_prox.SpeciesTree-labels.newick
  • Bin.009.Five_RefSeq_prox.SpeciesTree.png
  • Bin.009.Five_RefSeq_prox.SpeciesTree.pdf
Add one or more Genomes to a KBase SpeciesTree.
This app completed without errors in 42m 56s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • Bin.010.Five_RefSeq_prox.SpeciesTree.newick
  • Bin.010.Five_RefSeq_prox.SpeciesTree-labels.newick
  • Bin.010.Five_RefSeq_prox.SpeciesTree.png
  • Bin.010.Five_RefSeq_prox.SpeciesTree.pdf
Add one or more Genomes to a KBase SpeciesTree.
This app completed without errors in 29m 2s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • Bin.011.Five_RefSeq_prox.SpeciesTree.newick
  • Bin.011.Five_RefSeq_prox.SpeciesTree-labels.newick
  • Bin.011.Five_RefSeq_prox.SpeciesTree.png
  • Bin.011.Five_RefSeq_prox.SpeciesTree.pdf
Add one or more Genomes to a KBase SpeciesTree.
This app completed without errors in 39m 15s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • Bin.013.Five_RefSeq_prox.SpeciesTree.newick
  • Bin.013.Five_RefSeq_prox.SpeciesTree-labels.newick
  • Bin.013.Five_RefSeq_prox.SpeciesTree.png
  • Bin.013.Five_RefSeq_prox.SpeciesTree.pdf
Add one or more Genomes to a KBase SpeciesTree.
This app completed without errors in 35m 2s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • Bin.015.Five_RefSeq_prox.SpeciesTree.newick
  • Bin.015.Five_RefSeq_prox.SpeciesTree-labels.newick
  • Bin.015.Five_RefSeq_prox.SpeciesTree.png
  • Bin.015.Five_RefSeq_prox.SpeciesTree.pdf
Use this App to combine multiple GenomeSets into a single consolidated set.
This app completed without errors in 6m 40s.
Objects
Created Object Name                              Type       Description
Bin.002-015_Five_RefSeq_prox-plusBins.GenomeSet  GenomeSet  KButil_Merge_GenomeSets
Summary
genomes in output set Bin.002-015_Five_RefSeq_prox-plusBins.GenomeSet: 45
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/62384
Allows user to remove Genome(s) from a GenomeSet
This app completed without errors in 9m 26s.
Objects
Created Object Name                               Type       Description
Bin.002-015_Five_RefSeq_prox-minusBins.GenomeSet  GenomeSet  KButil_Remove_Genomes_from_GenomeSet
Summary
genomes in output set Bin.002-015_Five_RefSeq_prox-minusBins.GenomeSet: 37
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/62384
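The two GenomeSet steps above behave like set operations: merging is a union, and removing genomes is a set difference. The reported counts are consistent with this picture, since 45 merged genomes minus the 8 annotated bins leaves 37. The sketch below illustrates the idea with plain Python sets. It assumes that a genome shared by several input sets is kept only once after merging, which this Narrative does not state; the helper names and genome identifiers are hypothetical.

```python
def merge_genome_sets(*genome_sets):
    """Union of several genome collections; shared genomes are kept once."""
    merged = set()
    for genome_set in genome_sets:
        merged |= set(genome_set)
    return merged

def remove_genomes(genome_set, to_remove):
    """Return the genome collection without the listed genomes."""
    return set(genome_set) - set(to_remove)

# Made-up identifiers: two per-bin sets that share one reference genome.
set_one = {"bin_x", "ref_1", "ref_2"}
set_two = {"bin_y", "ref_2", "ref_3"}

merged = merge_genome_sets(set_one, set_two)
references_only = remove_genomes(merged, {"bin_x", "bin_y"})
print(len(merged), len(references_only))

# Counts reported in this Narrative: 45 merged genomes, 8 bins removed.
print(45 - 8)
```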
Build Species Tree for your Microbial Genomes, optionally including Tree Skeleton of Phylum Exemplars
This app completed without errors in 3h 39m 47s.
Objects
Created Object Name                                                Type  Description
4E-Bins.002-015_HQ_90-5_plusRefSeqProximals_plusPhyla.SpeciesTree  Tree  Moab Desert Crust sample 4E - Species Tree + Skeleton + RefSeq Proximals
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • 4E-Bins.002-015_HQ_90-5_plusRefSeqProximals_plusPhyla.SpeciesTree.newick
  • 4E-Bins.002-015_HQ_90-5_plusRefSeqProximals_plusPhyla.SpeciesTree-labels.newick
  • 4E-Bins.002-015_HQ_90-5_plusRefSeqProximals_plusPhyla.SpeciesTree.png
  • 4E-Bins.002-015_HQ_90-5_plusRefSeqProximals_plusPhyla.SpeciesTree.pdf
Annotate your genome(s) with DRAM. Annotations will then be distilled to create an interactive functional summary per genome.
This app completed without errors in 5h 46m 37s.
Summary
Here are the results from your DRAM run.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • annotations.tsv - DRAM annotations in a tab-separated table format
  • genes.faa - Genes as amino acids predicted by DRAM with brief annotations
  • product.tsv - DRAM product in tabular format
  • metabolism_summary.xlsx - DRAM metabolism summary tables
  • genome_stats.tsv - DRAM genome statistics table
View Genome summaries within a GenomeSet
This app completed without errors in 25m 10s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/62384
  • GenomeSet_summary.tsv

Released Apps

  1. Annotate Multiple Microbial Assemblies with RASTtk - v1.073
    • [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
    • [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226
    • [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365
    • [4] Kent WJ. BLAT - The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656-664. doi:10.1101/gr.229202
    • [5] Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389-3402. doi:10.1093/nar/25.17.3389
    • [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955-964.
    • [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793-803. doi:10.1007/s00792-012-0482-8
    • [8] Meyer F, Overbeek R, Rodriguez A. FIGfams: yet another set of protein families. Nucleic Acids Res. 2009;37: 6643-54. doi:10.1093/nar/gkp698
    • [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176-1179.
    • [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120
    • [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
    • [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673-679. doi:10.1093/bioinformatics/btm009
    • [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406
  2. Assemble Reads with IDBA-UD - v1.1.3
    • Peng Y, Leung HCM, Yiu SM, Chin FYL. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics. 2012;28: 1420-1428. doi:10.1093/bioinformatics/bts174
  3. Assemble Reads with MEGAHIT v1.2.9
    • Li D, Liu C-M, Luo R, Sadakane K, Lam T-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31: 1674-1676. doi:10.1093/bioinformatics/btv033
  4. Assemble Reads with metaSPAdes - v3.15.3
    • Nurk S, Meleshko D, Korobeynikov A, Pevzner PA. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 2017;27: 824-834. doi:10.1101/gr.213959.116
    • Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. Using SPAdes De Novo Assembler. Curr Protoc Bioinformatics. 2020 Jun;70(1):e102. doi: 10.1002/cpbi.102.
  5. Classify Microbes with GTDB-Tk - v1.7.0
    • Pierre-Alain Chaumeil, Aaron J Mussig, Philip Hugenholtz, Donovan H Parks, GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database, Bioinformatics, Volume 36, Issue 6, 15 March 2020, Pages 1925-1927. DOI: https://doi.org/10.1093/bioinformatics/btz848
    • Parks, D., Chuvochina, M., Waite, D. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36, 996-1004 (2018). DOI: https://doi.org/10.1038/nbt.4229
    • Parks DH, Chuvochina M, Chaumeil PA, Rinke C, Mussig AJ, Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol. 2020;10.1038/s41587-020-0501-8. DOI:10.1038/s41587-020-0501-8
    • Rinke C, Chuvochina M, Mussig AJ, Chaumeil PA, Davín AA, Waite DW, Whitman WB, Parks DH, and Hugenholtz P. A standardized archaeal taxonomy for the Genome Taxonomy Database. Nat Microbiol. 2021 Jul;6(7):946-959. DOI:10.1038/s41564-021-00918-8
    • Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics. 2010;11:538. Published 2010 Oct 30. doi:10.1186/1471-2105-11-538
    • Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9(1):5114. Published 2018 Nov 30. DOI:10.1038/s41467-018-07641-9
    • Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119. Published 2010 Mar 8. DOI:10.1186/1471-2105-11-119
    • Price MN, Dehal PS, Arkin AP. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5(3):e9490. Published 2010 Mar 10. DOI:10.1371/journal.pone.0009490 link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2835736/
    • Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol. 2011;7(10):e1002195. DOI:10.1371/journal.pcbi.1002195
  6. Compare Assembled Contig Distributions - v1.1.2
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  7. Extract Bins as Assemblies from BinnedContigs - v1.0.2
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  8. Filter Bins by Quality with CheckM - v1.0.18
    • Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043-1055. doi:10.1101/gr.186072.114
    • CheckM source:
    • Additional info:
  9. Insert Genome Into SpeciesTree - v2.2.0
    • Price MN, Dehal PS, Arkin AP. FastTree 2 - Approximately Maximum-Likelihood Trees for Large Alignments. PLoS One. 2010;5: e9490. doi:10.1371/journal.pone.0009490
  10. Merge GenomeSets - v1.7.4
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  11. Optimize Bacterial or Archaeal Binned Contigs using DAS Tool - v1.1.2
    • Sieber CMK, Probst AJ, Sharrar A, Thomas BC, Hess M, Tringe SG, Banfield JF. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nature Microbiology. 2018;3(7): 836-843. doi:10.1038/s41564-018-0171-1
    • DAS_Tool source:
    • Hyatt D, Chen G-L, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
    • Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
    • Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nature Methods. 2015;12: 59-60. doi:10.1038/nmeth.3176
    • Pullseq:
    • R: A Language and Environment for Statistical Computing:
    • Ruby: A Programmer's Best Friend:
  12. Remove Genomes from GenomeSet - v1.5.0
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  13. Trim Reads with Trimmomatic - v0.36
    • Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30: 2114-2120. doi:10.1093/bioinformatics/btu170
  14. Upload File to Staging from Web - v1.0.12
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163

Apps in Beta

  1. Annotate and Distill Genomes with DRAM
    • DRAM source code
    • DRAM documentation
    • DRAM publication
  2. Bin Contigs using CONCOCT - v1.1
    • Alneberg J, Bjarnason BS, de Bruijn I, Schirmer M, Quick J, Ijaz UZ, Lahti L, Loman NJ, Andersson AF, Quince C. Binning metagenomic contigs by coverage and composition. Nature Methods. 2014;11: 1144-1146. doi:10.1038/nmeth.3103
    • CONCOCT source:
  3. Bin Contigs using MaxBin2 - v2.2.4
    • Wu Y-W, Simmons BA, Singer SW. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics. 2016;32: 605-607. doi:10.1093/bioinformatics/btv638
    • Wu Y-W, Tang Y-H, Tringe SG, Simmons BA, Singer SW. MaxBin: an automated binning method to recover individual genomes from metagenomes using an expectation-maximization algorithm. Microbiome. 2014;2: 26. doi:10.1186/2049-2618-2-26
    • Maxbin2 source:
    • Maxbin source:
  4. Build Microbial SpeciesTree - v1.6.0
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  5. Import FASTQ/SRA File as Reads from Staging Area
    no citations
  6. MetaBAT2 Contig Binning - v1.7
    • Kang DD, Froula J, Egan R, Wang Z. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ. 2015;3: e1165. doi:10.7717/peerj.1165
    • MetaBAT2 source:
  7. Summarize GenomeSet - v1.8.0
    no citations