Generated March 28, 2024
# Welcome to the Narrative
from IPython.display import IFrame
IFrame("https://www.kbase.us/narrative-welcome-cell/", width="100%", height="300px")
Import a FASTQ/SRA file into your Narrative as a Reads data object
This app completed without errors in 8h 12m 11s.
Objects
| Created Object Name | Type | Description |
| --- | --- | --- |
| 52512.1.364815.GACTATGC-GACTATGC.filter-METAGENOME.fastq.gz_reads | PairedEndLibrary | Imported Reads |
A quality control application for high throughput sequence data.
This app completed without errors in 4h 10m 3s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/100341
  • 52512.1.364815.GACTATGC-GACTATGC.filter-METAGENOME.fastq.gz_reads_100341_2_2.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • 52512.1.364815.GACTATGC-GACTATGC.filter-METAGENOME.fastq.gz_reads_100341_2_2.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
Trim paired- or single-end Illumina reads with Trimmomatic.
This app completed without errors in 18h 58m 33s.
Objects
| Created Object Name | Type | Description |
| --- | --- | --- |
| T2_7F_trimmed_reads_paired | PairedEndLibrary | Trimmed Reads |
| T2_7F_trimmed_reads_unpaired_fwd | SingleEndLibrary | Trimmed Unpaired Forward Reads |
A quality control application for high throughput sequence data.
This app completed without errors in 5h 11m 17s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/100341
  • 7F_trimmed_reads_paired_100341_6_1.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • 7F_trimmed_reads_paired_100341_6_1.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
Allows users to perform taxonomic classification of shotgun metagenomic read data with Kaiju.
This app completed without errors in 5h 35m 28s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/100341
  • kaiju_classifications.zip
  • kaiju_summaries.zip
  • krona_data.zip
  • stacked_bar_abundance_plots_PNG+PDF.zip
v1 - KBaseFile.PairedEndLibrary-2.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/100341
Assemble metagenomic reads using the MEGAHIT assembler.
This app completed without errors in 1d 16h 9m 36s.
Objects
| Created Object Name | Type | Description |
| --- | --- | --- |
| 7FMEGAHIT.assembly | Assembly | Assembled contigs |
Summary
ContigSet saved to: asengupta6:narrative_1633670037388/7FMEGAHIT.assembly
Assembled into 166543 contigs.
Avg Length: 3954.611331608053 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  165897 -- 2000.0 to 32297.0 bp
  508 -- 32297.0 to 62594.0 bp
  99 -- 62594.0 to 92891.0 bp
  28 -- 92891.0 to 123188.0 bp
  7 -- 123188.0 to 153485.0 bp
  0 -- 153485.0 to 183782.0 bp
  0 -- 183782.0 to 214079.0 bp
  2 -- 214079.0 to 244376.0 bp
  1 -- 244376.0 to 274673.0 bp
  1 -- 274673.0 to 304970.0 bp
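As a quick sanity check on the MEGAHIT summary, the histogram counts should add up to the reported contig total, and each length bin should start where the previous one ends. The following is a minimal sketch of that check; the variable and function names are hypothetical, and the numbers are copied from the summary.

```python
# Histogram copied from the MEGAHIT summary: (contig count, min bp, max bp).
length_bins = [
    (165897, 2000.0, 32297.0),
    (508, 32297.0, 62594.0),
    (99, 62594.0, 92891.0),
    (28, 92891.0, 123188.0),
    (7, 123188.0, 153485.0),
    (0, 153485.0, 183782.0),
    (0, 183782.0, 214079.0),
    (2, 214079.0, 244376.0),
    (1, 244376.0, 274673.0),
    (1, 274673.0, 304970.0),
]
reported_contigs = 166543  # "Assembled into 166543 contigs."


def total_contigs(bins):
    """Sum the contig counts over all histogram bins."""
    return sum(count for count, _low, _high in bins)


def bins_are_contiguous(bins):
    """Check that each bin's upper edge equals the next bin's lower edge."""
    return all(bins[i][2] == bins[i + 1][1] for i in range(len(bins) - 1))


def fraction_in_first_bin(bins):
    """Share of contigs that fall into the shortest length bin."""
    return bins[0][0] / total_contigs(bins)


print(total_contigs(length_bins) == reported_contigs)  # True
print(bins_are_contiguous(length_bins))                # True
print(f"{fraction_in_first_bin(length_bins):.2%} of contigs are in the shortest bin")
```

The check shows that the assembly is dominated by short contigs: all but a few hundred fall into the first bin.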
Run QUAST (QUality ASsessment Tool) on a set of Assemblies to assess their quality.
This app completed without errors in 3m 43s.
Summary
All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

| Assembly | 7FMEGAHIT.assembly |
| --- | --- |
| # contigs (>= 0 bp) | 166543 |
| # contigs (>= 1000 bp) | 166543 |
| # contigs (>= 10000 bp) | 6528 |
| # contigs (>= 100000 bp) | 25 |
| # contigs (>= 1000000 bp) | 0 |
| Total length (>= 0 bp) | 658612835 |
| Total length (>= 1000 bp) | 658612835 |
| Total length (>= 10000 bp) | 123845579 |
| Total length (>= 100000 bp) | 3540003 |
| Total length (>= 1000000 bp) | 0 |
| # contigs | 166543 |
| Largest contig | 304970 |
| Total length | 658612835 |
| GC (%) | 60.97 |
| N50 | 3974 |
| N75 | 2655 |
| L50 | 42609 |
| L75 | 94334 |
| # N's per 100 kbp | 0.00 |
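For readers unfamiliar with the N50/L50 and N75/L75 rows, the sketch below shows how such statistics are commonly computed: sort contigs from longest to shortest and walk down the list until the running length reaches the chosen fraction of the total. The function name `nx_lx` and the toy contig lengths are illustrative assumptions, not QUAST's own code or data from this run; only the two totals at the end are taken from the report.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the contig at which the running sum of
    lengths (longest first) reaches `fraction` of the total.
    Lx is the number of contigs needed to get there.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count


# Toy example (made-up lengths, total 300 bp).
toy = [80, 70, 50, 40, 30, 20, 10]
print(nx_lx(toy, 0.5))   # (70, 2)
print(nx_lx(toy, 0.75))  # (40, 4)

# Cross-check using two values from the report: total length / contig count
# should match the average length given in the MEGAHIT summary.
total_length = 658612835
n_contigs = 166543
mean_length = total_length / n_contigs
print(round(mean_length, 2))
```

Read this way, an L50 of 42609 means that the 42609 longest contigs together cover at least half of the assembled bases.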
Bin metagenomic contigs
This app completed without errors in 6h 36m 12s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/100341
  • metabat_result.zip - Files generated by MetaBAT2 App
Output from MetaBAT2 Contig Binning - v1.7
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/100341
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 2h 11m 30s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/100341
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes. Creates a new BinnedContigs object with High Quality bins that pass user-defined thresholds for Completeness and Contamination.
This app completed without errors in 2h 19m 16s.
Objects
| Created Object Name | Type | Description |
| --- | --- | --- |
| 7FCheckM_HQ_bins.BinnedContigs | BinnedContigs | HQ BinnedContigs 7FCheckM_HQ_bins.BinnedContigs |
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/100341
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
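The filtering step described for this app keeps only bins that pass user-defined thresholds for Completeness and Contamination. The thresholds used in this run are not recorded here, so the sketch below assumes 90% completeness and 5% contamination purely for illustration; the bin names and quality values are also invented and are not CheckM results from this Narrative.

```python
from typing import NamedTuple


class BinQuality(NamedTuple):
    name: str
    completeness: float   # percent, as reported by CheckM
    contamination: float  # percent, as reported by CheckM


def select_hq_bins(bins, min_completeness=90.0, max_contamination=5.0):
    """Return names of bins meeting both thresholds (defaults are assumed)."""
    return [
        b.name
        for b in bins
        if b.completeness >= min_completeness
        and b.contamination <= max_contamination
    ]


# Invented example values, not taken from this run.
example_bins = [
    BinQuality("example_bin_A", 97.3, 1.2),   # passes both thresholds
    BinQuality("example_bin_B", 92.0, 8.5),   # too contaminated
    BinQuality("example_bin_C", 61.4, 0.4),   # too incomplete
]

print(select_hq_bins(example_bins))  # ['example_bin_A']
```

Relaxing either threshold changes which bins survive, for example `select_hq_bins(example_bins, min_completeness=50.0)` would also keep `example_bin_C`.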
Extract a bin as an Assembly from a BinnedContig dataset
This app completed without errors in 3h 46m 3s.
Objects
| Created Object Name | Type | Description |
| --- | --- | --- |
| 7F_extracted_bins.AssemblySet | AssemblySet | Assembly set of extracted assemblies |
| Bin.076.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.035.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.013.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.047.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.046.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.068.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.023.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.073.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.008.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.083.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.063.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.074.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.065.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.094.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.082.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.026.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.020.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.067.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.024.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.006.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.079.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.069.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.089.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.084.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.077.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.057.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.019.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.095.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.040.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.015.fasta_assembly | Assembly | Assembly object of extracted contigs |
| Bin.052.fasta_assembly | Assembly | Assembly object of extracted contigs |
Summary
Job Finished
Generated Assembly Reference: 100341/23/1, 100341/24/1, 100341/25/1, 100341/26/1, 100341/27/1, 100341/28/1, 100341/29/1, 100341/30/1, 100341/31/1, 100341/32/1, 100341/33/1, 100341/34/1, 100341/35/1, 100341/36/1, 100341/37/1, 100341/38/1, 100341/39/1, 100341/40/1, 100341/41/1, 100341/42/1, 100341/43/1, 100341/44/1, 100341/45/1, 100341/46/1, 100341/47/1, 100341/48/1, 100341/49/1, 100341/50/1, 100341/51/1, 100341/52/1, 100341/53/1
Generated Assembly Set: 100341/54/1
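The references in this summary appear to follow the usual KBase pattern of workspace ID, object ID, and version separated by slashes (the workspace ID 100341 matches the Narrative URL). Under that assumption, the sketch below splits the list into structured records and confirms that one Assembly was generated per extracted bin. The class and function names are hypothetical helpers, not part of any KBase client library.

```python
from typing import NamedTuple


class ObjectRef(NamedTuple):
    workspace_id: int
    object_id: int
    version: int


def parse_ref(ref):
    """Parse a 'workspace/object/version' string into an ObjectRef."""
    parts = ref.strip().split("/")
    if len(parts) != 3:
        raise ValueError(f"expected three slash-separated fields: {ref!r}")
    return ObjectRef(*(int(p) for p in parts))


def parse_ref_list(text):
    """Parse a comma-separated list of references."""
    return [parse_ref(item) for item in text.split(",") if item.strip()]


# Copied from the "Generated Assembly Reference" line of the summary.
assembly_refs_text = (
    "100341/23/1, 100341/24/1, 100341/25/1, 100341/26/1, 100341/27/1, "
    "100341/28/1, 100341/29/1, 100341/30/1, 100341/31/1, 100341/32/1, "
    "100341/33/1, 100341/34/1, 100341/35/1, 100341/36/1, 100341/37/1, "
    "100341/38/1, 100341/39/1, 100341/40/1, 100341/41/1, 100341/42/1, "
    "100341/43/1, 100341/44/1, 100341/45/1, 100341/46/1, 100341/47/1, "
    "100341/48/1, 100341/49/1, 100341/50/1, 100341/51/1, 100341/52/1, "
    "100341/53/1"
)

refs = parse_ref_list(assembly_refs_text)
print(len(refs))                                # 31
print({r.workspace_id for r in refs})           # {100341}
print(refs[0].object_id, refs[-1].object_id)    # 23 53
```

The 31 parsed references line up with the 31 `Bin.*.fasta_assembly` objects listed for this app.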
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB) ver R06-RS202
This app completed without errors in 6h 42m 1s.
Annotate your assembly with DRAM. Annotations will then be distilled to create an interactive functional summary per assembly.
This app completed without errors in 19h 52m 55s.
Summary
Here are the results from your DRAM run.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/100341
  • annotations.tsv - DRAM annotations in a tab-separated table format
  • genes.fna - Genes as nucleotides predicted by DRAM with brief annotations
  • genes.faa - Genes as amino acids predicted by DRAM with brief annotations
  • genes.gff - GFF file of all DRAM annotations
  • rrnas.tsv - Tab-separated table of rRNAs as detected by barrnap
  • trnas.tsv - Tab-separated table of tRNAs as detected by tRNAscan-SE
  • genbank.tar.gz - Compressed folder of output genbank files
  • product.tsv - DRAM product in tabular format
  • metabolism_summary.xlsx - DRAM metabolism summary tables
  • genome_stats.tsv - DRAM genome statistics table
Annotate Metagenome Assembly and Re-annotate Metagenome with RASTtk (Rapid Annotations using Subsystems Technology toolkit).
This app completed without errors in 5h 53m 35s.
Objects
| Created Object Name | Type | Description |
| --- | --- | --- |
| 7F_RAST_reannotatedgenome | AnnotatedMetagenomeAssembly | RAST re-annotated metagenome |
Summary
Genome Ref: 100341/58/1
Genome type: KBaseMetagenomes.AnnotatedMetagenomeAssembly-1.0
Number of contigs: 166543
Number of features: 1286150
Number of unique function roles: 14495
Number of genes: 409057
Annotate bacterial or archaeal assemblies and/or assembly sets using RASTtk (Rapid Annotations using Subsystems Technology toolkit).
This app completed without errors in 1h 45m 22s.
Objects
| Created Object Name | Type | Description |
| --- | --- | --- |
| Bin.067.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.024.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.023.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.068.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.076.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.084.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.077.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.057.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.019.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.046.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.006.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.047.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.079.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.013.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.069.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.035.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.089.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.095.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.040.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.015.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.073.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.020.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.074.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.063.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.083.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.082.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.008.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.026.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.052.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.094.fasta_assembly.RAST | Genome | RAST annotation |
| Bin.065.fasta_assembly.RAST | Genome | RAST annotation |
| 7F_GenomeSet | GenomeSet | Genome Set |
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 212 contigs containing 1528443 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 1892 new features were called, of which 85 are non-coding.
Output genome has the following feature types:
	Coding gene                     1807 
	Non-coding repeat                 48 
	Non-coding rna                    37 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.067.fasta_assembly succeeded!
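Each per-bin RAST summary in this section follows the same template, so the counts can be pulled out with a few regular expressions and checked for internal consistency. The sketch below does this for the Bin.067 block; the function name and the regular expressions are assumptions based only on the wording seen in this report, not an official RAST output parser.

```python
import re

# Relevant lines copied from the Bin.067.fasta_assembly summary.
RAST_SUMMARY = """\
The RAST algorithm was applied to annotating a genome sequence comprised of 212 contigs containing 1528443 nucleotides.
In addition to the remaining original 0 coding features and 0 non-coding features, 1892 new features were called, of which 85 are non-coding.
Output genome has the following feature types:
    Coding gene                     1807
    Non-coding repeat                 48
    Non-coding rna                    37
"""


def parse_rast_summary(text):
    """Extract contig, nucleotide, and feature counts from one summary block."""
    size = re.search(r"comprised of (\d+) contigs containing (\d+) nucleotides", text)
    called = re.search(
        r"(\d+) new features were called, of which (\d+) are non-coding", text
    )
    if size is None or called is None:
        raise ValueError("text does not look like a RAST summary block")
    # Feature-type rows: a label, a run of two or more spaces/tabs, then a count.
    rows = re.findall(r"^[ \t]*(\S.*?)[ \t]{2,}(\d+)[ \t]*$", text, flags=re.MULTILINE)
    return {
        "contigs": int(size.group(1)),
        "nucleotides": int(size.group(2)),
        "new_features": int(called.group(1)),
        "non_coding": int(called.group(2)),
        "feature_types": {label: int(count) for label, count in rows},
    }


summary = parse_rast_summary(RAST_SUMMARY)
types = summary["feature_types"]

# The per-type counts should add up to the totals stated in the prose.
print(sum(types.values()) == summary["new_features"])                # True
print(sum(v for k, v in types.items() if k != "Coding gene")
      == summary["non_coding"])                                      # True
```

Running the same parser over the other blocks would give a compact per-bin table of contig counts, genome sizes, and feature counts.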

The RAST algorithm was applied to annotating a genome sequence comprised of 156 contigs containing 1037457 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 1343 new features were called, of which 80 are non-coding.
Output genome has the following feature types:
	Coding gene                     1263 
	Non-coding repeat                 61 
	Non-coding rna                    19 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.024.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 558 contigs containing 2581959 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3116 new features were called, of which 117 are non-coding.
Output genome has the following feature types:
	Coding gene                     2999 
	Non-coding repeat                 88 
	Non-coding rna                    29 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.023.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 158 contigs containing 3313400 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3060 new features were called, of which 181 are non-coding.
Output genome has the following feature types:
	Coding gene                     2879 
	Non-coding repeat                134 
	Non-coding rna                    47 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.068.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 108 contigs containing 3198161 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 2984 new features were called, of which 158 are non-coding.
Output genome has the following feature types:
	Coding gene                     2826 
	Non-coding repeat                107 
	Non-coding rna                    51 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.076.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 126 contigs containing 1955535 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 2572 new features were called, of which 132 are non-coding.
Output genome has the following feature types:
	Coding gene                     2440 
	Non-coding repeat                 90 
	Non-coding rna                    42 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.084.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 366 contigs containing 3933461 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3888 new features were called, of which 101 are non-coding.
Output genome has the following feature types:
	Coding gene                     3787 
	Non-coding repeat                 61 
	Non-coding rna                    40 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.077.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 405 contigs containing 2834732 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3363 new features were called, of which 108 are non-coding.
Output genome has the following feature types:
	Coding gene                     3255 
	Non-coding repeat                 70 
	Non-coding rna                    38 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.057.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 78 contigs containing 3405059 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3386 new features were called, of which 81 are non-coding.
Output genome has the following feature types:
	Coding gene                     3305 
	Non-coding repeat                 38 
	Non-coding rna                    43 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.019.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 133 contigs containing 1778664 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 2387 new features were called, of which 152 are non-coding.
Output genome has the following feature types:
	Coding gene                     2235 
	Non-coding repeat                112 
	Non-coding rna                    40 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.046.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 664 contigs containing 3391019 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 4528 new features were called, of which 213 are non-coding.
Output genome has the following feature types:
	Coding gene                     4315 
	Non-coding repeat                192 
	Non-coding rna                    21 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.006.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 874 contigs containing 4277591 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 4668 new features were called, of which 71 are non-coding.
Output genome has the following feature types:
	Coding gene                     4597 
	Non-coding repeat                 51 
	Non-coding rna                    20 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.047.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 541 contigs containing 2617763 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3313 new features were called, of which 280 are non-coding.
Output genome has the following feature types:
	Coding gene                     3033 
	Non-coding repeat                263 
	Non-coding rna                    17 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.079.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 144 contigs containing 2341866 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 2176 new features were called, of which 52 are non-coding.
Output genome has the following feature types:
	Coding gene                     2124 
	Non-coding repeat                 31 
	Non-coding rna                    21 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.013.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 357 contigs containing 4669444 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 5078 new features were called, of which 322 are non-coding.
Output genome has the following feature types:
	Coding gene                     4756 
	Non-coding repeat                285 
	Non-coding rna                    37 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.069.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 216 contigs containing 1653379 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 1992 new features were called, of which 188 are non-coding.
Output genome has the following feature types:
	Coding gene                     1804 
	Non-coding repeat                152 
	Non-coding rna                    36 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.035.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 671 contigs containing 4243538 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 5875 new features were called, of which 623 are non-coding.
Output genome has the following feature types:
	Coding gene                     5252 
	Non-coding repeat                592 
	Non-coding rna                    31 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.089.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 219 contigs containing 3351861 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 4000 new features were called, of which 291 are non-coding.
Output genome has the following feature types:
	Coding gene                     3709 
	Non-coding repeat                250 
	Non-coding rna                    41 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.095.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 406 contigs containing 4748972 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 4951 new features were called, of which 231 are non-coding.
Output genome has the following feature types:
	Coding gene                     4720 
	Non-coding repeat                191 
	Non-coding rna                    40 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.040.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 567 contigs containing 2817950 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3154 new features were called, of which 60 are non-coding.
Output genome has the following feature types:
	Coding gene                     3094 
	Non-coding repeat                 23 
	Non-coding rna                    37 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.015.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 643 contigs containing 2773501 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3030 new features were called, of which 51 are non-coding.
Output genome has the following feature types:
	Coding gene                     2979 
	Non-coding repeat                 29 
	Non-coding rna                    22 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.073.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 548 contigs containing 2838860 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3199 new features were called, of which 128 are non-coding.
Output genome has the following feature types:
	Coding gene                     3071 
	Non-coding repeat                 98 
	Non-coding rna                    30 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.020.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 326 contigs containing 3262289 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3690 new features were called, of which 132 are non-coding.
Output genome has the following feature types:
	Coding gene                     3558 
	Non-coding repeat                103 
	Non-coding rna                    29 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.074.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 459 contigs containing 3963132 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 4602 new features were called, of which 224 are non-coding.
Output genome has the following feature types:
	Coding gene                     4378 
	Non-coding repeat                185 
	Non-coding rna                    39 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.063.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 502 contigs containing 3562135 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3853 new features were called, of which 76 are non-coding.
Output genome has the following feature types:
	Coding gene                     3777 
	Non-coding repeat                 49 
	Non-coding rna                    27 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.083.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 351 contigs containing 2781677 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3071 new features were called, of which 175 are non-coding.
Output genome has the following feature types:
	Coding gene                     2896 
	Non-coding repeat                134 
	Non-coding rna                    41 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.082.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 381 contigs containing 2372894 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 2776 new features were called, of which 192 are non-coding.
Output genome has the following feature types:
	Coding gene                     2584 
	Non-coding repeat                166 
	Non-coding rna                    26 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.008.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 334 contigs containing 2015743 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 2527 new features were called, of which 160 are non-coding.
Output genome has the following feature types:
	Coding gene                     2367 
	Non-coding repeat                138 
	Non-coding rna                    22 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.026.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 240 contigs containing 1413175 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 1748 new features were called, of which 121 are non-coding.
Output genome has the following feature types:
	Coding gene                     1627 
	Non-coding repeat                 98 
	Non-coding rna                    23 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.052.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 413 contigs containing 2684829 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 3113 new features were called, of which 241 are non-coding.
Output genome has the following feature types:
	Coding gene                     2872 
	Non-coding repeat                211 
	Non-coding rna                    30 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.094.fasta_assembly succeeded!

The RAST algorithm was applied to annotating a genome sequence comprised of 84 contigs containing 2694299 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 2701 new features were called, of which 51 are non-coding.
Output genome has the following feature types:
	Coding gene                     2650 
	Non-coding repeat                 38 
	Non-coding rna                    13 
Overall, the genes have 0 distinct functions. 
The genes include 0 genes with a SEED annotation ontology across 0 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Bin.065.fasta_assembly succeeded!
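
The per-bin reports above all follow the same layout, so the stated totals can be cross-checked against the feature-type rows mechanically. The following is a minimal sketch, assuming a report block has been copied out of the log as a plain string. The names `parse_rast_report` and `is_consistent` and the regular expressions are illustrative only and are not part of RASTtk or KBase.

```python
import re

# One report block, abridged from the Bin.065 log output above.
REPORT = (
    "The RAST algorithm was applied to annotating a genome sequence "
    "comprised of 84 contigs containing 2694299 nucleotides. \n"
    "In addition to the remaining original 0 coding features and 0 non-coding "
    "features, 2701 new features were called, of which 51 are non-coding.\n"
    "Output genome has the following feature types:\n"
    "\tCoding gene                     2650 \n"
    "\tNon-coding repeat                 38 \n"
    "\tNon-coding rna                    13 \n"
    "Bin.065.fasta_assembly succeeded!\n"
)


def parse_rast_report(text):
    """Pull contig, nucleotide, and feature counts out of one report block."""
    size = re.search(r"(\d+) contigs containing (\d+) nucleotides", text)
    called = re.search(
        r"(\d+) new features were called, of which (\d+) are non-coding", text
    )
    # Feature-type rows are indented: a label, padding, then an integer.
    rows = re.findall(r"^[ \t]+(.+?)[ \t]+(\d+)[ \t]*$", text, flags=re.M)
    return {
        "contigs": int(size.group(1)),
        "nucleotides": int(size.group(2)),
        "new_features": int(called.group(1)),
        "non_coding": int(called.group(2)),
        "types": {label: int(count) for label, count in rows},
    }


def is_consistent(report):
    """Check that the per-type rows add up to the totals stated in the prose."""
    types = report["types"]
    non_coding = sum(
        n for label, n in types.items() if label.startswith("Non-coding")
    )
    return (
        sum(types.values()) == report["new_features"]
        and non_coding == report["non_coding"]
    )


report = parse_rast_report(REPORT)
print(report["types"])
print(is_consistent(report))
```

For the Bin.065 block, the three feature-type rows (2650 + 38 + 13) sum to the 2701 new features reported, and the two non-coding rows (38 + 13) match the stated 51.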

Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/100341
  • annotation_report.7F_GenomeSet - Microbial Annotation Report
Search for matches to HMMs of environmental bioelement cycling families using HMMER 3
This app produced errors in 1m 52s.
No output found.
Search for matches to HMMs of environmental bioelement cycling families using HMMER 3
This app completed without errors in 3m 51s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/100341
  • HMMER_EnvBioelement_Search.TAB.zip

Released Apps

  1. Assemble Reads with MEGAHIT v1.2.9
    • Li D, Liu C-M, Luo R, Sadakane K, Lam T-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31: 1674-1676. doi:10.1093/bioinformatics/btv033
  2. Assess Genome Quality with CheckM - v1.0.18
    • Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043-1055. doi:10.1101/gr.186072.114
  3. Assess Quality of Assemblies with QUAST - v4.4
    • [1] Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013;29: 1072-1075. doi:10.1093/bioinformatics/btt086
    • [2] Mikheenko A, Valin G, Prjibelski A, Saveliev V, Gurevich A. Icarus: visualizer for de novo assembly evaluation. Bioinformatics. 2016;32: 3321-3323. doi:10.1093/bioinformatics/btw379
  4. Assess Read Quality with FastQC - v0.12.1
    • FastQC source: Bioinformatics Group at the Babraham Institute, UK.
  5. Classify Microbes with GTDB-Tk - v2.3.2
    • Pierre-Alain Chaumeil, Aaron J Mussig, Philip Hugenholtz, Donovan H Parks. GTDB-Tk v2: memory friendly classification with the genome taxonomy database. Bioinformatics, Volume 38, Issue 23, 1 December 2022, Pages 5315-5316. DOI: https://doi.org/10.1093/bioinformatics/btac672
    • Pierre-Alain Chaumeil, Aaron J Mussig, Philip Hugenholtz, Donovan H Parks, GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database, Bioinformatics, Volume 36, Issue 6, 15 March 2020, Pages 1925-1927. DOI: https://doi.org/10.1093/bioinformatics/btz848
    • Donovan H Parks, Maria Chuvochina, Christian Rinke, Aaron J Mussig, Pierre-Alain Chaumeil, Philip Hugenholtz. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Research, Volume 50, Issue D1, 7 January 2022, Pages D785-D794. DOI: https://doi.org/10.1093/nar/gkab776
    • Parks, D., Chuvochina, M., Waite, D. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36, 996-1004 (2018). DOI: https://doi.org/10.1038/nbt.4229
    • Parks DH, Chuvochina M, Chaumeil PA, Rinke C, Mussig AJ, Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol. 2020;10.1038/s41587-020-0501-8. DOI:10.1038/s41587-020-0501-8
    • Rinke C, Chuvochina M, Mussig AJ, Chaumeil PA, Davín AA, Waite DW, Whitman WB, Parks DH, and Hugenholtz P. A standardized archaeal taxonomy for the Genome Taxonomy Database. Nat Microbiol. 2021 Jul;6(7):946-959. DOI:10.1038/s41564-021-00918-8
    • Chivian D, Jungbluth SP, Dehal PS, Wood-Charlson EM, Canon RS, Allen BH, Clark MM, Gu T, Land ML, Price GA, Riehl WJ, Sneddon MW, Sutormin R, Zhang Q, Cottingham RW, Henry CS, Arkin AP. Metagenome-assembled genome extraction and analysis from microbiomes using KBase. Nat Protoc. 2023 Jan;18(1):208-238. doi: 10.1038/s41596-022-00747-x
    • Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics. 2010;11:538. Published 2010 Oct 30. doi:10.1186/1471-2105-11-538
    • Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9(1):5114. Published 2018 Nov 30. DOI:10.1038/s41467-018-07641-9
    • Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119. Published 2010 Mar 8. DOI:10.1186/1471-2105-11-119
    • Price MN, Dehal PS, Arkin AP. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5(3):e9490. Published 2010 Mar 10. DOI:10.1371/journal.pone.0009490 link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2835736/
    • Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol. 2011;7(10):e1002195. DOI:10.1371/journal.pcbi.1002195
    • Ondov BD, Treangen TJ, Melsted P, Mallonee AB, Bergman NH, Koren S, Phillippy AM. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 2016 Jun 20;17(1):132. DOI: 10.1186/s13059-016-0997-x
  6. Classify Taxonomy of Metagenomic Reads with Kaiju - v1.9.0
    • Chivian D, et al. Metagenome-assembled genome extraction and analysis from microbiomes using KBase. Nat Protoc. 2023 Jan;18(1):208-238. doi: 10.1038/s41596-022-00747-x
    • Menzel P, Ng KL, Krogh A. Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nat Commun. 2016;7: 11257. doi:10.1038/ncomms11257
    • Ondov BD, Bergman NH, Phillippy AM. Interactive metagenomic visualization in a Web browser. BMC Bioinformatics. 2011;12: 385. doi:10.1186/1471-2105-12-385
  7. Extract Bins as Assemblies from BinnedContigs - v1.0.2
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  8. Filter Bins by Quality with CheckM - v1.0.18
    • Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043-1055. doi:10.1101/gr.186072.114
  9. Import FASTQ/SRA File as Reads from Staging Area
    no citations
  10. MetaBAT2 Contig Binning - v1.7
    • Kang DD, Froula J, Egan R, Wang Z. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ. 2015;3: e1165. doi:10.7717/peerj.1165
  11. Search with HMMs of Environmental Bioelement families - v1
    • Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195
    • Anantharaman K, Brown CT, Hug LA, Sharon I, Castelle CJ, Probst AJ, Thomas BC, Singh A, Wilkins MJ, Karaoz U, Brodie EL, Williams KH, Hubbard SS, Banfield JF. Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system. Nature Communications. 2016;7: 13219. doi:10.1038/ncomms13219
  12. Trim Reads with Trimmomatic - v0.36
    • Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30: 2114-2120. doi:10.1093/bioinformatics/btu170

Apps in Beta

  1. Annotate and Distill Assemblies with DRAM
    • DRAM source code
    • DRAM documentation
    • DRAM publication
  2. Annotate Metagenome Assembly and Re-annotate Metagenome with RASTtk - v1.073
    • [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
    • [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226
    • [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365
    • [4] Kent WJ. BLAT: The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656-664. doi:10.1101/gr.229202
    • [5] Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389-3402. doi:10.1093/nar/25.17.3389
    • [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955-964.
    • [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793-803. doi:10.1007/s00792-012-0482-8
    • [8] Meyer F, Overbeek R, Rodriguez A. FIGfams: yet another set of protein families. Nucleic Acids Res. 2009;37: 6643-54. doi:10.1093/nar/gkp698
    • [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176-1179.
    • [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120
    • [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
    • [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673-679. doi:10.1093/bioinformatics/btm009
    • [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406
  3. Annotate Multiple Microbial Assemblies with RASTtk - v1.073
    • [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
    • [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226
    • [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365
    • [4] Kent WJ. BLAT: The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656-664. doi:10.1101/gr.229202
    • [5] Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389-3402. doi:10.1093/nar/25.17.3389
    • [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955-964.
    • [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793-803. doi:10.1007/s00792-012-0482-8
    • [8] Meyer F, Overbeek R, Rodriguez A. FIGfams: yet another set of protein families. Nucleic Acids Res. 2009;37: 6643-54. doi:10.1093/nar/gkp698
    • [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176-1179.
    • [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120
    • [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
    • [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673-679. doi:10.1093/bioinformatics/btm009
    • [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406
  4. Search with HMMs of Environmental Bioelement families - v1
    • Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195
    • Anantharaman K, Brown CT, Hug LA, Sharon I, Castelle CJ, Probst AJ, Thomas BC, Singh A, Wilkins MJ, Karaoz U, Brodie EL, Williams KH, Hubbard SS, Banfield JF. Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system. Nature Communications. 2016;7: 13219. doi:10.1038/ncomms13219