Generated August 20, 2024
from biokbase.narrative.jobs.appmanager import AppManager
AppManager().run_app_batch(
    [{
        "app_id": "kb_uploadmethods/import_fastq_noninterleaved_as_reads_from_staging",
        "tag": "release",
        "version": "5b9346463df88a422ff5d4f4cba421679f63c73f",
        "params": [{
            "fastq_fwd_staging_file_name": "Unknown_138_S146_R1_001.fastq.gz",
            "fastq_rev_staging_file_name": "Unknown_138_S146_R2_001.fastq.gz",
            "name": "Unknown138.2"
        }],
        "shared_params": {
            "insert_size_mean": None,
            "insert_size_std_dev": None,
            "read_orientation_outward": 0,
            "sequencing_tech": "Illumina",
            "single_genome": 1
        }
    }],
    cell_id="59fbfa2f-fb8a-46cb-a8b8-a00b3c28d26f",
    run_id="ddd62039-7da7-41cb-a864-ec0c060a0026"
)
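The import call above pairs a forward and a reverse FASTQ file by name. As a minimal sketch, one such "params" entry could be built from the forward file name alone. The helper name `paired_params` and the assumption that the reverse file differs only by the `_R1_`/`_R2_` token are illustrative and not part of the KBase API.

```python
def paired_params(fwd_name: str, sample_name: str) -> dict:
    """Build one "params" entry, deriving the reverse file from the forward one.

    Hypothetical helper: assumes Illumina-style names where "_R1_" marks the
    forward read file and the reverse file uses "_R2_" in the same position.
    """
    if fwd_name.count("_R1_") != 1:
        raise ValueError(f"cannot derive reverse file name from {fwd_name!r}")
    return {
        "fastq_fwd_staging_file_name": fwd_name,
        "fastq_rev_staging_file_name": fwd_name.replace("_R1_", "_R2_"),
        "name": sample_name,
    }


# File and object names taken from the cell above.
entry = paired_params("Unknown_138_S146_R1_001.fastq.gz", "Unknown138.2")
```

The resulting dictionary has the same three keys as the entry in the "params" list of the call above.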
A quality control application for high throughput sequence data.
This app completed without errors in 1m 39s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/172262
  • Unknown138.2_172262_38_1.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • Unknown138.2_172262_38_1.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
v1 - KBaseFile.PairedEndLibrary-2.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/172262
Trim paired- or single-end Illumina reads with Trimmomatic.
This app completed without errors in 4m 29s.
Objects
Created Object Name                 Type              Description
Unknown138trimmomatic_paired        PairedEndLibrary  Trimmed Reads
Unknown138trimmomatic_unpaired_fwd  SingleEndLibrary  Trimmed Unpaired Forward Reads
Unknown138trimmomatic_unpaired_rev  SingleEndLibrary  Trimmed Unpaired Reverse Reads
v1 - KBaseFile.PairedEndLibrary-2.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/172262
A quality control application for high throughput sequence data.
This app completed without errors in 35s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/172262
  • Unknown138trimmomatic_unpaired_fwd_172262_42_1.single_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
A quality control application for high throughput sequence data.
This app completed without errors in 1m 55s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/172262
  • Unknown138trimmomatic_paired_172262_41_1.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • Unknown138trimmomatic_paired_172262_41_1.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
A quality control application for high throughput sequence data.
This app completed without errors in 43s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/172262
  • Unknown138trimmomatic_unpaired_rev_172262_43_1.single_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
v1 - KBaseFile.PairedEndLibrary-2.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/172262
Assemble paired-end reads from single-cell or metagenomic sequencing technologies using the IDBA-UD assembler.
This app completed without errors in 7m 8s.
Objects
Created Object Name  Type      Description
IDBA.contigs         Assembly  Assembled contigs
Summary
Assembly saved to: annamcloon:narrative_1709055304717/IDBA.contigs
Assembled into 81 contigs. Avg Length: 71464.24691358025 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  67 -- 2206.0 to 87915.6 bp
  3 -- 87915.6 to 173625.2 bp
  4 -- 173625.2 to 259334.8 bp
  2 -- 259334.8 to 345044.4 bp
  2 -- 345044.4 to 430754.0 bp
  2 -- 430754.0 to 516463.6 bp
  0 -- 516463.6 to 602173.2 bp
  0 -- 602173.2 to 687882.8 bp
  0 -- 687882.8 to 773592.4 bp
  1 -- 773592.4 to 859302.0 bp
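The bin edges in this report are evenly spaced between the shortest and longest contig (2206 bp and 859302 bp, giving ten bins of 85709.6 bp each). A minimal sketch of such an equal-width histogram follows. The function name and the rule that the maximum value falls into the last bin are assumptions, and the app's actual implementation may treat bin boundaries differently.

```python
def length_histogram(lengths, n_bins=10):
    """Count values in n_bins equal-width bins spanning min(lengths)..max(lengths)."""
    lo, hi = min(lengths), max(lengths)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for length in lengths:
        # Clamp so the maximum value lands in the last bin rather than past it;
        # if all lengths are equal (width 0), everything goes in the first bin.
        idx = min(int((length - lo) / width), n_bins - 1) if width else 0
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Illustrative input only: the minimum and maximum come from the report above,
# the middle value is made up to show the binning.
edges, counts = length_histogram([2206, 50000, 859302])
```

With these inputs the first interior edge is 2206 + 85709.6 = 87915.6 bp, which matches the first bin boundary in the report.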
Assemble paired-end reads from single-cell or metagenomic sequencing technologies using the IDBA-UD assembler.
This app completed without errors in 6m 56s.
Objects
Created Object Name             Type      Description
unknown138_trimmedIDBA.contigs  Assembly  Assembled contigs
Summary
Assembly saved to: annamcloon:narrative_1709055304717/unknown138_trimmedIDBA.contigs
Assembled into 84 contigs. Avg Length: 69396.65476190476 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  76 -- 514.0 to 112054.9 bp
  1 -- 112054.9 to 223595.8 bp
  1 -- 223595.8 to 335136.7 bp
  2 -- 335136.7 to 446677.6 bp
  0 -- 446677.6 to 558218.5 bp
  0 -- 558218.5 to 669759.4 bp
  1 -- 669759.4 to 781300.3 bp
  2 -- 781300.3 to 892841.2 bp
  0 -- 892841.2 to 1004382.1 bp
  1 -- 1004382.1 to 1115923.0 bp
Assemble reads using the SPAdes assembler.
This app completed without errors in 15m 19s.
Objects
Created Object Name                Type      Description
unknown138trimmed_SPAdes.Assembly  Assembly  Assembled contigs
Summary
Assembly saved to: annamcloon:narrative_1709055304717/unknown138trimmed_SPAdes.Assembly
Assembled into 41 contigs. Avg Length: 142758.63414634147 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  36 -- 522.0 to 167528.7 bp
  0 -- 167528.7 to 334535.4 bp
  2 -- 334535.4 to 501542.1 bp
  0 -- 501542.1 to 668548.8 bp
  0 -- 668548.8 to 835555.5 bp
  1 -- 835555.5 to 1002562.2 bp
  0 -- 1002562.2 to 1169568.9 bp
  1 -- 1169568.9 to 1336575.6 bp
  0 -- 1336575.6 to 1503582.3 bp
  1 -- 1503582.3 to 1670589.0 bp
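The three assembly reports give a contig count and an average contig length but no total size. As a small sketch, the total can be recovered as count times average. The dictionary layout and variable names are illustrative, while the numbers are copied from the three summaries above.

```python
# (number of contigs, average contig length in bp) per assembly object,
# copied from the three assembly summaries in this Narrative.
assemblies = {
    "IDBA.contigs": (81, 71464.24691358025),
    "unknown138_trimmedIDBA.contigs": (84, 69396.65476190476),
    "unknown138trimmed_SPAdes.Assembly": (41, 142758.63414634147),
}

# Total assembled length in bp; round() removes floating-point noise from the
# reported averages.
totals = {name: round(n * avg) for name, (n, avg) in assemblies.items()}

for name, total in totals.items():
    print(f"{name}: {total} bp in {assemblies[name][0]} contigs")
```

For the SPAdes assembly this gives 5853104 bp, the same nucleotide count that the RASTtk annotation summary reports for its 41 input contigs.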
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB)
This app produced errors in 12m 6s.
No output found.
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB)
This app produced errors in 12m 54s.
No output found.
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB) ver R06-RS202
This app completed without errors in 35m 52s.
Allows users to create an AssemblySet object.
This app completed without errors in 12s.
Objects
Created Object Name      Type         Description
AssemblySet138.2_SPAdes  AssemblySet  KButil_Build_AssemblySet
Summary
assembly objs in output set AssemblySet138.2_SPAdes: 1
v1 - KBaseSets.AssemblySet-1.2
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/172262
Annotate or re-annotate genome/assembly using RASTtk (Rapid Annotations using Subsystems Technology toolkit).
This app completed without errors in 4m 56s.
Objects
Created Object Name       Type    Description
unknown138_SPAdes_RASTtk  Genome  RAST re-annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 41 contigs containing 5853104 nucleotides.
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 6495 new features were called, of which 147 are non-coding.
Output genome has the following feature types:
  Coding gene 6348
  Non-coding repeat 110
  Non-coding rna 37
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
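The feature counts in this summary can be cross-checked against each other. The sketch below copies the three counts from the summary; the variable names are illustrative.

```python
# Feature counts by type, copied from the RASTtk summary.
feature_counts = {
    "Coding gene": 6348,
    "Non-coding repeat": 110,
    "Non-coding rna": 37,
}

# All newly called features, and the subset reported as non-coding.
total_new = sum(feature_counts.values())
non_coding = sum(
    n for kind, n in feature_counts.items() if kind.startswith("Non-coding")
)
```

The sums are 6495 and 147, in agreement with the "6495 new features were called, of which 147 are non-coding" statement in the summary.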
Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.
This app completed without errors in 2m 58s.
Objects
Created Object Name       Type    Description
unknown138_SPAdes_Prokka  Genome  Annotated Genome
Summary
Annotated Genome saved to: annamcloon:narrative_1709055304717/unknown138_SPAdes_Prokka
  Number of genes predicted: 6063
  Number of protein coding genes: 6029
  Number of genes with non-hypothetical function: 3407
  Number of genes with EC-number: 1331
  Number of genes with Seed Subsystem Ontology: 0
  Average protein length: 264 aa.
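The Prokka counts are easier to compare as fractions. The sketch below expresses two of them as a percentage of protein-coding genes. Using the protein-coding count as the denominator is an assumption made here for illustration; the report does not say which gene set the function and EC counts refer to.

```python
# Counts copied from the Prokka summary.
protein_coding = 6029
non_hypothetical = 3407
with_ec_number = 1331

# Percentages relative to protein-coding genes (assumed denominator).
pct_non_hypothetical = round(100 * non_hypothetical / protein_coding, 1)
pct_with_ec = round(100 * with_ec_number / protein_coding, 1)

print(f"non-hypothetical function: {pct_non_hypothetical}%")
print(f"with EC number: {pct_with_ec}%")
```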
v1 - KBaseGenomes.Genome-11.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/172262
Output from Annotate Assembly and Re-annotate Genomes with Prokka - v1.14.5
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/172262
Annotate your assemblies, isolate genomes, or MAGs with DRAM and distill resulting annotations to create an interactive functional summary per genome or assembly. Use for KBase assembly objects.
This app completed without errors in 26m 22s.
Objects
Created Object Name                     Type       Description
unknown138trimmed_SPAdes.Assembly_DRAM  Genome     Annotated Genome
DRAM_SPAdestrimmed_unknown138           GenomeSet  DRAM with SPAdes trimmed assembly of 138
Summary
Here are the results from your DRAM run.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/172262
  • annotations.tsv - DRAM annotations in a tab-separated table format
  • genes.fna - Genes as nucleotides predicted by DRAM with brief annotations
  • genes.faa - Genes as amino acids predicted by DRAM with brief annotations
  • genes.gff - GFF file of all DRAM annotations
  • rrnas.tsv - Tab-separated table of rRNAs as detected by barrnap
  • trnas.tsv - Tab-separated table of tRNAs as detected by tRNAscan-SE
  • genbank.tar.gz - Compressed folder of output genbank files
  • product.tsv - DRAM product in tabular format
  • metabolism_summary.xlsx - DRAM metabolism summary tables
  • genome_stats.tsv - DRAM genome statistics table
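Several of the files listed above are tab-separated tables. Once downloaded from the live Narrative, they can be read with the Python standard library, as in the minimal sketch below. The column names in the example are placeholders, not the actual DRAM column names, which are not shown in this export.

```python
import csv
import io


def read_tsv(stream):
    """Read a tab-separated table with a header row into a list of dicts."""
    return list(csv.DictReader(stream, delimiter="\t"))


# Placeholder table standing in for a downloaded file such as annotations.tsv;
# the columns "col_a" and "col_b" are hypothetical.
sample = io.StringIO("col_a\tcol_b\nx\t1\ny\t2\n")
rows = read_tsv(sample)
```

For a real file, the same function can be given an open file handle, for example `open("annotations.tsv", newline="")`.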
v1 - KBaseGenomes.Genome-11.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/172262

Released Apps

  1. Annotate and Distill Assemblies with DRAM
    • DRAM source code
    • DRAM documentation
    • DRAM Tutorial
    • DRAM publication
  2. Annotate Assembly and Re-annotate Genomes with Prokka - v1.14.5
    • Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30: 2068-2069. doi:10.1093/bioinformatics/btu153
  3. Annotate Genome/Assembly with RASTtk - v1.073
    • [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
    • [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226
    • [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365
    • [4] Kent WJ. BLAT - The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656-664. doi:10.1101/gr.229202
    • [5] Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389-3402. doi:10.1093/nar/25.17.3389
    • [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955-964.
    • [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793-803. doi:10.1007/s00792-012-0482-8
    • [8] Meyer F, Overbeek R, Rodriguez A. FIGfams: yet another set of protein families. Nucleic Acids Res. 2009;37: 6643-6654. doi:10.1093/nar/gkp698
    • [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176-1179.
    • [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120
    • [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
    • [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673-679. doi:10.1093/bioinformatics/btm009
    • [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406
  4. Assemble Reads with IDBA-UD - v1.1.3
    • Peng Y, Leung HCM, Yiu SM, Chin FYL. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics. 2012;28: 1420-1428. doi:10.1093/bioinformatics/bts174
  5. Assemble Reads with SPAdes - v3.15.3
    • Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. Journal of Computational Biology. 2012;19: 455-477. doi: 10.1089/cmb.2012.0021
    • Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. Using SPAdes De Novo Assembler. Curr Protoc Bioinformatics. 2020 Jun;70(1):e102. doi: 10.1002/cpbi.102.
  6. Assess Read Quality with FastQC - v0.12.1
    • FastQC source: Bioinformatics Group at the Babraham Institute, UK.
  7. Build AssemblySet - v1.0.1
    • Chivian D, Jungbluth SP, Dehal PS, Wood-Charlson EM, Canon RS, Allen BH, Clark MM, Gu T, Land ML, Price GA, Riehl WJ, Sneddon MW, Sutormin R, Zhang Q, Cottingham RW, Henry CS, Arkin AP. Metagenome-assembled genome extraction and analysis from microbiomes using KBase. Nat Protoc. 2023 Jan;18(1):208-238. doi: 10.1038/s41596-022-00747-x
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  8. Classify Microbes with GTDB-Tk - v2.3.2
    • Pierre-Alain Chaumeil, Aaron J Mussig, Philip Hugenholtz, Donovan H Parks. GTDB-Tk v2: memory friendly classification with the genome taxonomy database. Bioinformatics, Volume 38, Issue 23, 1 December 2022, Pages 5315-5316. DOI: https://doi.org/10.1093/bioinformatics/btac672
    • Pierre-Alain Chaumeil, Aaron J Mussig, Philip Hugenholtz, Donovan H Parks, GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database, Bioinformatics, Volume 36, Issue 6, 15 March 2020, Pages 1925-1927. DOI: https://doi.org/10.1093/bioinformatics/btz848
    • Donovan H Parks, Maria Chuvochina, Christian Rinke, Aaron J Mussig, Pierre-Alain Chaumeil, Philip Hugenholtz. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Research, Volume 50, Issue D1, 7 January 2022, Pages D785-D794. DOI: https://doi.org/10.1093/nar/gkab776
    • Parks, D., Chuvochina, M., Waite, D. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36, 996-1004 (2018). DOI: https://doi.org/10.1038/nbt.4229
    • Parks DH, Chuvochina M, Chaumeil PA, Rinke C, Mussig AJ, Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol. 2020;38: 1079-1086. DOI:10.1038/s41587-020-0501-8
    • Rinke C, Chuvochina M, Mussig AJ, Chaumeil PA, Davín AA, Waite DW, Whitman WB, Parks DH, and Hugenholtz P. A standardized archaeal taxonomy for the Genome Taxonomy Database. Nat Microbiol. 2021 Jul;6(7):946-959. DOI:10.1038/s41564-021-00918-8
    • Chivian D, Jungbluth SP, Dehal PS, Wood-Charlson EM, Canon RS, Allen BH, Clark MM, Gu T, Land ML, Price GA, Riehl WJ, Sneddon MW, Sutormin R, Zhang Q, Cottingham RW, Henry CS, Arkin AP. Metagenome-assembled genome extraction and analysis from microbiomes using KBase. Nat Protoc. 2023 Jan;18(1):208-238. doi: 10.1038/s41596-022-00747-x
    • Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics. 2010;11:538. Published 2010 Oct 30. doi:10.1186/1471-2105-11-538
    • Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9(1):5114. Published 2018 Nov 30. DOI:10.1038/s41467-018-07641-9
    • Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119. Published 2010 Mar 8. DOI:10.1186/1471-2105-11-119
    • Price MN, Dehal PS, Arkin AP. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5(3):e9490. Published 2010 Mar 10. DOI:10.1371/journal.pone.0009490 link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2835736/
    • Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol. 2011;7(10):e1002195. DOI:10.1371/journal.pcbi.1002195
    • Ondov BD, Treangen TJ, Melsted P, Mallonee AB, Bergman NH, Koren S, Phillippy AM. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 2016 Jun 20;17(1):132. DOI: 10.1186/s13059-016-0997-x
  9. Trim Reads with Trimmomatic - v0.36
    • Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30: 2114-2120. doi:10.1093/bioinformatics/btu170

Apps in Beta

  1. Classify Microbes with GTDB-Tk - v1.7.0
    • Pierre-Alain Chaumeil, Aaron J Mussig, Philip Hugenholtz, Donovan H Parks, GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database, Bioinformatics, Volume 36, Issue 6, 15 March 2020, Pages 1925-1927. DOI: https://doi.org/10.1093/bioinformatics/btz848
    • Parks, D., Chuvochina, M., Waite, D. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36, 996-1004 (2018). DOI: https://doi.org/10.1038/nbt.4229
    • Parks DH, Chuvochina M, Chaumeil PA, Rinke C, Mussig AJ, Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol. 2020;38: 1079-1086. DOI:10.1038/s41587-020-0501-8
    • Rinke C, Chuvochina M, Mussig AJ, Chaumeil PA, Davín AA, Waite DW, Whitman WB, Parks DH, and Hugenholtz P. A standardized archaeal taxonomy for the Genome Taxonomy Database. Nat Microbiol. 2021 Jul;6(7):946-959. DOI:10.1038/s41564-021-00918-8
    • Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics. 2010;11:538. Published 2010 Oct 30. doi:10.1186/1471-2105-11-538
    • Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9(1):5114. Published 2018 Nov 30. DOI:10.1038/s41467-018-07641-9
    • Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119. Published 2010 Mar 8. DOI:10.1186/1471-2105-11-119
    • Price MN, Dehal PS, Arkin AP. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5(3):e9490. Published 2010 Mar 10. DOI:10.1371/journal.pone.0009490 link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2835736/
    • Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol. 2011;7(10):e1002195. DOI:10.1371/journal.pcbi.1002195