Generated April 2, 2022
# Welcome to the Narrative
from IPython.display import IFrame
IFrame("https://www.kbase.us/narrative-welcome-cell/", width="100%", height="300px")

IMPORT ASSEMBLY

from biokbase.narrative.jobs.appmanager import AppManager
AppManager().run_app_bulk(
    [{
        "app_id": "kb_uploadmethods/import_fasta_as_assembly_from_staging",
        "tag": "release",
        "version": "31e93066beb421a51b9c8e44b1201aa93aea0b4e",
        "params": [{
            "staging_file_subdir_path": "assembly.fasta",
            "assembly_name": "assembly.fasta",
            "type": "draft isolate",
            "min_contig_length": 500
        }]
    }],
    cell_id="5ab9fd58-ad26-4dc1-ab16-9ab52fc9edb9",
    run_id="b45244f5-ad49-435f-9e61-7da811ff7409"
)

ASSESS ASSEMBLY QUALITY

Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 4m 38s.
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM

IMPORT RAW READS

from biokbase.narrative.jobs.appmanager import AppManager
AppManager().run_app_bulk(
    [{
        "app_id": "kb_uploadmethods/import_fastq_interleaved_as_reads_from_staging",
        "tag": "release",
        "version": "31e93066beb421a51b9c8e44b1201aa93aea0b4e",
        "params": [{
            "fastq_fwd_staging_file_name": "C9_S41_R2_001.fastq.gz",
            "name": "C9_S41_R2_001.fastq.gz",
            "sequencing_tech": "Illumina",
            "single_genome": 1,
            "read_orientation_outward": 0,
            "insert_size_std_dev": None,
            "insert_size_mean": None
        }]
    }],
    cell_id="1d4117d0-6e2f-486c-b2ea-9a4b7db03235",
    run_id="b61c5e83-9eee-46b8-b3ed-b91b328659e6"
)
from biokbase.narrative.jobs.appmanager import AppManager
AppManager().run_app_bulk(
    [{
        "app_id": "kb_uploadmethods/import_fastq_interleaved_as_reads_from_staging",
        "tag": "release",
        "version": "31e93066beb421a51b9c8e44b1201aa93aea0b4e",
        "params": [{
            "fastq_fwd_staging_file_name": "C9_S41_R1_001.fastq.gz",
            "name": "C9_S41_R1_001.fastq.gz",
            "sequencing_tech": "Illumina",
            "single_genome": 1,
            "read_orientation_outward": 0,
            "insert_size_std_dev": None,
            "insert_size_mean": None
        }]
    }],
    cell_id="b48ff555-83b3-4292-a8f9-99b929319a8c",
    run_id="2763ae14-59f3-4550-b841-c5da458bc78f"
)
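The two import cells above differ only in the staged file name. As a minimal sketch (not part of the original Narrative), the shared parameters could be generated once per file. The helper name `build_reads_import_specs` is hypothetical, and it only builds plain dictionaries; it does not call `AppManager`, and the `cell_id` and `run_id` values, which appear to be generated by the Narrative interface, are left out.

```python
def build_reads_import_specs(file_names, version, tag="release"):
    """Build one app spec per staged FASTQ file (plain dicts, no KBase calls)."""
    app_id = "kb_uploadmethods/import_fastq_interleaved_as_reads_from_staging"
    return [
        {
            "app_id": app_id,
            "tag": tag,
            "version": version,
            "params": [{
                # Same parameter values as the cells above; only the name varies.
                "fastq_fwd_staging_file_name": name,
                "name": name,
                "sequencing_tech": "Illumina",
                "single_genome": 1,
                "read_orientation_outward": 0,
                "insert_size_std_dev": None,
                "insert_size_mean": None,
            }],
        }
        for name in file_names
    ]


specs = build_reads_import_specs(
    ["C9_S41_R2_001.fastq.gz", "C9_S41_R1_001.fastq.gz"],
    version="31e93066beb421a51b9c8e44b1201aa93aea0b4e",
)
```

Each entry in `specs` has the same shape as the single-element list passed to `run_app_bulk` in the cells above.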
Trim paired- or single-end Illumina reads with Trimmomatic.
This app completed without errors in 18m 48s.
Objects
Created Object Name Type Description
C9_Trimmed_R1_R2_fastq.gz_paired PairedEndLibrary Trimmed Reads
C9_Trimmed_R1_R2_fastq.gz_unpaired_fwd SingleEndLibrary Trimmed Unpaired Forward Reads
C9_Trimmed_R1_R2_fastq.gz_unpaired_rev SingleEndLibrary Trimmed Unpaired Reverse Reads
Assemble reads using the HybridSPAdes assembler.
This app completed without errors in 38m 2s.
Objects
Created Object Name Type Description
C9_R1_R2_hybridSPAdes.Assembly Assembly Assembled contigs
Summary
SPAdes results saved to: joval:narrative_1642486466759//kb/module/work/tmp/61af6fa4-8688-42df-9d46-fb6b822bb18d/spades_project_dir/assemble_results
Assembly saved to: joval:narrative_1642486466759/C9_R1_R2_hybridSPAdes.Assembly
Assembled into 2040 contigs. Avg Length: 4818.255882352942 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  1967 -- 500.0 to 38619.2 bp
  45 -- 38619.2 to 76738.4 bp
  16 -- 76738.4 to 114857.6 bp
  6 -- 114857.6 to 152976.8 bp
  3 -- 152976.8 to 191096.0 bp
  2 -- 191096.0 to 229215.2 bp
  0 -- 229215.2 to 267334.4 bp
  0 -- 267334.4 to 305453.6 bp
  0 -- 305453.6 to 343572.8 bp
  1 -- 343572.8 to 381692.0 bp
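The contig length distribution in the summary is consistent with ten equal-width bins spanning the shortest (500 bp) to the longest (381692 bp) contig, which gives a bin width of (381692 - 500) / 10 = 38119.2 bp. The sketch below shows one way such a histogram could be computed. It is an illustration, not the app's code: the function name is hypothetical, the example lengths are made up, and the app's handling of values that fall exactly on a bin edge is not known.

```python
def equal_width_bins(lengths, n_bins=10):
    """Return (edges, counts) for n_bins equal-width bins from min to max length."""
    lo, hi = min(lengths), max(lengths)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for length in lengths:
        if width == 0:
            idx = 0  # all contigs have the same length
        else:
            # The longest contig lands exactly on the last edge; clamp it
            # into the final bin instead of creating an extra one.
            idx = min(int((length - lo) / width), n_bins - 1)
        counts[idx] += 1
    return edges, counts


# Hypothetical contig lengths that share the reported minimum and maximum.
edges, counts = equal_width_bins([500, 600, 40000, 381692])
for i, count in enumerate(counts):
    print(f"{count} -- {edges[i]:.1f} to {edges[i + 1]:.1f} bp")
```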
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • spades_output.zip - Output file(s) generated by SPAdes-3.15.3
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 4m 42s.
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM

MERGE READS

Allows users to create a ReadsSet object.
This app completed without errors in 15s.
Objects
Created Object Name Type Description
C9_R1_R2.readset ReadsSet KButil_Build_ReadsSet
Summary
reads libs in output set C9_R1_R2.readset: 2
Merge multiple Reads Libraries and/or ReadsSets into one Reads Library object.
This app completed without errors in 16m 32s.
Objects
Created Object Name Type Description
C9_S41_R1_R2.fastq.gz PairedEndLibrary R1 and R2 fastq of C9_Merge reads
Summary
NUM READS LIBRARIES COMBINED INTO ONE READS LIBRARY: 2
Output from Merge Reads Libraries - v1.0.1
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/112786

ASSESS READS QUALITY

A quality control application for high throughput sequence data.
This app completed without errors in 4m 22s.
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • C9_S41_R2_001.fastq.gz_107303_5_1.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • C9_S41_R2_001.fastq.gz_107303_5_1.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
A quality control application for high throughput sequence data.
This app completed without errors in 4m 22s.
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • C9_S41_R1_001.fastq.gz_107303_13_1.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • C9_S41_R1_001.fastq.gz_107303_13_1.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
A quality control application for high throughput sequence data.
This app completed without errors in 6m 43s.
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • C9_S41_R1_R2.fastq.gz_107303_17_1.fwd_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report
  • C9_S41_R1_R2.fastq.gz_107303_17_1.rev_fastqc.zip - Zip file generated by fastqc that contains original images seen in the report

CLASSIFY MICROBES

Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB) ver R06-RS202
This app completed without errors in 1m 34s.
Links
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB) ver R06-RS202
This app completed without errors in 2h 5m 45s.
Links
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB) ver R06-RS202
This app completed without errors in 42m 58s.
Links
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB) ver R06-RS202
This app completed without errors in 1h 32m 36s.
Links
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB) ver R06-RS202
This app completed without errors in 1h 23m 52s.
Links

BIN CONTIGS

Group assembled metagenomic contigs into lineages (Bins) using depth-of-coverage and nucleotide composition
This app completed without errors in 21m 40s.
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • concoct_result.zip - Files generated by CONCOCT App
v1 - KBaseMetagenomes.BinnedContigs-1.0
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/112786
Group assembled metagenomic contigs into lineages (Bins) using depth-of-coverage and nucleotide composition
This app completed without errors in 25m 53s.
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • concoct_result.zip - Files generated by CONCOCT App
Group assembled metagenomic contigs into lineages (Bins) using depth-of-coverage and nucleotide composition
This app completed without errors in 32m 3s.
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • concoct_result.zip - Files generated by CONCOCT App
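Binning on nucleotide composition generally means summarizing each contig as a vector of short k-mer frequencies, so that contigs from the same genome end up with similar vectors. The sketch below computes such a profile for one sequence. It is a simplified illustration under stated assumptions, not the CONCOCT implementation: the function name and the default k = 4 are chosen for the example, reverse-complement k-mers are not merged, and windows containing ambiguous bases such as N are ignored.

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence, k=4):
    """Return the relative frequency of every A/C/G/T k-mer in a sequence."""
    sequence = sequence.upper()
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Only windows made purely of A, C, G, T contribute to the total.
    total = sum(counts[kmer] for kmer in all_kmers)
    if total == 0:
        return {kmer: 0.0 for kmer in all_kmers}
    return {kmer: counts[kmer] / total for kmer in all_kmers}


# Toy sequence, with k=2 to keep the profile small (16 entries).
profile = kmer_profile("ACGTACGT", k=2)
```

A binner would combine vectors like this with per-sample depth of coverage before clustering; that step is omitted here.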
Group assembled metagenomic contigs into lineages (Bins) using depth-of-coverage, nucleotide composition, and marker genes.
This app completed without errors in 21m 19s.
Objects
Created Object Name Type Description
Bins.MaxBin2 BinnedContigs BinnedContigs from MaxBin2
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • maxbin_result.zip - File(s) generated by MaxBin2 App
Output from Bin Contigs using MaxBin2 - v2.2.4
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/112786
Extract a bin as an Assembly from a BinnedContig dataset
This app completed without errors in 1m 48s.
Objects
Created Object Name Type Description
Bin.004.fastaBin004_assembly Assembly Assembly object of extracted contigs
Summary
Job Finished Generated Assembly Reference: 107303/27/1
v1 - KBaseGenomeAnnotations.Assembly-5.0
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/112786
Run QUAST (QUality ASsessment Tool) on a set of Assemblies to assess their quality.
This app completed without errors in 4m 9s.
Summary
All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

Assembly                        C9_R1_R2_hybridSPAdes.Assembly  Bin.004.fastaBin004_assembly  Bin.003.fastaBin003_assembly
# contigs (>= 0 bp)             2040                            4                             1
# contigs (>= 1000 bp)          692                             4                             1
# contigs (>= 10000 bp)         164                             4                             1
# contigs (>= 100000 bp)        13                              1                             1
# contigs (>= 1000000 bp)       0                               1                             1
Total length (>= 0 bp)          9829242                         2958309                       5389585
Total length (>= 1000 bp)       8892093                         2958309                       5389585
Total length (>= 10000 bp)      7884458                         2958309                       5389585
Total length (>= 100000 bp)     2182639                         2835982                       5389585
Total length (>= 1000000 bp)    0                               2835982                       5389585
# contigs                       2040                            4                             1
Largest contig                  381692                          2835982                       5389585
Total length                    9829242                         2958309                       5389585
GC (%)                          57.82                           52.93                         56.64
N50                             51564                           2835982                       5389585
N75                             17265                           2835982                       5389585
L50                             51                              1                             1
L75                             127                             1                             1
# N's per 100 kbp               51.11                           0.00                          0.00
# predicted genes (unique)      8678                            2163                          4349
# predicted genes (>= 0 bp)     8036 + 820 part                 2166 + 0 part                 4397 + 1 part
# predicted genes (>= 300 bp)   7095 + 721 part                 2013 + 0 part                 3855 + 1 part
# predicted genes (>= 1500 bp)  1197 + 9 part                   407 + 0 part                  784 + 0 part
# predicted genes (>= 3000 bp)  192 + 0 part                    66 + 0 part                   147 + 0 part
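For readers unfamiliar with the N50, N75, L50 and L75 rows: Nx is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches x% of the assembly, and Lx is the number of contigs needed to get there. The sketch below is a generic illustration of that definition, not QUAST's code. The function name is hypothetical, and because the report gives only the largest contig (2835982 bp) and total length (2958309 bp) for Bin.004, the three smaller contig lengths are invented values that merely sum to the reported total.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for the given fraction of total assembly length."""
    ordered = sorted(lengths, reverse=True)
    target = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    raise ValueError("lengths must not be empty")


# Largest contig is from the report; the other three lengths are hypothetical
# and chosen only so that the total matches 2958309 bp.
bin004_lengths = [2835982, 60000, 40000, 22327]
n50, l50 = nx_lx(bin004_lengths, 0.5)
n75, l75 = nx_lx(bin004_lengths, 0.75)
```

Because a single contig holds about 96% of the bin, both N50 and N75 equal the largest contig and both L50 and L75 equal 1, in line with the Bin.004 column above.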
Links
Extract a bin as an Assembly from a BinnedContig dataset
This app completed without errors in 1m 5s.
Objects
Created Object Name Type Description
Bin.003.fastaBin003_assembly Assembly Assembly object of extracted contigs
Summary
Job Finished Generated Assembly Reference: 107303/26/1

FILTER BINS BY QUALITY

Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes. Creates a new BinnedContigs object with High Quality bins that pass user-defined thresholds for Completeness and Contamination.
This app completed without errors in 11m 3s.
Objects
Created Object Name Type Description
CheckM_HQ_bins.BinnedContigs BinnedContigs HQ BinnedContigs CheckM_HQ_bins.BinnedContigs
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes. Creates a new BinnedContigs object with High Quality bins that pass user-defined thresholds for Completeness and Contamination.
This app completed without errors in 10m 7s.
Objects
Created Object Name Type Description
C9_R1_R2_CheckM_HQ_bins.BinnedContigs BinnedContigs HQ BinnedContigs C9_R1_R2_CheckM_HQ_bins.BinnedContigs
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes. Creates a new BinnedContigs object with High Quality bins that pass user-defined thresholds for Completeness and Contamination.
This app completed without errors in 10m 33s.
Objects
Created Object Name Type Description
CheckM_HQ_bins.MaxBin2.BinnedContigs BinnedContigs HQ BinnedContigs CheckM_HQ_bins.MaxBin2.BinnedContigs
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes. Creates a new BinnedContigs object with High Quality bins that pass user-defined thresholds for Completeness and Contamination.
This app completed without errors in 15m 32s.
Objects
Created Object Name Type Description
C9.HybridSpades.CheckM_HQ_bins.BinnedContigs BinnedContigs HQ BinnedContigs C9.HybridSpades.CheckM_HQ_bins.BinnedContigs
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
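The filtering step described in this section keeps only bins whose completeness and contamination pass user-defined thresholds. The threshold values used in this Narrative are not shown in the export, so the sketch below uses assumed defaults (at least 90% completeness, at most 5% contamination) and made-up bin values purely to illustrate the logic. The class and function names are hypothetical and are not part of CheckM or KBase.

```python
from dataclasses import dataclass


@dataclass
class BinQuality:
    name: str
    completeness: float   # percent, as estimated by a tool such as CheckM
    contamination: float  # percent


def filter_hq_bins(bins, min_completeness=90.0, max_contamination=5.0):
    """Keep bins that meet both thresholds (threshold values are assumptions)."""
    return [
        b for b in bins
        if b.completeness >= min_completeness
        and b.contamination <= max_contamination
    ]


# Hypothetical quality estimates; not taken from this Narrative's results.
example_bins = [
    BinQuality("bin_a", completeness=98.5, contamination=1.2),
    BinQuality("bin_b", completeness=72.0, contamination=0.5),
    BinQuality("bin_c", completeness=95.0, contamination=12.3),
]
hq_names = [b.name for b in filter_hq_bins(example_bins)]
```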

ANNOTATE GENOME

Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.
This app completed without errors in 3m 17s.
Objects
Created Object Name Type Description
Bin004.Prokka_annotate.genome Genome Annotated Genome
Summary
Annotated Genome saved to: joval:narrative_1642486466759/Bin004.Prokka_annotate.genome
Number of genes predicted: 2901
Number of protein coding genes: 2854
Number of genes with non-hypothetical function: 1446
Number of genes with EC-number: 680
Number of genes with Seed Subsystem Ontology: 0
Average protein length: 301 aa.
Output from Annotate Assembly and Re-annotate Genomes with Prokka - v1.14.5
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/112786
View distributions of contig characteristics for different assemblies.
This app completed without errors in 4m 12s.
Summary
ASSEMBLY STATS for Bin.004.fastaBin004_assembly
Len longest contig: 2835982 bp
N50 (L50): 2835982 (1)
N75 (L75): 2835982 (1)
N90 (L90): 2835982 (1)
Num contigs >= 1000000 bp: 1
Num contigs >= 100000 bp: 1
Num contigs >= 10000 bp: 4
Num contigs >= 1000 bp: 4
Num contigs >= 500 bp: 4
Num contigs >= 1 bp: 4
Len contigs >= 1000000 bp: 2835982 bp
Len contigs >= 100000 bp: 2835982 bp
Len contigs >= 10000 bp: 2958309 bp
Len contigs >= 1000 bp: 2958309 bp
Len contigs >= 500 bp: 2958309 bp
Len contigs >= 1 bp: 2958309 bp
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • key_plot.png
  • key_plot.pdf
  • cumulative_len_plot.png
  • cumulative_len_plot.pdf
  • sorted_contig_lengths.png
  • sorted_contig_lengths.pdf
  • histogram_figures.zip
Add one or more Genomes to a KBase SpeciesTree.
This app completed without errors in 3m 32s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • Bin004.Cyabacterium.tree.newick
  • Bin004.Cyabacterium.tree-labels.newick
  • Bin004.Cyabacterium.tree.png
  • Bin004.Cyabacterium.tree.pdf
Annotate your genome(s) with DRAM. Annotations will then be distilled to create an interactive functional summary per genome.
This app completed without errors in 14h 33m 31s.
Summary
Here are the results from your DRAM run.
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • annotations.tsv - DRAM annotations in a tab-separated table format
  • genes.faa - Genes as amino acids predicted by DRAM with brief annotations
  • product.tsv - DRAM product in tabular format
  • metabolism_summary.xlsx - DRAM metabolism summary tables
  • genome_stats.tsv - DRAM genome statistics table
Allows users to compute a pangenome from a set of individual genomes.
This app completed without errors in 33m 52s.
Objects
Created Object Name Type Description
Bin004Refs.pangenome Pangenome Pangenome
Summary
Pangenome saved to joval:narrative_1642486466759/Bin004Refs.pangenome
Compare isofunctional and homologous gene families for all genomes in a Pangenome.
This app completed without errors in 5m 11s.
Objects
Created Object Name Type Description
Bin004Refs.genomecomparison GenomeComparison GenomeComparison
Summary
GenomeComparison saved to joval:narrative_1642486466759/Bin004Refs.genomecomparison
v1 - KBaseGenomes.GenomeComparison-2.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/112786
Output from Compare Genomes from Pangenome
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/112786
Generate a map and annotations of circular genomes using CGView.
This app completed without errors in 3m 14s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/112786
  • KBase_derived_Bin004.Prokka_annotate.genome.png
  • KBase_derived_Bin004.Prokka_annotate.genome.jpg
  • KBase_derived_Bin004.Prokka_annotate.genome.svg

Apps

  1. Annotate and Distill Genomes with DRAM
    • DRAM source code
    • DRAM documentation
    • DRAM publication
  2. Annotate Assembly and Re-annotate Genomes with Prokka - v1.14.5
    • Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30: 2068-2069. doi:10.1093/bioinformatics/btu153
  3. Assemble Reads with HybridSPAdes - v3.15.3
    • [1] Nurk S, Bankevich A, Antipov D, Gurevich A, Korobeynikov A, Lapidus A, et al. Assembling Genomes and Mini-metagenomes from Highly Chimeric Reads. In: Deng M, Jiang R, Sun F, Zhang X, editors. Research in Computational Molecular Biology. Springer Berlin Heidelberg; 2013. pp. 158-170. doi: 10.1007/978-3-642-37195-0_13
    • [2] Antipov D, Korobeynikov A, McLean J, Pevzner P. HYBRIDSPADES: an algorithm for hybrid assembly of short and long reads. Bioinformatics. 2016;32: 1009-1015. doi: 10.1093/bioinformatics/btv688
    • [3] Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. Using SPAdes De Novo Assembler. Curr Protoc Bioinformatics. 2020 Jun;70(1):e102. doi: 10.1002/cpbi.102.
  4. Assess Genome Quality with CheckM - v1.0.18
    • Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043-1055. doi:10.1101/gr.186072.114
    • CheckM source:
    • Additional info:
  5. Assess Quality of Assemblies with QUAST - v4.4
    • [1] Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013;29: 1072-1075. doi:10.1093/bioinformatics/btt086
    • [2] Mikheenko A, Valin G, Prjibelski A, Saveliev V, Gurevich A. Icarus: visualizer for de novo assembly evaluation. Bioinformatics. 2016;32: 3321-3323. doi:10.1093/bioinformatics/btw379
  6. Assess Read Quality with FastQC - v0.11.9
    • FastQC source: Bioinformatics Group at the Babraham Institute, UK.
  7. Bin Contigs using CONCOCT - v1.1
    • Alneberg J, Bjarnason BS, de Bruijn I, Schirmer M, Quick J, Ijaz UZ, Lahti L, Loman NJ, Andersson AF, Quince C. Binning metagenomic contigs by coverage and composition. Nature Methods. 2014;11: 1144-1146. doi:10.1038/nmeth.3103
    • CONCOCT source:
  8. Bin Contigs using MaxBin2 - v2.2.4
    • Wu Y-W, Simmons BA, Singer SW. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics. 2016;32: 605-607. doi:10.1093/bioinformatics/btv638
    • Wu Y-W, Tang Y-H, Tringe SG, Simmons BA, Singer SW. MaxBin: an automated binning method to recover individual genomes from metagenomes using an expectation-maximization algorithm. Microbiome. 2014;2: 26. doi:10.1186/2049-2618-2-26
    • Maxbin2 source:
    • Maxbin source:
  9. Build ReadsSet - v1.7.6
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  10. Circular Genome Visualization Tool
    no citations
  11. Classify Microbes with GTDB-Tk - v1.7.0
    • Pierre-Alain Chaumeil, Aaron J Mussig, Philip Hugenholtz, Donovan H Parks, GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database, Bioinformatics, Volume 36, Issue 6, 15 March 2020, Pages 1925-1927. DOI: https://doi.org/10.1093/bioinformatics/btz848
    • Parks, D., Chuvochina, M., Waite, D. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36, 996-1004 (2018). DOI: https://doi.org/10.1038/nbt.4229
    • Parks DH, Chuvochina M, Chaumeil PA, Rinke C, Mussig AJ, Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol. 2020;10.1038/s41587-020-0501-8. DOI:10.1038/s41587-020-0501-8
    • Rinke C, Chuvochina M, Mussig AJ, Chaumeil PA, Davín AA, Waite DW, Whitman WB, Parks DH, and Hugenholtz P. A standardized archaeal taxonomy for the Genome Taxonomy Database. Nat Microbiol. 2021 Jul;6(7):946-959. DOI:10.1038/s41564-021-00918-8
    • Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics. 2010;11:538. Published 2010 Oct 30. doi:10.1186/1471-2105-11-538
    • Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9(1):5114. Published 2018 Nov 30. DOI:10.1038/s41467-018-07641-9
    • Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119. Published 2010 Mar 8. DOI:10.1186/1471-2105-11-119
    • Price MN, Dehal PS, Arkin AP. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5(3):e9490. Published 2010 Mar 10. DOI:10.1371/journal.pone.0009490 link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2835736/
    • Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol. 2011;7(10):e1002195. DOI:10.1371/journal.pcbi.1002195
  12. Compare Assembled Contig Distributions - v1.1.2
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  13. Compare Genomes from Pangenome
    • Medini D, Donati C, Tettelin H, Masignani V, Rappuoli R. The microbial pan-genome. Curr Opin Genet Dev. 2005;15: 589-594. doi:10.1016/j.gde.2005.09.006
    • Tettelin H, Masignani V, Cieslewicz MJ, Donati C, Medini D, Ward NL, et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial pan-genome. Proc Natl Acad Sci U S A. 2005;102: 13950-13955. doi:10.1073/pnas.0506758102
    • Rasko DA, Rosovitz MJ, Myers GSA, Mongodin EF, Fricke WF, Gajer P, et al. The Pangenome Structure of Escherichia coli: Comparative Genomic Analysis of E. coli Commensal and Pathogenic Isolates. J Bacteriol. 2008;190: 6881-6893. doi:10.1128/JB.00619-08
  14. Compute Pangenome
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  15. Extract Bins as Assemblies from BinnedContigs - v1.0.2
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  16. Filter Bins by Quality with CheckM - v1.0.18
    • Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043-1055. doi:10.1101/gr.186072.114
    • CheckM source:
    • Additional info:
  17. Insert Genome Into SpeciesTree - v2.2.0
    • Price MN, Dehal PS, Arkin AP. FastTree 2 - Approximately Maximum-Likelihood Trees for Large Alignments. PLoS One. 2010;5. doi:10.1371/journal.pone.0009490
  18. Merge Reads Libraries - v1.0.1
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  19. Trim Reads with Trimmomatic - v0.36
    • Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30: 2114-2120. doi:10.1093/bioinformatics/btu170