Generated June 23, 2021

Inhibition of a nutritional endosymbiont by glyphosate abolishes mutualistic benefit on cuticle synthesis in Oryzaephilus surinamensis

Julian Simon Thilo Kiefer*, Suvdanselengee Batsukh, Eugen Bauer, Bin Hirota, Benjamin Weiss, Jürgen C. Wierz, Takema Fukatsu, Martin Kaltenpoth & Tobias Engl

*Author of KBase Narrative

Abstract

Glyphosate is widely used as a herbicide, but recent studies begin to reveal its detrimental side effects on animals by targeting the shikimate pathway of associated gut microorganisms. However, its impact on nutritional endosymbionts in insects remains poorly understood. Here, we sequenced the tiny, shikimate pathway encoding symbiont genome of the sawtoothed grain beetle Oryzaephilus surinamensis. Decreased titers of the aromatic amino acid tyrosine in symbiont-depleted beetles underscore the symbionts’ ability to synthesize prephenate as the precursor for host tyrosine synthesis and its importance for cuticle sclerotization and melanization. Glyphosate exposure inhibited symbiont establishment during host development and abolished the mutualistic benefit on cuticle synthesis in adults, which could be partially rescued by dietary tyrosine supplementation. Furthermore, phylogenetic analyses indicate that the shikimate pathways of many nutritional endosymbionts likewise contain a glyphosate sensitive 5-enolpyruvylshikimate-3-phosphate synthase. These findings highlight the importance of symbiont-mediated tyrosine supplementation for cuticle biosynthesis in insects, but also paint an alarming scenario regarding the use of glyphosate in light of recent declines in insect populations.

Cite the paper: Kiefer, J.S.T., Batsukh, S., Bauer, E. et al. Inhibition of a nutritional endosymbiont by glyphosate abolishes mutualistic benefit on cuticle synthesis in Oryzaephilus surinamensis. Commun Biol 4, 554 (2021). https://doi.org/10.1038/s42003-021-02057-6

Narrative Table of Contents

Note regarding ordering: App cells in the Narrative are arranged by type, so their arrangement does not strictly follow the order in which the cells were run or the analytical workflow carried out. Please see the methods section of the paper (part of which is included in the "Introduction and Experimental Methods" cell below) for a written explanation of the workflow, and view the Data Panel objects sorted by age to see how the objects developed as they were added.

  1. Introduction and Experimental Methods
  2. Importing Files
  3. Assess Genome Quality with CheckM
  4. Annotate Metagenomes, Genomes, and Domains
  5. Build GenomeSets
  6. Insert Genomes and GenomeSets into Trees
  7. Build Pangenomes with OrthoMCL
  8. View Function Profiles
  9. References

Introduction and Experimental Methods

Insect cultures

The initial Oryzaephilus surinamensis culture (strain JKI) was obtained from the Julius-Kühn-Institute/Federal Research Center for Cultivated Plants (Berlin, Germany) in 2014 and has been kept in culture since then. Continuous symbiotic and aposymbiotic (see below) O. surinamensis cultures were maintained in 1.8-L plastic containers filled with 50 g oat flakes at 28 °C, 60% relative humidity, and a 16 h/8 h day/night cycle. Another O. surinamensis population (strain OsNFRI) was obtained from the National Food Research Institute (Tsukuba, Japan) and used for genome sequencing.

Elimination of O. surinamensis symbionts

An O. surinamensis sub-population was treated for 12 weeks with tetracycline (150 mg/5 g oat flakes) to eliminate its symbionts and then kept for several generations on a normal diet to exclude direct effects of tetracycline on host physiology [1]. Before the following experiments, the aposymbiotic status of this beetle sub-population was confirmed. To this end, 10 adult female beetles were each placed in a separate jar with oat flakes to lay eggs, as were symbiotic beetles in parallel populations. After 4 weeks, the adult generation was removed before their offspring finished metamorphosis, DNA was extracted from these females, and the symbiont titer was analyzed by quantitative PCR (see below; [1]).

Symbiont genome sequencing, assembly, and annotation

Total DNA was isolated from 20 pooled adult abdomina (without wings) of O. surinamensis JKI using the Epicentre MasterPure™ Complete DNA and RNA Purification Kit (Illumina Inc., Madison, WI, USA) including RNase digestion. Short-read library preparation and sequencing were performed at the Max-Planck-Genome-center Cologne, Germany (SRR12881563–SRR12881566) on a HiSeq2500 Sequencing System (Illumina Inc., Madison, WI, USA). Two further libraries were created from O. surinamensis strain OsNFRI. For the first library, the DNA was extracted with the QIAamp DNA Mini Kit (Qiagen, Germany) from 210 bacteriomes dissected from 60 adults. The library was prepared using the Nextera XT DNA Library Preparation Kit (Illumina Inc., Madison, WI, USA) and sequenced on a MiSeq (Illumina Inc., Madison, WI, USA) of AIST (Japan). For the second library, the DNA was extracted with the QIAamp DNA Micro Kit (Qiagen, Germany) from 24 bacteriomes dissected from six adults (each individual beetle contains four separate bacteriomes). The library was prepared using the Nextera DNA Library Preparation Kit (Illumina Inc., Madison, WI, USA) and sequenced on a NovaSeq 6000 (Illumina Inc., Madison, WI, USA) of Novagen (China). Adaptor and quality trimming was performed with Trimmomatic [2]. In addition, we used two publicly available metagenome libraries of O. surinamensis (SRR5279855 and SRR6426882).

Long-read sequencing (SRR12881567–SRR12881568) was performed on a MinION Mk1B Sequencing System (Oxford Nanopore Technologies (ONT), Oxford, UK). Upon receipt of flowcells, and again immediately prior to sequencing, the number of pores on flowcells was measured using the MinKNOW software (v18.12.9 and 19.05.0, ONT, Oxford, UK). Flowcells were replaced into their packaging, sealed with parafilm and tape, and stored at 4 °C until use. Library preparation was performed with the Ligation Sequencing Kit (SQK-LSK109, ONT, Oxford, UK) and completed libraries were loaded on a flowcell (FLO-MIN106D, ONT, Oxford, UK) following the manufacturer’s instructions.

Quality-controlled long reads were mapped using a custom-made Kraken2 database containing the publicly available genomes of Bacteroidetes bacteria [3,4] to filter beetle-associated sequences, using the supercomputer Mogon of the Johannes Gutenberg-University (Mainz, Germany). Hybrid assembly of MinION and Illumina reads was performed using SPAdes (v3.13.0) with the default settings [5]. This resulted in ~70,000 contigs that were then binned using BusyBee Web [6], screened for GC content and taxonomic identity to Bacteroidetes bacteria, and additionally checked manually for tRNAs and ribosomal proteins of Bacteroidetes bacteria. In total, 13 contigs were extracted, which were then automatically annotated with RAST (see app citations) using the app Annotate Microbial Assembly (RAST_SDK v0.1.1) on KBase (see app citations). The annotated contigs were plotted using CIRCOS [109] (v0.69-6) to visualize gene locations, GC content and coverage. Additionally, the completeness of the obtained genome was assessed with the app Assess Genome Quality with CheckM (v1.0.18) in KBase (see app citations).
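The GC-content screen mentioned above can be illustrated with a short Python sketch. This is not the authors' pipeline (binning was done with BusyBee Web); it only shows how per-contig GC content can be computed from FASTA text and used as a filter. The function names, the GC range and the minimum contig length are all hypothetical, since the text does not state the cutoffs that were applied.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (contig_id, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Use the first whitespace-delimited token as the contig id.
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


def contigs_in_gc_range(
    lines: Iterable[str],
    gc_min: float,
    gc_max: float,
    min_len: int = 1000,  # hypothetical length cutoff
) -> Dict[str, float]:
    """Return {contig_id: gc} for contigs whose GC fraction lies in [gc_min, gc_max]."""
    kept = {}
    for name, seq in read_fasta(lines):
        if len(seq) < min_len:
            continue
        gc = gc_fraction(seq)
        if gc_min <= gc <= gc_max:
            kept[name] = gc
    return kept


# Toy input; real use would pass an open FASTA file handle instead.
demo = [">contig_a", "ATATATATAT", "ATATGCATAT", ">contig_b", "GCGCGCGCGC"]
print(contigs_in_gc_range(demo, gc_min=0.0, gc_max=0.3, min_len=10))
```

In practice such a filter would only narrow the candidate set; as described above, taxonomic identity and the presence of tRNAs and ribosomal proteins were checked as well.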

Phylogenetic analyses

A phylogenetic tree for placement of the intracellular symbiont of O. surinamensis within the Bacteroidetes was reconstructed using the KBase app Insert Set of Genomes Into Species Tree v2.1.10 (SpeciesTreeBuilder v0.0.12) based on the FastTree2 algorithm (see app citations), including 49 highly conserved Clusters of Orthologous Groups (COG) genes [111].

To predict the glyphosate sensitivity of the O. surinamensis symbiont, a phylogenetic tree of its aroA gene (which codes for the EPSPS enzyme in the shikimate pathway) was reconstructed according to Motta et al. [7]. Manually selected aroA sequences from plants, gut bacteria and several intracellular insect symbionts were obtained from UniProt (UniProt Consortium 2019), translated and aligned using MUSCLE [112] (v3.8.425) implemented in Geneious Prime 2019 (v2019.1.3, https://www.geneious.com). Phylogenetic reconstruction was performed with FastTree [110] (v2.1.12) and PhyML [113] (v2.2.4) implemented in Geneious Prime 2019 (v2019.1.3, https://www.geneious.com) using the Jones–Taylor–Thornton model with 20 rate categories and an optimized Gamma20 likelihood (FastTree) and 1000 bootstrap replicates (PhyML). The obtained trees were visualized using FigTree (v1.4.4, http://tree.bio.ed.ac.uk/software/figtree/).

Comparison with other Bacteroidetes bacteria

Previously published Bacteroidetes genomes were re-annotated with RAST (see app citations) in KBase to compare the bacteria and to estimate the genome-wide nucleotide sequence divergence level. To this end, we identified single-copy orthologs in each genome pair using OrthoMCL (v2.0; see app citations) in KBase. KEGG categories were then assigned to each gene’s amino acid sequence via GhostKOALA [8] (v2.2). Heatmaps were visualized using the ‘ComplexHeatmap’ package in RStudio (v1.1.463 with R v3.6.3). CIRCOS [9] (v0.69-6) was used to link orthologous genes.
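As an illustration of the single-copy ortholog step, the following Python sketch selects, for one genome pair, the ortholog clusters that contain exactly one gene from each of the two genomes. It does not call OrthoMCL or KBase; the cluster layout (cluster id mapped to a list of genome/gene pairs) and all identifiers in the example are hypothetical stand-ins for whatever an ortholog clustering exports.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical layout: cluster id -> list of (genome_id, gene_id) members.
Clusters = Dict[str, List[Tuple[str, str]]]


def single_copy_pairs(
    clusters: Clusters, genome_a: str, genome_b: str
) -> Dict[str, Tuple[str, str]]:
    """Return {cluster_id: (gene_in_a, gene_in_b)} for clusters that hold
    exactly one gene from genome_a and exactly one gene from genome_b."""
    pairs = {}
    for cluster_id, members in clusters.items():
        per_genome = Counter(genome for genome, _ in members)
        if per_genome[genome_a] == 1 and per_genome[genome_b] == 1:
            gene_a = next(g for gen, g in members if gen == genome_a)
            gene_b = next(g for gen, g in members if gen == genome_b)
            pairs[cluster_id] = (gene_a, gene_b)
    return pairs


# Made-up example clusters.
example = {
    "OG1": [("symbiont", "s_001"), ("Sulcia", "m_010")],
    "OG2": [("symbiont", "s_002"), ("symbiont", "s_003"), ("Sulcia", "m_011")],
    "OG3": [("symbiont", "s_004"), ("Blattabacterium", "b_020")],
}
print(single_copy_pairs(example, "symbiont", "Sulcia"))
```

Only OG1 qualifies in this toy input: OG2 has a duplicated gene in one genome and OG3 lacks a member from the second genome. Members from other genomes in a cluster are ignored, which matches a pairwise comparison rather than a core-genome definition.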

Genomes of Bacteroidetes bacteria and other bacteria described as cuticle-supplementing symbionts were compared in KBase in more detail. To this end, all genomes were re-annotated with RAST, and all annotated genes were classified according to the SEED Subsystem using the app View Function Profile for Genomes (v1.4.0, SEED Functional Group: Amino Acids and Derivatives; see app citations). The resulting raw counts of annotated genes were visualized as a heatmap using the function ‘heatmap.2’ in the ‘gplots’ package in RStudio (v1.1.463 with R v3.6.3).
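The raw count table behind such a heatmap can be sketched in Python as follows. This is an illustration only: the actual counts came from the KBase app, and the input format (one genome/category pair per annotated gene), the genome names and the category names below are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def count_matrix(
    annotations: Iterable[Tuple[str, str]]
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Tally annotated genes per (genome, category).

    Returns (genomes, categories, matrix), where matrix[i][j] is the number
    of genes in genomes[j] assigned to categories[i]."""
    counts = Counter(annotations)  # (genome, category) -> gene count
    genomes = sorted({genome for genome, _ in counts})
    categories = sorted({category for _, category in counts})
    # Counter returns 0 for combinations that never occurred.
    matrix = [[counts[(g, c)] for g in genomes] for c in categories]
    return genomes, categories, matrix


# Made-up records: one (genome, category) pair per annotated gene.
records = [
    ("genome_A", "Aromatic amino acids"),
    ("genome_A", "Aromatic amino acids"),
    ("genome_A", "Lysine biosynthesis"),
    ("genome_B", "Aromatic amino acids"),
]
genomes, categories, matrix = count_matrix(records)
print("category\t" + "\t".join(genomes))
for category, row in zip(categories, matrix):
    print(category + "\t" + "\t".join(str(n) for n in row))
```

A tab-separated table of this shape (categories as rows, genomes as columns) is the kind of input a heatmap function expects once it has been read in as a numeric matrix.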

Importing Files

Import a GenBank file from your staging area into your Narrative as a Genome data object
This app completed without errors in 1m 42s.
Objects
Created Object Name Type Description
Blattabacterium_clevelandi.gbk_genome Genome Imported Genome
Links
Output from Import GenBank File as Genome from Staging Area
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Import a GenBank file from your staging area into your Narrative as a Genome data object
This app completed without errors in 1m 41s.
Objects
Created Object Name Type Description
Sulcia_muelleri.gbk_genome Genome Imported Genome
Links
Output from Import GenBank File as Genome from Staging Area
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Import a GenBank file from your staging area into your Narrative as a Genome data object
This app completed without errors in 3m 22s.
Objects
Created Object Name Type Description
Sulcia_muelleri_PSPU.gbk_genome Genome Imported Genome
Links
Output from Import GenBank File as Genome from Staging Area
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Import a GenBank file from your staging area into your Narrative as a Genome data object
This app completed without errors in 1m 40s.
Objects
Created Object Name Type Description
Uzinura_diaspidicola.gbk_genome Genome Imported Genome
Links
Output from Import GenBank File as Genome from Staging Area
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Import a GenBank file from your staging area into your Narrative as a Genome data object
This app completed without errors in 3m 22s.
Objects
Created Object Name Type Description
Walczuchella_monophlebidarum.gbk_genome Genome Imported Genome
Links
Output from Import GenBank File as Genome from Staging Area
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Import a FASTA file from your staging area into your Narrative as an Assembly data object
This app completed without errors in 59s.
Objects
Created Object Name Type Description
Nardonella_EPO_assembly Assembly Imported Assembly
Import a FASTA file from your staging area into your Narrative as an Assembly data object
This app completed without errors in 59s.
Objects
Created Object Name Type Description
13contigs_concat.fasta_assembly Assembly Imported Assembly
Import a FASTA file from your staging area into your Narrative as an Assembly data object
This app completed without errors in 60s.
Objects
Created Object Name Type Description
13contigs_single.fasta_assembly Assembly Imported Assembly

Assess Genome Quality with CheckM

Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 16m 28s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/58799
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM

Annotate Metagenomes, Genomes, and Domains

Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.
This app completed without errors in 2m 0s.
Objects
Created Object Name Type Description
13contigs_single.fasta_Prokka Genome Annotated Genome
Summary
Annotated Genome saved to: jkiefer:narrative_1585651424215/13contigs_single.fasta_Prokka
Number of genes predicted: 324
Number of protein coding genes: 293
Number of genes with non-hypothetical function: 270
Number of genes with EC-number: 125
Number of genes with Seed Subsystem Ontology: 114
Average protein length: 286 aa.
Output from Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.
This app completed without errors in 1m 47s.
Objects
Created Object Name Type Description
Candidatus_Walczuchella_monophlebidarum.gbff_genome_assembly.PROKKA Genome Annotated Genome
Summary
Annotated Genome saved to: jkiefer:narrative_1585651424215/Candidatus_Walczuchella_monophlebidarum.gbff_genome_assembly.PROKKA
Number of genes predicted: 334
Number of protein coding genes: 296
Number of genes with non-hypothetical function: 292
Number of genes with EC-number: 144
Number of genes with Seed Subsystem Ontology: 132
Average protein length: 298 aa.
Output from Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.
This app completed without errors in 1m 36s.
Objects
Created Object Name Type Description
Candidatus_Uzinura_diaspidicola_str._ASNER.gbff_genome_assembly.PROKKA Genome Annotated Genome
Summary
Annotated Genome saved to: jkiefer:narrative_1585651424215/Candidatus_Uzinura_diaspidicola_str._ASNER.gbff_genome_assembly.PROKKA
Number of genes predicted: 282
Number of protein coding genes: 247
Number of genes with non-hypothetical function: 250
Number of genes with EC-number: 124
Number of genes with Seed Subsystem Ontology: 115
Average protein length: 303 aa.
Output from Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.
This app completed without errors in 1m 46s.
Objects
Created Object Name Type Description
Candidatus_Sulcia_muelleri_PSPU.gbff_genome_assembly.PROKKA Genome Annotated Genome
Summary
Annotated Genome saved to: jkiefer:narrative_1585651424215/Candidatus_Sulcia_muelleri_PSPU.gbff_genome_assembly.PROKKA
Number of genes predicted: 294
Number of protein coding genes: 260
Number of genes with non-hypothetical function: 268
Number of genes with EC-number: 133
Number of genes with Seed Subsystem Ontology: 119
Average protein length: 337 aa.
Output from Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.
This app completed without errors in 1m 22s.
Objects
Created Object Name Type Description
Candidatus_Sulcia_muelleri.gbff_genome_assembly.PROKKA Genome Annotated Genome
Summary
Annotated Genome saved to: jkiefer:narrative_1585651424215/Candidatus_Sulcia_muelleri.gbff_genome_assembly.PROKKA
Number of genes predicted: 182
Number of protein coding genes: 149
Number of genes with non-hypothetical function: 167
Number of genes with EC-number: 58
Number of genes with Seed Subsystem Ontology: 51
Average protein length: 302 aa.
Output from Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Annotate Assembly and Re-annotate Genomes with Prokka annotation pipeline.
This app completed without errors in 1m 46s.
Objects
Created Object Name Type Description
Blattabacterium_clevelandi_strain_CCLhc_.gbff_genome_assembly.PROKKA Genome Annotated Genome
Summary
Annotated Genome saved to: jkiefer:narrative_1585651424215/Blattabacterium_clevelandi_strain_CCLhc_.gbff_genome_assembly.PROKKA
Number of genes predicted: 599
Number of protein coding genes: 563
Number of genes with non-hypothetical function: 513
Number of genes with EC-number: 309
Number of genes with Seed Subsystem Ontology: 282
Average protein length: 336 aa.
Output from Annotate Assembly and Re-annotate Genomes with Prokka(v1.12)
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Annotate a bacterial or archaeal assembly using components from the RAST (Rapid Annotations using Subsystems Technology) toolkit (RASTtk).
This app completed without errors in 1m 18s.
Objects
Created Object Name Type Description
Nardonella_EPO_assebmly_RAST Genome Annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 1 contigs containing 219841 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 245 new features were called, of which 30 are non-coding.
Output genome has the following feature types:
	Coding gene                      215 
	Non-coding rna                    30 
Overall, the genes have 152 distinct functions. 
The genes include 158 genes with a SEED annotation ontology across 142 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Output from Annotate Microbial Assembly
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Annotate a bacterial or archaeal assembly using components from the RAST (Rapid Annotations using Subsystems Technology) toolkit (RASTtk).
This app completed without errors in 1m 20s.
Objects
Created Object Name Type Description
13contigs_single.fasta_RAST Genome Annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 13 contigs containing 307680 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 331 new features were called, of which 32 are non-coding.
Output genome has the following feature types:
	Coding gene                      299 
	Non-coding repeat                  2 
	Non-coding rna                    30 
Overall, the genes have 237 distinct functions. 
The genes include 210 genes with a SEED annotation ontology across 200 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Output from Annotate Microbial Assembly
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Annotate a bacterial or archaeal assembly using components from the RAST (Rapid Annotations using Subsystems Technology) toolkit (RASTtk).
This app completed without errors in 1m 20s.
Objects
Created Object Name Type Description
13contigs_concat.fasta_RAST Genome Annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 1 contigs containing 307740 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 329 new features were called, of which 33 are non-coding.
Output genome has the following feature types:
	Coding gene                      296 
	Non-coding repeat                  2 
	Non-coding rna                    31 
Overall, the genes have 234 distinct functions. 
The genes include 210 genes with a SEED annotation ontology across 198 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Annotate domains in every Genome within a GenomeSet using protein domains from widely used domain libraries.
This app completed without errors in 32m 34s.
Summary
Search Domains output:
Getting DomainModelSet from storage.
Getting Genome from storage.
Running domain search against library 2959/1/7
Running domain search against library 2959/6/6
Running domain search against library 2959/7/6
Running domain search against library 2959/4/6
Running domain search against library 2959/5/7
Search Domains output:
Getting DomainModelSet from storage.
Getting Genome from storage.
Running domain search against library 2959/1/7
Running domain search against library 2959/6/6
Running domain search against library 2959/7/6
Running domain search against library 2959/4/6
Running domain search against library 2959/5/7
Annotate domains in every Genome within a GenomeSet using protein domains from widely used domain libraries.
This app completed without errors in 16m 29s.
Summary
Search Domains output:
Getting DomainModelSet from storage.
Getting Genome from storage.
Running domain search against library 2959/1/7
Running domain search against library 2959/6/6
Running domain search against library 2959/7/6
Running domain search against library 2959/4/6
Running domain search against library 2959/5/7
Output from Annotate Microbial Assembly
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Annotate a bacterial or archaeal assembly using components from the RAST (Rapid Annotations using Subsystems Technology) toolkit (RASTtk).
This app completed without errors in 5m 33s.
Objects
Created Object Name Type Description
Flavobacterium_johnsoniae_assembly_RAST Genome Annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 1 contigs containing 6096872 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 5498 new features were called, of which 138 are non-coding.
Output genome has the following feature types:
	Coding gene                     5360 
	Non-coding repeat                 59 
	Non-coding rna                    79 
Overall, the genes have 2640 distinct functions. 
The genes include 2572 genes with a SEED annotation ontology across 1026 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Output from Annotate Microbial Assembly
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799

Build GenomeSets

Allows users to create a GenomeSet object.
This app completed without errors in 19s.
Objects
Created Object Name Type Description
symbiont_compare_v2 GenomeSet KButil_Build_GenomeSet
Summary
genomes in output set symbiont_compare_v2: 11
Output from Build GenomeSet - v1.0.1
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
v4 - KBaseSearch.GenomeSet-2.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
Allows users to create a GenomeSet object.
This app completed without errors in 8s.
Objects
Created Object Name Type Description
Bacteroidetes_phylo GenomeSet KButil_Build_GenomeSet
Summary
genomes in output set Bacteroidetes_phylo: 7
Output from Build GenomeSet - v1.0.1
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/58799
v1 - KBaseSearch.GenomeSet-2.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/58799

Insert Genomes and GenomeSets into Trees

Add a user-provided GenomeSet to a KBase SpeciesTree.
This app completed without errors in 4m 26s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/58799
  • Bacteroidetes-Phylo-Tree.newick
  • Bacteroidetes-Phylo-Tree-labels.newick
  • Bacteroidetes-Phylo-Tree.png
  • Bacteroidetes-Phylo-Tree.pdf

Build Pangenomes with OrthoMCL

Create a Pangenome object by performing OrthoMCL orthologous groups construction on a set of Genomes.
This app completed without errors in 8m 12s.
Objects
Created Object Name Type Description
symbiont_compare_v2_PAN Pangenome Pangenome object
Summary
Input genomes: 13 Output orthologs: 2767
v1 - KBaseGenomes.Pangenome-4.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/58799

View Function Profiles

Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 59s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 1m 16s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 1m 1s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 58s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 46s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 1m 3s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 58s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 1m 2s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 59s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 60s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 58s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 58s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 57s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 58s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 1m 11s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 59s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 59s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app completed without errors in 59s.
Examine the general functional distribution or specific functional gene families for a GenomeSet.
This app was run in 25 consecutive cells: 21 runs completed without errors (47s to 1m 11s each), and 4 runs produced errors after 27–28s with no output found.

References

  1. Engl T, et al. Ancient symbiosis confers desiccation resistance to stored grain pest beetles. Mol. Ecol. 2018;27:2095–2108. doi: 10.1111/mec.14418.
  2. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
  3. Wood DE, Salzberg SL. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 2014;15:1–12. doi: 10.1186/gb-2014-15-3-r46.
  4. Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol. 2019;20:257. doi: 10.1186/s13059-019-1891-0.
  5. Bankevich A, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 2012;19:455–477. doi: 10.1089/cmb.2012.0021.
  6. Laczny CC, et al. BusyBee Web: metagenomic data analysis by bootstrapped supervised binning and annotation. Nucleic Acids Res. 2017;45:W171–W179. doi: 10.1093/nar/gkx348.
  7. Motta EVS, Raymann K, Moran NA. Glyphosate perturbs the gut microbiota of honey bees. Proc. Natl. Acad. Sci. USA. 2018;115:10305–10310.
  8. Kanehisa M, Sato Y, Morishima K. BlastKOALA and GhostKOALA: KEGG tools for functional characterization of genome and metagenome sequences. J. Mol. Biol. 2016;428:726–731.
  9. Krzywinski M, et al. Circos: an information aesthetic for comparative genomics. Genome Res. 2009;19:1639–1645.

Apps

  1. Annotate Assembly and Re-annotate Genomes with Prokka - v1.14.5
    • Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30: 2068–2069. doi:10.1093/bioinformatics/btu153
  2. Annotate Domains in a GenomeSet
    • Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389–3402. doi:10.1093/nar/25.17.3389
    • Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10: 421. doi:10.1186/1471-2105-10-421
    • Eddy SR. Accelerated Profile HMM Searches. PLOS Computational Biology. 2011;7: e1002195. doi:10.1371/journal.pcbi.1002195
    • Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44: D279–D285. doi:10.1093/nar/gkv1344
    • Haft DH, Selengut JD, Richter RA, Harkins D, Basu MK, Beck E. TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res. 2013;41: D387–D395. doi:10.1093/nar/gks1234
    • Letunic I, Bork P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 2018;46: D493–D496. doi:10.1093/nar/gkx922
    • Letunic I, Doerks T, Bork P. SMART: recent updates, new developments and status in 2015. Nucleic Acids Res. 2015;43: D257–D260. doi:10.1093/nar/gku949
    • Marchler-Bauer A, Bo Y, Han L, He J, Lanczycki CJ, Lu S, et al. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 2017;45: D200–D203. doi:10.1093/nar/gkw1129
    • Selengut JD, Haft DH, Davidsen T, Ganapathy A, Gwinn-Giglio M, Nelson WC, et al. TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 2007;35: D260–D264. doi:10.1093/nar/gkl1043
    • Tatusov RL, Koonin EV, Lipman DJ. A Genomic Perspective on Protein Families. Science. 1997;278: 631–637. doi:10.1126/science.278.5338.631
  3. Annotate Microbial Assembly with RASTtk - v1.073
    • [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
    • [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206–D214. doi:10.1093/nar/gkt1226
    • [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365
    • [4] Kent WJ. BLAT – The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656–664. doi:10.1101/gr.229202
    • [5] Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389–3402. doi:10.1093/nar/25.17.3389
    • [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955–964.
    • [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793–803. doi:10.1007/s00792-012-0482-8
    • [8] Meyer F, Overbeek R, Rodriguez A. FIGfams: yet another set of protein families. Nucleic Acids Res. 2009;37: 6643–6654. doi:10.1093/nar/gkp698
    • [9] van Belkum A, Sluijter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176–1179.
    • [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120
    • [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
    • [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673–679. doi:10.1093/bioinformatics/btm009
    • [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406
  4. Assess Genome Quality with CheckM - v1.0.18
    • Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043–1055. doi:10.1101/gr.186072.114
  5. Build GenomeSet - v1.0.1
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163
  6. Build Pangenome with OrthoMCL - v2.0
    • Li L, Stoeckert CJ, Roos DS. OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes. Genome Res. 2003;13: 2178–2189. doi:10.1101/gr.1224503
  7. Import FASTA File as Assembly from Staging Area
    no citations
  8. Import GenBank File as Genome from Staging Area
    no citations
  9. Insert Set of Genomes Into SpeciesTree - v2.2.0
    • Price MN, Dehal PS, Arkin AP. FastTree 2 – Approximately Maximum-Likelihood Trees for Large Alignments. PLoS One. 2010;5. doi:10.1371/journal.pone.0009490
  10. View Function Profile for Genomes - v1.4.0
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163