Generated October 26, 2023

Supporting Data for: Genomic and environmental controls on Castellaniella biogeography in an anthropogenically disturbed site

Narrative created by Elizabeth G. Szink and Jennifer L. Goff

  1. Elizabeth G. Szink; 1,2. Jennifer L. Goff [ORCiD: 0000-0002-9089-9632]; 1. Konnor L. Durrence; 3. Lauren M. Lui [ORCiD: 0000-0001-8720-5268]; 3. Torben N. Nielsen [ORCiD: 0000-0002-0987-7189]; 3. Jennifer V. Kuehl [ORCiD: 0000-0003-2813-2518]; 4. Kristopher A. Hunt; 3. John-Marc Chandonia [ORCiD: 0000-0002-5153-9079]; 1. Michael P. Thorgersen [ORCiD: 0000-0001-9552-762X]; 1. Farris L. Poole II; 4. David A. Stahl [ORCiD: 0000-0003-3051-841X]; 5. Romy Chakraborty [ORCiD: 0000-0001-9326-554X]; 3,6. Adam P. Arkin [ORCiD: 0000-0002-4999-2931]; & 1. Michael W. W. Adams [ORCiD: 0000-0002-9796-5014]
  1. Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, USA
  2. Department of Chemistry, State University of New York College of Environmental Science and Forestry,
  3. Environmental Genomics and Systems Biology Division, E.O. Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  4. Department of Civil and Environmental Engineering, University of Washington, Seattle, WA, USA
  5. Earth and Environmental Science Area, E.O. Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  6. Department of Bioengineering, University of California, Berkeley, CA, USA

submitted to [insert journal name] on [insert date]

Narrative Sections

  1. Import and annotation of Castellaniella genomes
  2. Pangenome analyses

I. Import and annotation of Castellaniella genomes


  1. Genome assemblies were imported into KBase using the Batch Import Assembly from Staging Area (v1.0.57) function.
  2. All assemblies were annotated using the Annotate Multiple Microbial Assemblies with RASTtk - v1.073 tool.
  3. Annotated genomes were grouped into sets using the Add Genomes to GenomeSet - v1.7.6 function. Individual annotated genomes can be found both below and in the Data menu to the left.
  4. Taxonomy was assigned using the Classify Microbes with GTDB-Tk - v1.7.0 tool. The results of this analysis are shown below.
Data cells: 14 genome objects (KBaseGenomes.Genome-11.0; two at v3, twelve at v2) and one genome set (v2 - KBaseSearch.GenomeSet-2.1).
The viewers for the data in these Cells are available at the original Narrative here: https://narrative.kbase.us/narrative/135835
Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB) ver R06-RS202
This app completed without errors in 44m 56s.
Objects
Created Object Name        Type       Description
65Phen_FASTA.RAST          Genome     Taxonomy unchanged, taxon_assignment added GTDB
GW247-6E4_FASTA.RAST       Genome     Taxonomy unchanged, taxon_assignment added GTDB
HJPbin40_FASTA.RAST        Genome     Taxonomy unchanged, taxon_assignment added GTDB
DSM1214_FASTA.RAST         Genome     Taxonomy unchanged, taxon_assignment added GTDB
CCUG39790_FASTA.RAST       Genome     Taxonomy unchanged, taxon_assignment added GTDB
MT123_FASTA.RAST           Genome     Taxonomy unchanged, taxon_assignment added GTDB
CD04_FASTA.RAST            Genome     Taxonomy unchanged, taxon_assignment added GTDB
NBRC101664_FASTA.RAST      Genome     Taxonomy unchanged, taxon_assignment added GTDB
FW104-7C03_FASTA.RAST      Genome     Taxonomy unchanged, taxon_assignment added GTDB
FW104-12G02_FASTA.RAST     Genome     Taxonomy unchanged, taxon_assignment added GTDB
FW104-16D08_FASTA.RAST     Genome     Taxonomy unchanged, taxon_assignment added GTDB
FW104-7G2B_FASTA.RAST      Genome     Taxonomy unchanged, taxon_assignment added GTDB
FW021bin21_FASTA.RAST      Genome     Taxonomy unchanged, taxon_assignment added GTDB
DR1149_FASTA.RAST          Genome     Taxonomy unchanged, taxon_assignment added GTDB
Castellaniella_Genome_Set  GenomeSet  Taxonomy unchanged, taxon_assignment added GTDB
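
The taxon_assignment added to each object is a GTDB lineage string, which uses rank prefixes (d__, p__, c__, o__, f__, g__, s__) separated by semicolons. A minimal Python sketch of how such a string could be split into named ranks follows. The function name parse_gtdb_lineage and the example lineage are assumptions made for illustration; the example string is not output copied from this Narrative.

```python
# Illustrative sketch only. The lineage string below is a made-up example of
# the GTDB lineage format, not a result taken from this Narrative.
GTDB_RANKS = {
    "d": "domain", "p": "phylum", "c": "class", "o": "order",
    "f": "family", "g": "genus", "s": "species",
}


def parse_gtdb_lineage(lineage):
    """Split 'd__X;p__Y;...' into {rank_name: taxon}; empty ranks map to None."""
    parsed = {}
    for field in lineage.split(";"):
        prefix, _, name = field.strip().partition("__")
        if prefix not in GTDB_RANKS:
            raise ValueError(f"unexpected rank prefix in {field!r}")
        # An empty name (e.g. 's__') means no assignment at that rank.
        parsed[GTDB_RANKS[prefix]] = name or None
    return parsed


# Hypothetical input: a genome placed in a genus but not in a named species.
example = (
    "d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;"
    "o__Burkholderiales;f__Burkholderiaceae;g__Castellaniella;s__"
)
ranks = parse_gtdb_lineage(example)
print(ranks["genus"], ranks["species"])
```

An empty species field is kept as None rather than dropped, so genomes without a species-level assignment remain easy to filter.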

II. Pangenome analyses


  1. Analysis of the Castellaniella pangenome was performed using the Compute Pangenome (v0.0.7) tool.
  2. Using the same method, we also computed the ORR-specific (Oak Ridge Reservation) and non-ORR Castellaniella pangenomes. All pangenome results (including the presence/absence matrix) can be found below.
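
A pangenome result is commonly summarized by how many genomes carry each gene family. The Python sketch below shows one way to do that downstream of a pangenome computation. It is not the Compute Pangenome algorithm itself, and the function name partition_pangenome, the family labels, and the genome labels are all hypothetical.

```python
# Illustrative sketch only: summarizes gene family membership after clustering.
# It does not reproduce how the KBase Compute Pangenome app builds families.
from typing import Dict, List, Set


def partition_pangenome(families: Dict[str, Set[str]], n_genomes: int) -> Dict[str, List[str]]:
    """Label each gene family as core, accessory, or singleton.

    families maps a family ID to the set of genomes that contain it.
    core      = present in every genome
    singleton = present in exactly one genome
    accessory = everything in between
    """
    parts: Dict[str, List[str]] = {"core": [], "accessory": [], "singleton": []}
    for family, genomes in sorted(families.items()):
        count = len(genomes)
        if count == 0:
            continue  # a family with no members carries no information
        if count == n_genomes:
            parts["core"].append(family)
        elif count == 1:
            parts["singleton"].append(family)
        else:
            parts["accessory"].append(family)
    return parts


# Hypothetical toy input with three genomes (g1, g2, g3).
toy_families = {
    "fam_A": {"g1", "g2", "g3"},
    "fam_B": {"g1", "g3"},
    "fam_C": {"g2"},
}
parts = partition_pangenome(toy_families, n_genomes=3)
print(parts)
```

The thresholds here are the strict definitions (every genome for core, exactly one for singleton); relaxed cutoffs, such as a soft core, would need an explicit parameter.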
Data cells: three pangenome objects (v1 - KBaseGenomes.Pangenome-4.1).
The viewers for the data in these Cells are available at the original Narrative here: https://narrative.kbase.us/narrative/135835
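
Given a presence/absence matrix, gene families restricted to one group of genomes (for example, ORR isolates versus all others) can be found with a simple set comparison. The Python sketch below assumes the matrix has been exported as a nested dictionary. The function name group_specific_families and every genome and family label are hypothetical, and which genomes belong to the ORR group is not specified here.

```python
# Illustrative sketch only: the matrix and group membership are toy values,
# not data from this Narrative.
from typing import Dict, List, Set


def group_specific_families(matrix: Dict[str, Dict[str, int]], group: Set[str]) -> List[str]:
    """Return families found in at least one genome of `group` and in no other genome.

    matrix maps family ID -> {genome name: 1 if present, 0 if absent}.
    """
    specific = []
    for family, row in sorted(matrix.items()):
        present = {genome for genome, flag in row.items() if flag}
        # Non-empty and fully contained in the group => group-specific.
        if present and present <= group:
            specific.append(family)
    return specific


# Hypothetical presence/absence matrix for two site genomes and one reference.
toy_matrix = {
    "fam_A": {"site_1": 1, "site_2": 1, "ref_1": 1},
    "fam_B": {"site_1": 1, "site_2": 0, "ref_1": 0},
    "fam_C": {"site_1": 0, "site_2": 0, "ref_1": 1},
    "fam_D": {"site_1": 1, "site_2": 1, "ref_1": 0},
}
site_genomes = {"site_1", "site_2"}
site_only = group_specific_families(toy_matrix, site_genomes)
print(site_only)
```

Calling the same function with the complementary set of genomes gives the families found only outside the group.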

Apps

  1. Classify Microbes with GTDB-Tk - v2.3.2
    • Pierre-Alain Chaumeil, Aaron J Mussig, Philip Hugenholtz, Donovan H Parks. GTDB-Tk v2: memory friendly classification with the genome taxonomy database. Bioinformatics, Volume 38, Issue 23, 1 December 2022, Pages 5315-5316. DOI: https://doi.org/10.1093/bioinformatics/btac672
    • Pierre-Alain Chaumeil, Aaron J Mussig, Philip Hugenholtz, Donovan H Parks. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics, Volume 36, Issue 6, 15 March 2020, Pages 1925-1927. DOI: https://doi.org/10.1093/bioinformatics/btz848
    • Donovan H Parks, Maria Chuvochina, Christian Rinke, Aaron J Mussig, Pierre-Alain Chaumeil, Philip Hugenholtz. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Research, Volume 50, Issue D1, 7 January 2022, Pages D785-D794. DOI: https://doi.org/10.1093/nar/gkab776
    • Parks, D., Chuvochina, M., Waite, D. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36, 996-1004 (2018). DOI: https://doi.org/10.1038/nbt.4229
    • Parks DH, Chuvochina M, Chaumeil PA, Rinke C, Mussig AJ, Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol. 2020;10.1038/s41587-020-0501-8. DOI:10.1038/s41587-020-0501-8
    • Rinke C, Chuvochina M, Mussig AJ, Chaumeil PA, Davín AA, Waite DW, Whitman WB, Parks DH, and Hugenholtz P. A standardized archaeal taxonomy for the Genome Taxonomy Database. Nat Microbiol. 2021 Jul;6(7):946-959. DOI:10.1038/s41564-021-00918-8
    • Chivian D, Jungbluth SP, Dehal PS, Wood-Charlson EM, Canon RS, Allen BH, Clark MM, Gu T, Land ML, Price GA, Riehl WJ, Sneddon MW, Sutormin R, Zhang Q, Cottingham RW, Henry CS, Arkin AP. Metagenome-assembled genome extraction and analysis from microbiomes using KBase. Nat Protoc. 2023 Jan;18(1):208-238. doi: 10.1038/s41596-022-00747-x
    • Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics. 2010;11:538. Published 2010 Oct 30. doi:10.1186/1471-2105-11-538
    • Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9(1):5114. Published 2018 Nov 30. DOI:10.1038/s41467-018-07641-9
    • Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119. Published 2010 Mar 8. DOI:10.1186/1471-2105-11-119
    • Price MN, Dehal PS, Arkin AP. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5(3):e9490. Published 2010 Mar 10. DOI:10.1371/journal.pone.0009490 link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2835736/
    • Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol. 2011;7(10):e1002195. DOI:10.1371/journal.pcbi.1002195
    • Ondov BD, Treangen TJ, Melsted P, Mallonee AB, Bergman NH, Koren S, Phillippy AM. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 2016 Jun 20;17(1):132. DOI: 10.1186/s13059-016-0997-x