Generated May 5, 2024

Ochrobactrum anthropi WV_118_8 Genome

Abstract

Microbial consortia can be applied to the bioremediation of trifluoroacetate (TFA), a highly persistent environmental contaminant. Here we report the genome sequence of Ochrobactrum anthropi WV_118_8 (referred to in this narrative by an older strain name, PFAS_118_8), isolated from a TFA-degrading mixed culture.

Introduction

The publication by Jennifer L. Goff, Chris Hahn, Rebecca A. Ingrassia, Chanistha Tiyapun, Gray G. Waldschmidt, Emma R. Smith, Miranda N. Marini, Olga Shevchenko, and Julia A. Maresca can be found here: [URL]

Ochrobactrum anthropi is a member of the Brucellaceae, a common soil-dwelling organism, and an emerging human pathogen (1). This species is well suited to bioremediation because environmental strains carry many genes for solvent and metal resistance and can use a range of carbon sources (2,3,4).

Note that WV_118_8 = PFAS_118_8

Table of Contents

  1. Background and Experimental Methods
  2. Import and annotation
  3. QC, Assembly, and Annotation
  4. Taxonomic Classification
  5. Metabolic Modeling and Flux Balance Analysis
  6. References

Narrative created by: Gray G. Waldschmidt

Background and Experimental Methods

Sample Collection

Samples were collected from the primary influent at the Wilmington Wastewater Treatment Plant (39.735° N, 75.518° W) in April 2019 and inoculated into a minimal medium amended with 2 μM fluoroacetic acid (MFA).

Isolation

The culture was transferred into fresh medium several times before being archived in 12% glycerol in March 2020 and revived in July 2021 on the same medium. After one transfer into minimal medium amended with 200 μM MFA and incubation for ~1 month, 20 μL of this culture was plated on solid minimal medium amended with 1% yeast extract. Four colony types were visible on this medium, so two representatives of each colony type were picked, grown in LB for ~2 days, and harvested. DNA was extracted using a phenol-chloroform extraction protocol optimized for Gram-positive bacteria (5), and the 16S rRNA gene of each isolate was amplified and sequenced using primers 8F and 1492R to eliminate duplicate strains (6). Based on the 16S rRNA gene sequence, a putative Brucella sp. was identified among the 8 isolates. This isolate was given the strain designator PFAS_118_8, and its genome was sequenced.

Genome Sequencing

A single-molecule real-time (SMRT) library was barcoded and prepared using a PacBio SMRTbell Express template preparation kit version 2.0. DNA fragments larger than 6 kb were size selected using BluePippin (Sage Science). The average library fragment size was 12 kb, as measured by a fragment analyzer (Advanced Analytical Technologies, Inc.). Sequencing was completed on a PacBio Sequel IIe single-molecule sequencer in one 1M version 3 LR SMRT Cell with a 30-h movie. Samples were demultiplexed using PacBio SMRT Link version 11. This produced 143,102 PacBio reads with a read N50 of 15,215 bp.
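For readers unfamiliar with the read N50 statistic quoted above, the following is a minimal sketch of how it can be computed from a list of read lengths. The function name and the example lengths are hypothetical and are not the PFAS_118_8 data; the sketch uses the common definition in which reads of at least the N50 length contain half or more of all sequenced bases.

```python
def n50(lengths):
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not lengths:
        raise ValueError("need at least one read length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest read down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Hypothetical read lengths in bp (illustration only).
example_lengths = [2000, 3000, 4000, 5000, 6000]
example_n50 = n50(example_lengths)
print(example_n50)
```

Applied to the lengths of all reads in a FASTQ file, the same function would give the read N50 for a sequencing run.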

Import

The PFAS_118_8 sequencing reads were imported using Globus and unpacked with Unpack a Compressed File in Staging Area - v1.0.12.

Unpack a compressed file in the staging area.
This app completed without errors in 45s.
Summary
Uploaded Files: 1 PFAS_Genomes/demultiplex.118.8-bc2053.hifi_reads.fastq
v1 - KBaseFile.SingleEndLibrary-2.1
The viewer for the data in this Cell is available at the original Narrative here: https://narrative.kbase.us/narrative/169831

Assembly and Annotation

Assembly

The PFAS_118_8 reads were filtered by quality with Filter Reads with Filtlong - v0.2.1 using default parameters. The genome was assembled from long reads into complete contigs with Assemble Long Reads with Flye - v2.9.2 using the setting for PacBio HiFi reads (<1% error). The assembled genome contained 5.6 Mbp in 7 contigs with a G+C content of 56.04%. Contigs 1 and 2 are circularized chromosomes, and contigs 3-7 are predicted plasmids. Plasmids and chromosomal prophages were identified with geNomad (7) using the NMDC EDGE Bioinformatics platform (8). The predicted plasmids on contigs 3 and 6 contain conjugation genes. Three viral genomes were identified on the chromosomes: chromosomes 1 and 2 each contained a provirus fragment predicted at medium quality, and chromosome 2 also had a provirus predicted at high quality.
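The assembly size, contig count, and G+C content reported above are simple functions of the contig sequences. The sketch below shows one way such a summary could be computed from FASTA text; the helper names and the toy sequences are hypothetical, and this is not the code used by the KBase apps.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {contig_name: sequence}."""
    contigs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else "unnamed"
            contigs[name] = []
        elif name is not None:
            contigs[name].append(line.upper())
    return {n: "".join(parts) for n, parts in contigs.items()}


def assembly_summary(contigs):
    """Return contig count, total length, and G+C percentage."""
    total = sum(len(seq) for seq in contigs.values())
    gc = sum(seq.count("G") + seq.count("C") for seq in contigs.values())
    return {
        "n_contigs": len(contigs),
        "total_bp": total,
        "gc_percent": 100.0 * gc / total,
    }


# Toy two-contig assembly (illustration only, not PFAS_118_8 sequence).
toy_fasta = ">contig_1\nGGCC\nAATT\n>contig_2\nGCGCAT\n"
summary = assembly_summary(parse_fasta(toy_fasta))
print(summary)
```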

Quality control

Quality assessment was conducted with Assess Genome Quality with CheckM - v1.0.18 using default parameters.

Annotation

The PFAS_118_8 genome was annotated with Annotate Genome/Assembly with RASTtk - v1.073 using default parameters.

Assemble long reads using the Flye assembler.
This app completed without errors in 39m 22s.
Objects
Created Object Name | Type | Description
PFAS_118_8flye.contigs | Assembly | Assembled contigs
Summary
Flye results saved to: jgoff:narrative_1707234081078//kb/module/work/tmp/flye_0e6fe3d1-1acd-452b-a957-1de9d8eb7394
Assembly saved to: jgoff:narrative_1707234081078/PFAS_118_8flye.contigs
Assembled into 7 contigs. Avg Length: 803207.1428571428 bp.
Contig Length Distribution (# of contigs -- min to max basepairs):
  5 -- 44596.0 to 333728.1 bp
  0 -- 333728.1 to 622860.2 bp
  0 -- 622860.2 to 911992.2999999999 bp
  0 -- 911992.2999999999 to 1201124.4 bp
  0 -- 1201124.4 to 1490256.5 bp
  0 -- 1490256.5 to 1779388.5999999999 bp
  0 -- 1779388.5999999999 to 2068520.6999999997 bp
  1 -- 2068520.6999999997 to 2357652.8 bp
  0 -- 2357652.8 to 2646784.9 bp
  1 -- 2646784.9 to 2935917.0 bp
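The average contig length and the bin boundaries in the Flye summary can be reproduced with simple arithmetic. The sketch below assumes that the distribution uses 10 equal-width bins spanning the shortest contig (44,596 bp) to the longest (2,935,917 bp), and takes the total assembly length of 5,622,450 bp from the RASTtk report; the variable names are illustrative only.

```python
# Values taken from the app summaries in this narrative.
TOTAL_BP = 5_622_450                  # total assembly length (RASTtk report)
N_CONTIGS = 7                         # number of contigs (Flye summary)
MIN_BP, MAX_BP = 44_596, 2_935_917    # shortest and longest contig (Flye summary)
N_BINS = 10                           # assumption: ten equal-width bins

average_bp = TOTAL_BP / N_CONTIGS
bin_width = (MAX_BP - MIN_BP) / N_BINS
bin_edges = [MIN_BP + i * bin_width for i in range(N_BINS + 1)]

print(f"Average contig length: {average_bp:.1f} bp")
for low, high in zip(bin_edges[:-1], bin_edges[1:]):
    print(f"{low:.1f} to {high:.1f} bp")
```

The long decimal tails in the app output (for example 911992.2999999999) are floating-point rounding artifacts of this kind of calculation, not additional precision.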
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/169831
  • flye_output.zip - Output file(s) generated by Flye
v1 - KBaseGenomeAnnotations.Assembly-5.1
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 5m 8s.
Links
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/169831
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
Annotate or re-annotate genome/assembly using RASTtk (Rapid Annotations using Subsystems Technology toolkit).
This app completed without errors in 8m 23s.
Objects
Created Object Name | Type | Description
PFAS_118_8-annotations2 | Genome | RAST re-annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 7 contigs containing 5622450 nucleotides. No initial gene calls were provided. Standard features were called using: glimmer3; prodigal. A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr. The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity. In addition to the remaining original 0 coding features and 0 non-coding features, 5974 new features were called, of which 182 are non-coding.
Output genome has the following feature types:
  • Coding gene: 5792
  • Non-coding repeat: 113
  • Non-coding rna: 69
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
v1 - KBaseGenomes.Genome-11.1

Taxonomic Identification

Taxonomic identification was performed with Classify Microbes with GTDB-Tk - v1.7.0 using default parameters. A phylogenetic tree was created using the Insert Genome Into SpeciesTree - v2.2.0 application. These analyses identified PFAS_118_8 as Ochrobactrum anthropi, a member of the family Brucellaceae.

Obtain objective taxonomic assignments for bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB) ver R06-RS202
This app completed without errors in 37m 44s.
Add one or more Genomes to a KBase SpeciesTree.
This app completed without errors in 3m 29s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/169831
  • PFAS_118_8_output_tree.newick
  • PFAS_118_8_output_tree-labels.newick
  • PFAS_118_8_output_tree.png
  • PFAS_118_8_output_tree.pdf
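The Newick files listed above are plain text and can be inspected without special software. As a minimal sketch, assuming an unquoted Newick string, the function below extracts the leaf (genome) labels from a tree. The function name and the toy tree are hypothetical; the toy tree is not the SpeciesTree output for PFAS_118_8.

```python
import re


def newick_leaf_labels(newick):
    """Return leaf labels from a simple Newick string.

    A leaf label directly follows "(" or ","; internal node labels and
    support values follow ")" and are therefore skipped. Quoted labels
    are not handled in this sketch.
    """
    matches = re.findall(r"[(,]\s*([^(),:;]+)", newick)
    return [m.strip() for m in matches if m.strip()]


# Hypothetical four-genome tree with branch lengths and support values.
toy_tree = (
    "((Genome_A:0.1,Genome_B:0.2)0.95:0.05,"
    "(PFAS_118_8:0.01,Genome_C:0.02)1.00:0.03);"
)
leaves = newick_leaf_labels(toy_tree)
print(leaves)
```

For trees with quoted or unusual labels, a dedicated phylogenetics library would be a safer choice than a regular expression.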

Metabolic Modeling and Flux Balance Analysis

Genome-scale metabolic modeling was accomplished with MS2 - Build Prokaryotic Metabolic Models, and the draft model was gap filled with Gapfill Metabolic Model using complete media to identify missing biochemical reactions. Flux balance analysis was completed with Run Flux Balance Analysis using complete media.
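The idea behind gap filling can be illustrated with a toy example. The sketch below is a deliberate simplification and is not the KBase Gapfill algorithm: the app solves an optimization problem over flux-balance constraints, whereas this sketch ignores stoichiometry and flux and only asks which compounds become reachable from the media. All reaction names, compounds, and function names are hypothetical.

```python
from itertools import combinations


def producible(reactions, media):
    """Expand the media: fire every reaction whose substrates are all available."""
    available = set(media)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def gapfill(draft, candidates, media, target):
    """Return the smallest set of candidate reactions that makes the
    target producible, or None if no combination works (brute force)."""
    names = sorted(candidates)
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            trial = dict(draft)
            trial.update({name: candidates[name] for name in combo})
            if target in producible(trial, media):
                return set(combo)
    return None


# Hypothetical draft model with a missing step between g6p and pyruvate.
draft = {
    "R1": ({"glucose"}, {"g6p"}),
    "R3": ({"pyruvate"}, {"biomass"}),
}
# Hypothetical reaction database to draw gap-filling candidates from.
candidates = {
    "R2": ({"g6p"}, {"pyruvate"}),
    "R4": ({"acetate"}, {"pyruvate"}),
    "R5": ({"g6p"}, {"ribose5p"}),
}
media = {"glucose"}

added = gapfill(draft, candidates, media, "biomass")
print(added)
```

In this toy case the draft model cannot reach biomass from glucose, and adding the single reaction R2 closes the gap. A real gap-filling run additionally weighs candidate reactions by penalties and checks that a nonzero biomass flux is feasible.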

v1 - KBaseFBA.FBAModel-15.0
Identify the minimal set of biochemical reactions to add to a draft metabolic model to enable it to produce biomass in a specified media. This app is now obsolete, replaced by the new ModelSEED2 app: MS2 - Improved Gapfill Metabolic Models.
This app completed without errors in 45s.
Objects
Created Object Name | Type | Description
PFAS_118_8_MM_gapfilled | FBAModel | FBAModel-15 PFAS_118_8_MM_gapfilled
PFAS_118_8_MM_gapfilled.gf.0 | FBA | FBA-13 PFAS_118_8_MM_gapfilled.gf.0
Report
Output from Gapfill Metabolic Model
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/169831
Predict metabolite fluxes in a metabolic model of an organism grown on a given media using flux balance analysis (FBA).
This app completed without errors in 27s.
Objects
Created Object Name | Type | Description
PFAS118.8_FBA | FBA | FBA-13 PFAS118.8_FBA
Report
Summary
A flux balance analysis (FBA) was performed on the metabolic model 169831/47/1 growing in Complete media.
Output from Run Flux Balance Analysis
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/169831

References

(1) Gohil K, Rajput V, & Dharne M. 2020. Pan-genomics of Ochrobactrum species from clinical and environmental origins reveals distinct populations and possible links. Genomics, 112(5), 3003–3012.

(2) Bezza FA, Beukes M, & Chirwa EMN. 2015. Application of biosurfactant produced by Ochrobactrum intermedium CN3 for enhancing petroleum sludge bioremediation. Process Biochemistry, 50(11), 1911–1922.

(3) Murínová S, & Dercová K. 2014. Potential use of newly isolated bacterial strain Ochrobactrum anthropi in bioremediation of polychlorinated biphenyls. Water, Air, and Soil Pollution, 225(6).

(4) Wu Y, He T, Zhong M, Zhang Y, Li E, Huang T, & Hu Z. 2009. Isolation of marine benzo[a]pyrene-degrading Ochrobactrum sp. BAP5 and proteins characterization. Journal of Environmental Sciences, 21(10), 1446–1451.

(5) Kiledal E, Maresca JA. 2021. Chromosomal DNA extraction from Gram-positive bacteria. protocols.io.

(6) Frank JA, Reich CI, Sharma S, Weisbaum JS, Wilson BA, Olsen GJ. 2008. Critical evaluation of two primers commonly used for amplification of bacterial 16S rRNA genes. Applied and Environmental Microbiology 74, 2461–2470.

(7) Camargo AP, Roux S, Schulz F, Babinski M, Xu Y, Hu B, Chain PS, Nayfach S, Kyrpides NC. 2023. Identification of mobile genetic elements with geNomad. Nat Biotechnol:1-10.

(8) Eloe-Fadrosh EA, Ahmed F, Babinski M, Baumes J, Borkum M, Bramer L, Canon S, Christianson DS, Corilo YE, Davenport KW. 2022. The National Microbiome Data Collaborative Data Portal: an integrated multi-omics microbiome data resource. Nucleic Acids Res 50:D828-D836.

Released Apps

  1. Annotate Genome/Assembly with RASTtk - v1.073
    • [1] Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
    • [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206–D214. doi:10.1093/nar/gkt1226
    • [3] Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5. doi:10.1038/srep08365
    • [4] Kent WJ. BLAT - The BLAST-Like Alignment Tool. Genome Res. 2002;12: 656–664. doi:10.1101/gr.229202
    • [5] Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25: 3389-3402. doi:10.1093/nar/25.17.3389
    • [6] Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25: 955–964.
    • [7] Cobucci-Ponzano B, Rossi M, Moracci M. Translational recoding in archaea. Extremophiles. 2012;16: 793–803. doi:10.1007/s00792-012-0482-8
    • [8] Meyer F, Overbeek R, Rodriguez A. FIGfams: yet another set of protein families. Nucleic Acids Res. 2009;37: 6643–54. doi:10.1093/nar/gkp698
    • [9] van Belkum A, Sluijuter M, de Groot R, Verbrugh H, Hermans PW. Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains. J Clin Microbiol. 1996;34: 1176–1179.
    • [10] Croucher NJ, Vernikos GS, Parkhill J, Bentley SD. Identification, variation and transcription of pneumococcal repeat sequences. BMC Genomics. 2011;12: 120. doi:10.1186/1471-2164-12-120
    • [11] Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. doi:10.1186/1471-2105-11-119
    • [12] Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23: 673–679. doi:10.1093/bioinformatics/btm009
    • [13] Akhter S, Aziz RK, Edwards RA. PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Res. 2012;40: e126. doi:10.1093/nar/gks406
  2. Assess Genome Quality with CheckM - v1.0.18
    • Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043–1055. doi:10.1101/gr.186072.114
  3. Gapfill Metabolic Model
    • [1] Henry CS, DeJongh M, Best AA, Frybarger PM, Linsay B, Stevens RL. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat Biotechnol. 2010;28: 977–982. doi:10.1038/nbt.1672
    • [2] Henry CS, Jankowski MD, Broadbelt LJ, Hatzimanikatis V. Genome-Scale Thermodynamic Analysis of Escherichia coli Metabolism. Biophysical Journal. 2006;90: 1453–1461. doi:10.1529/biophysj.105.071720
    • [3] Jankowski MD, Henry CS, Broadbelt LJ, Hatzimanikatis V. Group Contribution Method for Thermodynamic Analysis of Complex Metabolic Networks. Biophysical Journal. 2008;95: 1487–1499. doi:10.1529/biophysj.107.124784
    • [4] Henry CS, Zinner JF, Cohoon MP, Stevens RL. iBsu1103: a new genome-scale metabolic model of Bacillus subtilis based on SEED annotations. Genome Biology. 2009;10: R69. doi:10.1186/gb-2009-10-6-r69
    • [5] Orth JD, Thiele I, Palsson BØ. What is flux balance analysis? Nature Biotechnology. 2010;28: 245–248. doi:10.1038/nbt.1614
    • [6] Latendresse M. Efficiently gap-filling reaction networks. BMC Bioinformatics. 2014;15: 225. doi:10.1186/1471-2105-15-225
    • [7] Dreyfuss JM, Zucker JD, Hood HM, Ocasio LR, Sachs MS, Galagan JE. Reconstruction and Validation of a Genome-Scale Metabolic Model for the Filamentous Fungus Neurospora crassa Using FARM. PLOS Computational Biology. 2013;9: e1003126. doi:10.1371/journal.pcbi.1003126
  4. Insert Genome Into SpeciesTree - v2.2.0
    • Price MN, Dehal PS, Arkin AP. FastTree 2 - Approximately Maximum-Likelihood Trees for Large Alignments. PLoS One. 2010;5. doi:10.1371/journal.pone.0009490
  5. Run Flux Balance Analysis
    • Henry CS, DeJongh M, Best AA, Frybarger PM, Linsay B, Stevens RL. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat Biotechnol. 2010;28: 977–982. doi:10.1038/nbt.1672
    • Orth JD, Thiele I, Palsson BØ. What is flux balance analysis? Nature Biotechnology. 2010;28: 245–248. doi:10.1038/nbt.1614
  6. Unpack a Compressed File in Staging Area - v1.0.12
    • Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163

Apps in Beta

  1. Assemble Long Reads with Flye - v2.9.2
    • [1] Mikhail Kolmogorov, Jeffrey Yuan, Yu Lin and Pavel Pevzner, "Assembly of Long Error-Prone Reads Using Repeat Graphs", Nature Biotechnology, 2019 doi:10.1038/s41587-019-0072-8
  2. Classify Microbes with GTDB-Tk - v1.7.0
    • Pierre-Alain Chaumeil, Aaron J Mussig, Philip Hugenholtz, Donovan H Parks, GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database, Bioinformatics, Volume 36, Issue 6, 15 March 2020, Pages 1925–1927. DOI: https://doi.org/10.1093/bioinformatics/btz848
    • Parks, D., Chuvochina, M., Waite, D. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36, 996–1004 (2018). DOI: https://doi.org/10.1038/nbt.4229
    • Parks DH, Chuvochina M, Chaumeil PA, Rinke C, Mussig AJ, Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol. 2020;10.1038/s41587-020-0501-8. DOI:10.1038/s41587-020-0501-8
    • Rinke C, Chuvochina M, Mussig AJ, Chaumeil PA, Davín AA, Waite DW, Whitman WB, Parks DH, and Hugenholtz P. A standardized archaeal taxonomy for the Genome Taxonomy Database. Nat Microbiol. 2021 Jul;6(7):946-959. DOI:10.1038/s41564-021-00918-8
    • Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics. 2010;11:538. Published 2010 Oct 30. doi:10.1186/1471-2105-11-538
    • Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9(1):5114. Published 2018 Nov 30. DOI:10.1038/s41467-018-07641-9
    • Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119. Published 2010 Mar 8. DOI:10.1186/1471-2105-11-119
    • Price MN, Dehal PS, Arkin AP. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5(3):e9490. Published 2010 Mar 10. DOI:10.1371/journal.pone.0009490 link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2835736/
    • Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol. 2011;7(10):e1002195. DOI:10.1371/journal.pcbi.1002195