Complete Genome Sequence of Bacillus cereus strain CPT56D-587-MTF Isolated from a Nitrate and Metals Contaminated Subsurface Environment

Introduction

Bacillus cereus strain CPT56D-587-MTF (also referred to as CPTF) was isolated from a nitrate- and heavy-metal-contaminated site at the Oak Ridge Field Research Center (ORFRC) in Oak Ridge, Tennessee, USA (1). Strain CPTF has 100% 16S rRNA gene identity to the most abundant 16S v4 region amplicon sequence variant (ASV) recovered from the soils of the highly contaminated Area 3, the site immediately adjacent to the contamination source, the former S-3 ponds.

Publication

Goff, Jennifer L., Lauren M. Lui, Torben N. Nielsen, Michael P. Thorgersen, Elizabeth G. Szink, John-Marc Chandonia, Farris L. Poole, Jizhong Zhou, Terry C. Hazen, Adam P. Arkin, and Michael W. W. Adams. "Complete Genome Sequence of Bacillus cereus Strain CPT56D-587-MTF, Isolated from a Nitrate- and Metal-Contaminated Subsurface Environment." Microbiology Resource Announcements (e00145-22). https://dx.doi.org/doi:10.1128/mra.00145-22.

External Data Availability

The whole-genome sequencing project has been deposited in GenBank under the accession number GCA_021391515.1. The raw sequence reads have been deposited in the SRA under the accession number PRJNA791653.

Background and Experimental Methods

Sample Collection

A soil sample (long. -84.27335°, lat. 35.977268°, depth 535.94 cm) was collected from ORR Area 3, which lies immediately adjacent to the former S-3 ponds. Sampling was done in October 2020. Samples were stored at -20 °C until use.

Isolation

Initial enrichment was carried out anoxically in R2A medium amended with 10 mM nitrite and 100 mM KH2PO4 with the pH adjusted to 5.5 and inoculated with ~1 g of soil.
Following a week of incubation at room temperature, isolates were obtained by streak-plating onto LB agar plates. Isolated colonies were selected for further characterization.
Purity of the isolate was confirmed by Gram staining and microscopy, streak plating and observation of colony morphology, and 16S rRNA gene Sanger sequencing.
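As a small worked example of the enrichment medium amendments described above, the sketch below converts the stated concentrations (10 mM nitrite, 100 mM KH2PO4) into grams per litre. The text does not say which nitrite salt was used, so sodium nitrite is an assumption made purely for illustration; the function name is likewise hypothetical.

```python
# Molar masses in g/mol. KH2PO4 is named in the text; the nitrite salt is NOT
# specified there, so sodium nitrite (NaNO2) is an assumption for illustration.
MOLAR_MASS_G_PER_MOL = {
    "KH2PO4": 136.09,
    "NaNO2": 69.00,  # assumed salt
}


def grams_needed(compound: str, conc_mM: float, volume_L: float) -> float:
    """Return the mass (g) of `compound` needed for `conc_mM` in `volume_L`."""
    moles = conc_mM / 1000.0 * volume_L
    return moles * MOLAR_MASS_G_PER_MOL[compound]


if __name__ == "__main__":
    for compound, conc_mM in (("NaNO2", 10.0), ("KH2PO4", 100.0)):
        grams = grams_needed(compound, conc_mM, 1.0)
        print(f"{compound}: {grams:.3f} g per litre of medium")
```

If a different nitrite salt was used, only the corresponding molar mass entry would need to change.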

Genome Sequencing

To obtain a cell pellet for genomic DNA extraction, strain CPTF was grown in R2A medium at 30 °C with shaking at 200 RPM for ~24 hours.
For the first round of digestion, the cell pellet was resuspended in 750 µL of PBS, 25 µL of MetaPolyzyme (Sigma-Aldrich), and 25 µL of Qiagen lytic enzyme solution and incubated at 37 °C for 30 minutes.
The second round of digestion was performed in 167 µL of 6X Qiagen Buffer B1 (300 mM Tris-Cl pH 8.0, 300 mM EDTA pH 8.0, 3% Tween 20, 3% Triton X-100), 35 µL of Proteinase K, and 2 µL of RNase A with incubation at 50 °C for 30 minutes with shaking at 50 RPM.
The lysate was processed with the Genomic-Tip 20/G kit (Qiagen) according to the manufacturer’s directions. The presence of high molecular weight (HMW) DNA was confirmed by running the DNA on a 0.5% agarose gel with the Quick-Load 1 kb Extend DNA Ladder (New England BioLabs).
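As a quick arithmetic check on the second digestion round, the sketch below applies C1·V1 = C2·V2 to the volumes given above. It assumes that the second-round reagents were added directly to the full first-round digest (about 800 µL) and ignores the volume of the cell pellet itself; neither point is stated in the text. Under those assumptions the 6X buffer ends up very close to 1X.

```python
def diluted(stock_value: float, stock_ul: float, total_ul: float) -> float:
    """Solve C1*V1 = C2*V2 for C2."""
    return stock_value * stock_ul / total_ul


# Volumes (µL) stated in the text.
FIRST_ROUND_UL = 750 + 25 + 25   # PBS + MetaPolyzyme + lytic enzyme solution
BUFFER_B1_UL = 167
SECOND_ROUND_UL = BUFFER_B1_UL + 35 + 2   # Buffer B1 + Proteinase K + RNase A

# Assumption: both rounds are pooled in one tube; pellet volume is ignored.
TOTAL_UL = FIRST_ROUND_UL + SECOND_ROUND_UL

# Composition of the 6X stock as listed in the text.
BUFFER_B1_6X = {
    "fold (X)": 6.0,
    "Tris-Cl (mM)": 300.0,
    "EDTA (mM)": 300.0,
    "Tween 20 (%)": 3.0,
    "Triton X-100 (%)": 3.0,
}


def final_buffer_composition() -> dict:
    """Final concentration of each Buffer B1 component in the pooled digest."""
    return {
        name: diluted(value, BUFFER_B1_UL, TOTAL_UL)
        for name, value in BUFFER_B1_6X.items()
    }


if __name__ == "__main__":
    print(f"Total volume: {TOTAL_UL} µL")
    for name, value in final_buffer_composition().items():
        print(f"{name}: {value:.2f}")
```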
The HMW DNA was prepared for nanopore sequencing. End-repair was performed using the NEBNext® Companion Module for Oxford Nanopore Technologies® Ligation Sequencing (New England BioLabs) according to the manufacturer’s instructions. The Native Barcoding Expansion (EXP-NBD104, Oxford Nanopore Technologies) and the Ligation Sequencing Kit (SQK-LSK109, Oxford Nanopore Technologies) were used for barcoding and adapter ligation.
The HMW DNA was prepared for Illumina library creation by needle shearing. The Illumina library was made using the Illumina DNA Prep kit according to the manufacturer’s instructions.
The nanopore library was sequenced on an R9.4.1 flow cell on a MinION device (Oxford Nanopore Technologies). The Illumina library was sequenced using 2 × 150 bp reads on a NovaSeq 6000 by Novogene.

Genome Assembly

QC and Assembly

For Illumina data, adapters were removed in-house by Novogene, and the reads were then trimmed and quality filtered using BBTools (https://jgi.doe.gov/data-and-tools/bbtools) as described by Lui et al. (2021) (2).
Nanopore base calling, adapter removal, demultiplexing, and quality filtering were performed with Guppy v4.0.
The genome was assembled by supplying the nanopore and Illumina reads to the hybrid assembler Unicycler v0.4.8 (3) with default parameters.
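A hybrid Unicycler run of this kind could be launched roughly as sketched below. The read file names and output directory are hypothetical placeholders, not the files used in this study; only the short-read (`-1`, `-2`), long-read (`-l`), and output (`-o`) options are set, so everything else stays at Unicycler's defaults.

```python
import shlex
import subprocess


def unicycler_command(short_1: str, short_2: str, long_reads: str, out_dir: str) -> list:
    """Build the argument list for a default-parameter hybrid assembly."""
    return [
        "unicycler",
        "-1", short_1,      # Illumina forward reads
        "-2", short_2,      # Illumina reverse reads
        "-l", long_reads,   # nanopore reads
        "-o", out_dir,      # output directory
    ]


def run_unicycler(cmd: list) -> None:
    """Run the assembly; requires Unicycler to be installed and on PATH."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = unicycler_command(
        "CPTF_R1.fastq.gz", "CPTF_R2.fastq.gz", "CPTF_nanopore.fastq.gz", "CPTF_unicycler"
    )
    print(shlex.join(cmd))  # inspect the command before calling run_unicycler(cmd)
```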
Unicycler logs were checked to confirm that the assembly passed quality thresholds and that the DNA elements were circularized.
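Besides reading the logs, circularization can also be checked from the assembly FASTA itself. The sketch below assumes Unicycler-style headers of the form `>1 length=... depth=... circular=true`, where the `circular=true` field is present only on circularized sequences; the example headers are invented for illustration and are not taken from the CPTF assembly.

```python
def circular_status(fasta_lines) -> dict:
    """Map each contig name to True if its header carries 'circular=true'."""
    status = {}
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        fields = line[1:].split()
        if not fields:
            continue  # skip a bare '>' with no name
        status[fields[0]] = "circular=true" in fields[1:]
    return status


if __name__ == "__main__":
    # Invented example headers, not the real assembly.
    example = [
        ">1 length=1000 depth=1.00x circular=true",
        "ACGT",
        ">2 length=200 depth=2.50x",
        "ACGT",
    ]
    for name, is_circular in circular_status(example).items():
        print(name, "circular" if is_circular else "not marked circular")
```

To apply this to a real assembly, pass an open file handle for the FASTA file in place of the example list.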

Genome Statistics

The completed genome contains 6,548,342 bp in nine contigs with a G+C content of 35.37%. Contig 1 is the circularized chromosome. Contigs 2 to 9 are predicted plasmids.
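Statistics of this kind (contig count, total length, G+C content) can be recomputed from any assembly FASTA. The following is a minimal sketch rather than the tool used to produce the numbers above; it counts G and C against all bases, including any ambiguous ones, which is an assumption about how the percentage is defined.

```python
def assembly_stats(fasta_text: str) -> dict:
    """Return contig count, total length (bp), and G+C percentage."""
    lengths = []
    gc_count = 0
    for raw_line in fasta_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            lengths.append(0)          # start a new contig
        elif lengths:
            seq = line.upper()
            lengths[-1] += len(seq)
            gc_count += seq.count("G") + seq.count("C")
    total_bp = sum(lengths)
    gc_percent = 100.0 * gc_count / total_bp if total_bp else 0.0
    return {"contigs": len(lengths), "total_bp": total_bp, "gc_percent": gc_percent}


if __name__ == "__main__":
    # Toy sequences for illustration only.
    toy_fasta = ">contig_1\nGGCC\nAATT\n>contig_2\nGCAT\n"
    stats = assembly_stats(toy_fasta)
    print(f"{stats['contigs']} contigs, {stats['total_bp']} bp, "
          f"{stats['gc_percent']:.2f}% G+C")
```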

Import and Annotation

The CPTF genome assembly was imported into KBase using the Import FASTA File as Assembly from Staging Area application with default parameters.
The genome was annotated in KBase using the Annotate Microbial Assembly application, which is based on RASTtk v1.073, with default parameters.
The circular genome was visualized using the KBase Circular Genome Visualization Tool.
The quality of the genome was assessed using the KBase Assess Genome Quality with CheckM-v1.0.18 application.

Taxonomic Identification

Taxonomic identification was performed using the KBase Classify Microbes with GTDB-Tk-v1.7.0 application.
A phylogenetic tree was constructed using the Insert Genome Into Species Tree v2.2.0 application.

Metabolic Modeling and Flux Balance Analysis

A draft metabolic model based on the annotated genome was constructed using the KBase Build Metabolic Model v2.0.0 application.
A flux balance analysis of the draft metabolic model was performed using the KBase Run Flux Balance Analysis v2.0.0 application.

References

1. Brooks SC. 2001. Waste characteristics of the former S-3 ponds and outline of uranium chemistry relevant to NABIR Field Research Center studies. NABIR Field Research Center, Oak Ridge, Tenn. doi: 10.2172/814525
2. Lui LM, Nielsen TN, Arkin AP. 2021. A method for achieving complete microbial genomes and improving bins from metagenomics data. PLoS Comput Biol 17:e1008972. doi: 10.1371/journal.pcbi.1008972
3. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi: 10.1371/journal.pcbi.1005595

Created Object Name	Type	Description
CPT56D-587-MTF_metabolic_model	FBAModel	FBAModel-14 CPT56D-587-MTF_metabolic_model
CPT56D-587-MTF_metabolic_model.gf.1	FBA	FBA-13 CPT56D-587-MTF_metabolic_model.gf.1