Complete genome sequence of Bradyrhizobium NP1, isolated from forest soil

Introduction

We report the complete genome sequence of Bradyrhizobium strain NP1, which was isolated from forest soil that had been subject to chronic warming. The diverse genus Bradyrhizobium is predicted to contain approximately 800 species (1) and includes non-symbiotic species that dominate forest soil (2). Bradyrhizobium NP1 was isolated from a Long-Term Ecological Research site in the Harvard Forest (HRF), in Petersham, MA (42.54, -72.18).

Authors: Trevor Fisher^a, Francesca Durmazolu^a, Kristen M. DeAngelis^b, Maureen A. Morrow^a

Affiliations

^a Department of Biology, State University of New York at New Paltz, New Paltz, New York, USA
^b Department of Microbiology, University of Massachusetts, Amherst, Massachusetts, USA

Experimental Methods

We report the complete genome sequence of Bradyrhizobium strain NP1, a bacterium isolated from forest soil that had been subject to chronic warming. The genome of this novel isolate is presented as a single circular contig of 7,712,921 base pairs with 64.14% GC content.

Sample Collection

Bradyrhizobium NP1 was isolated from the Prospect Hill long-term warming experiment (3) at the Harvard Forest Long-Term Ecological Research site in Petersham, MA (42.54, -72.18). In May 2021, organic horizon soil was collected from a heated plot and stored at 4°C until use.

Isolation

In August 2021, soil was plated on dilute nutrient broth supplemented with ammonium nitrate (Table 1) and incubated at room temperature (22°C). NP1 was a slow-growing colony that appeared after 10 weeks and was therefore chosen for analysis.

Table 1	Isolation Medium
Ingredient	Amount Per Liter
Difco™ Nutrient Broth	0.08 g
NH₄NO₃	0.50 g
1 M CaCl₂	0.60 ml
Agar	6.00 g
Gellan Gum	6.40 g
Cycloheximide	50.0 g

16S sequence analysis

NP1 was identified as a Bradyrhizobium using the online NCBI BLASTn tool (standard database) with a 16S rRNA gene sequence (GenBank accession number OR045828) amplified by PCR (universal primers 27F and 1492R) from genomic DNA extracted with the Quick-DNA Fecal/Soil Microbe Miniprep Kit (Zymo, Irvine, CA). The sequence has 99.6% identity with at least 100 species of Bradyrhizobium.

Genome Sequencing

NP1 was streaked to purity and grown in 10% Tryptic Soy broth (Becton, Dickinson and Company, Sparks, MD) with 1X MEM Vitamin Solution (Gibco, Grand Island, NY) at 30°C for 2 days with shaking. Genomic DNA was extracted with the DNeasy Blood and Tissue kit (Qiagen, Hilden, Germany) using a 1-hour lysozyme pretreatment. The DNA was quantified using a Qubit 4.0 fluorometer (dsDNA HS Assay, Invitrogen, Waltham, MA).

Whole-genome sequencing (WGS) was performed using the Illumina DNA Prep kit and IDT 10-bp UDI indices on an Illumina NextSeq 2000 (2 x 151-bp reads) by SeqCenter (Pittsburgh, PA). Demultiplexing, quality control, and adapter trimming were performed with the proprietary bcl-convert (v3.9.30), resulting in 7,223,840 reads. The reads were trimmed with Trimmomatic (v0.36) (4) in the DOE Systems Biology Knowledgebase (KBase) platform (5) using default parameters. The resultant 7,115,010 reads had an average read length of 145.82 ± 17.45 bp (134X coverage).

The same DNA sample was sequenced at Plasmidsaurus (Eugene, OR). The library was constructed with the Oxford Nanopore Technologies Ligation Sequencing Kit version SQK-LSK115 and was sequenced on GridION 10.4.1 flow cells (FLO-MIN114) using the "Super accuracy" basecaller in MinKNOW. The reads were filtered with Filtlong (v0.2.10, https://github.com/rrwick/Filtlong) in KBase (5) to remove reads shorter than 1,000 nucleotides and the lowest-quality 5%. A total of 52,003 reads were obtained (average read length, 8,030.49 ± 6,726.47 nucleotides; 54X coverage).
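The coverage figures quoted for the two read sets can be sanity-checked with a short calculation. The sketch below assumes coverage is estimated simply as total sequenced bases (read count times mean read length) divided by the final assembly size; the function name `mean_coverage` is illustrative, and the authors may have derived their values differently.

```python
# Rough depth-of-coverage check from the read statistics reported in the text.
# Assumption: coverage = (number of reads * mean read length) / genome size.

GENOME_SIZE_BP = 7_712_921  # length of the single circular contig


def mean_coverage(read_count: int, mean_read_length: float,
                  genome_size: int = GENOME_SIZE_BP) -> float:
    """Return the estimated mean sequencing depth (fold coverage)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return read_count * mean_read_length / genome_size


# Illumina reads after Trimmomatic trimming
illumina = mean_coverage(7_115_010, 145.82)
# Nanopore reads after Filtlong filtering
nanopore = mean_coverage(52_003, 8030.49)

print(f"Illumina: {illumina:.1f}X")
print(f"Nanopore: {nanopore:.1f}X")
```

Under this assumption the estimates come out between 134X and 135X for the Illumina reads and between 54X and 55X for the Nanopore reads, in line with the reported values.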

Assembly and Annotation

Assembly

A hybrid assembly was generated with Unicycler (v0.4.8) (6), with rotation between multiple rounds of polishing. Assembly quality was assessed with QUAST (v4.4) and CheckM (v1.0.18) (7).

The assembly resulted in a single circular contig of 7,712,921 base pairs (N₅₀ = 7,712,921) with 64.14% GC content (see QUAST report).
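For readers unfamiliar with these assembly statistics, the following minimal sketch shows how N₅₀ and GC content are conventionally computed. It is not the QUAST implementation; the function names and the toy inputs are illustrative assumptions. For a single-contig assembly, N₅₀ necessarily equals the contig length, which is why the two numbers above coincide.

```python
# Minimal illustrations of two common assembly statistics.
# These are textbook definitions, not QUAST's own code.

def n50(contig_lengths):
    """Length L such that contigs of length >= L contain at least half of all bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs supplied")


def gc_percent(sequence):
    """Percentage of G and C bases in a nucleotide sequence (case-insensitive)."""
    if not sequence:
        raise ValueError("empty sequence")
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# A single contig: N50 is just that contig's length.
print(n50([7_712_921]))

# Toy multi-contig example (hypothetical lengths).
print(n50([500, 300, 200]))

# Toy sequence (hypothetical): 4 of 6 bases are G or C.
print(round(gc_percent("GGCCAT"), 2))
```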

CheckM showed 99.98% completeness and 1.01% estimated contamination (see CheckM report).

The assembled sequence was annotated with RASTtk (v1.073) (8) and is predicted to encode 7,808 proteins.

Accession numbers

The 16S rRNA gene sequence is available under GenBank accession number OR045828.
The assembled genome sequence was deposited in GenBank under the accession number CP127385.
The raw sequence reads are available under BioProject PRJNA975924 and SRA SRX20568407 (Illumina reads) and SRX20568406 (Nanopore reads).

Taxonomic Identification

The initial classification was done by 16S rRNA gene alignment in BLASTn, with the PCR product having an average quality score of 42.5 (4Peaks, v1.8; GenBank accession number OR045828).
The alignment produced a 99.6% match with at least 100 strains of Bradyrhizobium (see table). We also built species trees using KBase apps that employ sets of phylogenetic marker genes other than the 16S rRNA gene, namely GTDB-Tk (9) and Insert Genome into Species Tree, using default parameters (see below). No species-level match was produced.

Table: 16S BLASTn result 5/26/23:

References

1. Ormeño-Orrillo E, Martínez-Romero E. 2019. A Genomotaxonomy View of the Bradyrhizobium Genus. Frontiers in Microbiology 10.
2. VanInsberghe D, Maas KR, Cardenas E, Strachan CR, Hallam SJ, Mohn WW. 2015. Non-symbiotic Bradyrhizobium ecotypes dominate North American forest soils. The ISME Journal 9:2435–2441.
3. Melillo JM, Frey SD, DeAngelis KM, Werner WJ, Bernard MJ, Bowles FP, Pold G, Knorr MA, Grandy AS. 2017. Long-term pattern and magnitude of soil carbon feedback to the climate system in a warming world. Science 358:101–105.
4. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.
5. Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, Dehal P, Ware D, Perez F, Canon S, Sneddon MW, Henderson ML, Riehl WJ, Murphy-Olson D, Chan SY, Kamimura RT, Kumari S, Drake MM, Brettin TS, Glass EM, Chivian D, Gunter D, Weston DJ, Allen BH, Baumohl J, Best AA, Bowen B, Brenner SE, Bun CC, Chandonia J-M, Chia J-M, Colasanti R, Conrad N, Davis JJ, Davison BH, DeJongh M, Devoid S, Dietrich E, Dubchak I, Edirisinghe JN, Fang G, Faria JP, Frybarger PM, Gerlach W, Gerstein M, Greiner A, Gurtowski J, Haun HL, He F, Jain R, Joachimiak MP, Keegan KP, Kondo S, Kumar V, Land ML, Meyer F, Mills M, Novichkov PS, Oh T, Olsen GJ, Olson R, Parrello B, Pasternak S, Pearson E, Poon SS, Price GA, Ramakrishnan S, Ranjan P, Ronald PC, Schatz MC, Seaver SMD, Shukla M, Sutormin RA, Syed MH, Thomason J, Tintle NL, Wang D, Xia F, Yoo H, Yoo S, Yu D. 2018. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology 36:566–569.
6. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595.
7. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. 2015. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 25:1043–1055.
8. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K, Gerdes S, Glass EM, Kubal M, Meyer F, Olsen GJ, Olson R, Osterman AL, Overbeek RA, McNeil LK, Paarmann D, Paczian T, Parrello B, Pusch GD, Reich C, Stevens R, Vassieva O, Vonstein V, Wilke A, Zagnitko O. 2008. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics 9:75.
9. Chaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database.