Generated August 29, 2023

Five draft genome assemblies from Bacillaceae isolated from a degraded wetland environment

Abstract

We isolated five members of the family Bacillaceae from a degraded wetland environment and sequenced their genomes on an Illumina NextSeq instrument. Here, we report draft genome sequences of Bacillus velezensis strain SC119, Priestia megaterium strain SC120, Bacillus zhangzhouensis strain SC123, Bacillus pumilus strain SC124, and Bacillus idriensis strain SC127. The assemblies range from 3,657,353 to 5,772,725 bp, with GC content between 37.62% and 46.38%.

Introduction

Wetland environments play critical roles in the terrestrial carbon and water cycles, and microbial communities are key players in healthy ecosystem function. Endospore-forming bacteria in the family Bacillaceae are metabolically and genomically diverse soil heterotrophs that influence plant health and carbon and nitrogen cycling, and they often produce diverse natural products that affect other bacterial and non-bacterial species in their environment (1). We collected two soil samples on January 19, 2023 from 42°43'12.7"N 73°45'01.4"W. One was highly hydrated and taken from within a patch of invasive common reeds (Phragmites sp.); the other was taken near the base of an Eastern cottonwood tree (Populus deltoides). Bacillus pumilus strain SC124 was isolated from the soil near the cottonwood tree, while Bacillus velezensis strain SC119, Priestia megaterium strain SC120, Bacillus zhangzhouensis strain SC123, and Bacillus idriensis strain SC127 were isolated from the marshy soil.

The publication by Anna L. McLoon, Prince Ackaah Asante, Thomas Anderson, Kellyanne Cahill, Delana Cochrane, Keira Cohen, Jaylene German, Christian Hrubes, Isabella LaCroix, Killian McNamee, Anna Mossakowski, Aidan M. Nichter, Jessica L. Pepe and Andrew T. Schofield can be found here: [URL will be available after publication]

Table of Contents

  1. Background and Experimental Methods
  2. Import and annotation
  3. QC, Assembly, Annotation, and Taxonomic Classification
  4. References

Narratives and data created by: Anna L. McLoon, Prince Ackaah Asante, Thomas Anderson, Kellyanne Cahill, Delana Cochrane, Keira Cohen, Jaylene German, Christian Hrubes, Isabella LaCroix, Killian McNamee, Anna Mossakowski, Aidan M. Nichter, Jessica L. Pepe and Andrew T. Schofield

Background and Experimental Methods


Sample Collection and Isolation

Unique bacterial strains were isolated as described previously (2): soil subsamples were boiled in water for 10 minutes, and a portion was plated on TSA + 5% sheep blood and incubated for 24 hours at 37 °C. All strains grow and remain viable at temperatures between 22 °C and 37 °C. After colony purification and basic characterization, genomic DNA was isolated using a Promega DNA wizard kit from cultures grown for approximately 3 hours at 37 °C in tryptic soy broth, and it was sequenced as described previously (2).

Genome Sequencing

Libraries were prepared by SeqCenter (Pittsburgh, PA) using the Illumina DNA Prep kit and IDT 10-bp UDI indices and were sequenced as 151-bp paired-end reads on an Illumina NextSeq 2000 instrument. Demultiplexing, quality control, and adapter trimming were performed with bcl-convert v3.9.3 (Illumina). Sequence reads were imported into the KBase environment for analysis, with each genome analyzed in parallel in its own narrative (3, 4). Read quality was checked with FastQC v0.11.9, and reads were trimmed with Trimmomatic v0.36 and assembled de novo using SPAdes v3.15.3. Assemblies were annotated using RASTtk v1.073 and Prokka v1.14.5. Probable species identities were determined using both GTDB-Tk v1.7.0 and TYGS, which agreed in all cases, and metabolic predictions were made using DRAM v0.1.2 (5–18). Default parameters were used for all programs.
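As a rough illustration of what the read counts imply, sequencing depth can be estimated as (number of reads × read length) / assembly length. The sketch below applies that formula to the read counts and assembly lengths reported in Table 1. It assumes that each counted read is a single full-length 151-bp read; the table does not state whether reads or read pairs are counted, and trimming shortens reads, so the numbers are only approximate. The function and variable names are illustrative and are not part of the KBase workflow.

```python
# Rough sequencing-depth estimate: depth = reads * read_length / assembly_length.
# Assumption: every counted read is one full-length 151-bp read (not a pair),
# and trimming is ignored, so these values are approximate.

READ_LENGTH_BP = 151  # read length stated in the sequencing description

# strain -> (number of reads, total assembly length in bp), copied from Table 1
STRAINS = {
    "SC119": (7_568_064, 3_947_238),
    "SC120": (8_044_838, 5_772_725),
    "SC123": (6_292_516, 3_657_353),
    "SC124": (6_681_492, 3_837_327),
    "SC127": (6_671_362, 4_483_053),
}


def estimated_depth(n_reads: int, genome_bp: int, read_len: int = READ_LENGTH_BP) -> float:
    """Return the approximate fold coverage for one assembly."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return n_reads * read_len / genome_bp


if __name__ == "__main__":
    for strain, (reads, length) in STRAINS.items():
        print(f"{strain}: ~{estimated_depth(reads, length):.0f}x")
```

If the reported counts turn out to be read pairs rather than individual reads, the estimates would double.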

This scaffolding narrative was created by: Anna McLoon

Access to Narrative workflows and sequence data

| Strain | KBase narrative used for analyses | BioProject accession number | SRA BioSample | GenBank accession |
|---|---|---|---|---|
| Bacillus velezensis strain SC119 | https://kbase.us/n/139243/37/ | PRJNA862062 | SAMN######## | genbankID |
| Priestia megaterium strain SC120 | https://kbase.us/n/139247/20/ | PRJNA862062 | SAMN######## | genbankID |
| Bacillus zhangzhouensis strain SC123 | https://kbase.us/n/139245/53/ | PRJNA862062 | SAMN######## | genbankID |
| Bacillus pumilus strain SC124 | https://kbase.us/n/139244/41/ | PRJNA862062 | SAMN######## | genbankID |
| Bacillus idriensis strain SC127 | https://kbase.us/n/139250/65/ | PRJNA862062 | SAMN######## | genbankID |

QC, Assembly, Annotation and Taxonomic Classification

Sequence reads were imported into the KBase environment for analysis, with each genome analyzed in parallel in its own narrative (3, 4). Read quality was checked with FastQC v0.11.9, and reads were trimmed with Trimmomatic v0.36 and assembled de novo using SPAdes v3.15.3. Assemblies were annotated using RASTtk v1.073 and Prokka v1.14.5. Probable species identities were determined using both GTDB-Tk v1.7.0 and TYGS, which agreed in all cases, and metabolic predictions were made using DRAM v0.1.2 (5–18). Default parameters were used for all programs.

The five genomes were placed into a genome tree using SpeciesTreeBuilder v0.1.3 (19).

 

Table 1: Data summary

| Strain | Species | Number of reads | Number of contigs | Total length (bp) | N50 (bp) | %GC | Predicted genes (Prokka via KBase) |
|---|---|---|---|---|---|---|---|
| SC119 | Bacillus velezensis | 7,568,064 | 23 | 3,947,238 | 578,987 | 46.38 | 3,875 |
| SC120 | Priestia megaterium | 8,044,838 | 27 | 5,772,725 | 4,216,129 | 37.62 | 6,010 |
| SC123 | Bacillus zhangzhouensis | 6,292,516 | 22 | 3,657,353 | 339,228 | 41.43 | 3,713 |
| SC124 | Bacillus pumilus | 6,681,492 | 21 | 3,837,327 | 824,075 | 41.33 | 3,981 |
| SC127 | Bacillus idriensis | 6,671,362 | 15 | 4,483,053 | 675,715 | 41.01 | 4,498 |
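For readers unfamiliar with the N50 and %GC columns, the following minimal sketch shows how these statistics are commonly defined. It is not the code used by the KBase apps, whose exact conventions (for example, how ambiguous bases such as N are counted) may differ; the function names and the toy input are illustrative only.

```python
# Common definitions of two assembly summary statistics.
# Assumption: ambiguous bases (e.g. N) count toward total length but not toward GC.

def n50(contig_lengths):
    """Length L of the contig at which contigs of length >= L
    together cover at least half of the total assembly length."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def gc_percent(contigs):
    """Percentage of G and C bases across all contig sequences."""
    total = sum(len(seq) for seq in contigs)
    if total == 0:
        raise ValueError("no sequence given")
    gc = sum(seq.upper().count("G") + seq.upper().count("C") for seq in contigs)
    return 100.0 * gc / total


if __name__ == "__main__":
    toy_contigs = ["ATGCATGCAT", "GGCCGGCC", "ATATA"]  # toy data, not from this study
    print(n50([len(c) for c in toy_contigs]))
    print(round(gc_percent(toy_contigs), 2))
```

A high N50 relative to the total length, as for strain SC120, indicates that most of the assembly sits in one or a few large contigs.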

 

Genome tree output from the app "Add a user-provided GenomeSet to a KBase SpeciesTree". The following files are available only in the live Narrative: https://narrative.kbase.us/narrative/152701
  • 5_genome_tree.newick
  • 5_genome_tree-labels.newick
  • 5_genome_tree.png
  • 5_genome_tree.pdf

References

  1. Mandic-Mulec I, Stefanic P, van Elsas JD. 2015. Ecology of Bacillaceae. Microbiol Spectr 3:TBS-0017-2013.

  2. McLoon AL, Awad TT, Bogardus MF, Buono MG, Devine KA, Draper RM, Femenella B, Gallagher HM, Morelock LA, Razi M, Rennick JR, Sheridan AK, Thibault RJ, Touchette KL, Zuchowski GE. 2022. Draft Genome Sequences for 6 Isolates of Endospore-Forming Class Bacilli Species Isolated from Soil from a Suburban, Wooded, Developed Space. Microbiol Resour Announc 11:e0087422.

  3. Allen B, Drake M, Harris N, Sullivan T. 2017. Using KBase to Assemble and Annotate Prokaryotic Genomes. Curr Protoc Microbiol 46:1E.13.1-1E.13.18.

  4. Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, Dehal P, Ware D, Perez F, Canon S, Sneddon MW, Henderson ML, Riehl WJ, Murphy-Olson D, Chan SY, Kamimura RT, Kumari S, Drake MM, Brettin TS, Glass EM, Chivian D, Gunter D, Weston DJ, Allen BH, Baumohl J, Best AA, Bowen B, Brenner SE, Bun CC, Chandonia J-M, Chia J-M, Colasanti R, Conrad N, Davis JJ, Davison BH, DeJongh M, Devoid S, Dietrich E, Dubchak I, Edirisinghe JN, Fang G, Faria JP, Frybarger PM, Gerlach W, Gerstein M, Greiner A, Gurtowski J, Haun HL, He F, Jain R, Joachimiak MP, Keegan KP, Kondo S, Kumar V, Land ML, Meyer F, Mills M, Novichkov PS, Oh T, Olsen GJ, Olson R, Parrello B, Pasternak S, Pearson E, Poon SS, Price GA, Ramakrishnan S, Ranjan P, Ronald PC, Schatz MC, Seaver SMD, Shukla M, Sutormin RA, Syed MH, Thomason J, Tintle NL, Wang D, Xia F, Yoo H, Yoo S, Yu D. 2018. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nat Biotechnol 36:566–569.

  5. Andrews S. 2019. FastQC: a quality control tool for high throughput sequence data (v0.11.9). http://www.bioinformatics.babraham.ac.uk/projects/fastqc.

  6. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.

  7. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477.

  8. Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. 2020. Using SPAdes De Novo Assembler. Curr Protoc Bioinformatics 70:e102.

  9. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K, Gerdes S, Glass EM, Kubal M, Meyer F, Olsen GJ, Olson R, Osterman AL, Overbeek RA, McNeil LK, Paarmann D, Paczian T, Parrello B, Pusch GD, Reich C, Stevens R, Vassieva O, Vonstein V, Wilke A, Zagnitko O. 2008. The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9:75.

  10. Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, Edwards RA, Gerdes S, Parrello B, Shukla M, Vonstein V, Wattam AR, Xia F, Stevens R. 2014. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res 42:D206-214.

  11. Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, Olson R, Overbeek R, Parrello B, Pusch GD, Shukla M, Thomason JA, Stevens R, Vonstein V, Wattam AR, Xia F. 2015. RASTtk: a modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep 5:8365.

  12. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068–2069.

  13. Meier-Kolthoff JP, Göker M. 2019. TYGS is an automated high-throughput platform for state-of-the-art genome-based taxonomy. Nat Commun 10:2182.

  14. Meier-Kolthoff JP, Carbasse JS, Peinado-Olarte RL, Göker M. 2022. TYGS and LPSN: a database tandem for fast and reliable genome-based classification and nomenclature of prokaryotes. Nucleic Acids Res 50:D801–D807.

  15. Chaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. 2019. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics btz848.

  16. Shaffer M, Borton MA, McGivern BB, Zayed AA, La Rosa SL, Solden LM, Liu P, Narrowe AB, Rodríguez-Ramos J, Bolduc B, Gazitúa MC, Daly RA, Smith GJ, Vik DR, Pope PB, Sullivan MB, Roux S, Wrighton KC. 2020. DRAM for distilling microbial metabolism to automate the curation of microbiome function. Nucleic Acids Res 48:8883–8900.

  17. Parks DH, Chuvochina M, Waite DW, Rinke C, Skarshewski A, Chaumeil P-A, Hugenholtz P. 2018. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36:996–1004.

  18. Parks DH, Chuvochina M, Chaumeil P-A, Rinke C, Mussig AJ, Hugenholtz P. 2020. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol 38:1079–1086.

  19. Price MN, Dehal PS, Arkin AP. 2010. FastTree 2: approximately maximum-likelihood trees for large alignments. PLoS One 5:e9490. doi:10.1371/journal.pone.0009490

Apps

  1. Insert Set of Genomes Into SpeciesTree - v2.2.0
    • Price MN, Dehal PS, Arkin AP. FastTree 2 Approximately Maximum-Likelihood Trees for Large Alignments. PLoS One. 2010;5. doi:10.1371/journal.pone.0009490