The publication by Anna L. McLoon, Prince Ackaah Asante, Thomas Anderson, Kellyanne Cahill, Delana Cochrane, Keira Cohen, Jaylene German, Christian Hrubes, Isabella LaCroix, Killian McNamee, Anna Mossakowski, Aidan M. Nichter, Jessica L. Pepe and Andrew T. Schofield can be found here: [URL will be available after publication]
This scaffolding narrative was created by: Anna McLoon
Strain | KBase narrative used for analyses | BioProject accession number | SRA BioSample | GenBank accession |
Bacillus velezensis strain SC119 | | PRJNA862062 | SAMN37195270 | JAVIJC000000000 |
Priestia megaterium strain SC120 | | PRJNA862062 | SAMN37195271 | JAVIJB000000000 |
Bacillus zhangzhouensis strain SC123 | | PRJNA862062 | SAMN37195272 | JAVIJA000000000 |
Bacillus pumilus strain SC124 | | PRJNA862062 | SAMN37195273 | JAVIJZ000000000 |
Bacillus idriensis strain SC127 | | PRJNA862062 | SAMN37195274 | JAVIJY000000000 |
Sequence reads were imported into the KBase environment for analysis, with each genome analyzed in parallel in its own narrative (3, 4). Read quality was checked with FastQC v0.11.9, and reads were trimmed with Trimmomatic v0.36 and assembled de novo using SPAdes v3.15.3. Assemblies were annotated using RASTtk v1.073 and Prokka v1.14.5. Probable species identities were determined using both GTDB-Tk v1.7.0 and TYGS, which were in agreement in all cases, and metabolic predictions were made using DRAM v0.1.2 (5–18). Default parameters were used for all programs.
The five genomes were placed into a genome tree using SpeciesTreeBuilder v0.1.3 (19).
Table 1: Data summary
Strain | Species | Number of reads | Number of contigs | Total length (bp) | N50 (bp) | GC content (%) | Predicted genes (Prokka via KBase) |
SC119 | Bacillus velezensis | 7,568,064 | 23 | 3,947,238 | 578,987 | 46.38 | 3,875 |
SC120 | Priestia megaterium | 8,044,838 | 27 | 5,772,725 | 4,216,129 | 37.62 | 6,010 |
SC123 | Bacillus zhangzhouensis | 6,292,516 | 22 | 3,657,353 | 339,228 | 41.43 | 3,713 |
SC124 | Bacillus pumilus | 6,681,492 | 21 | 3,837,327 | 824,075 | 41.33 | 3,981 |
SC127 | Bacillus idriensis | 6,671,362 | 15 | 4,483,053 | 675,715 | 41.01 | 4,498 |
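Table 1 reports an N50 and a GC percentage for each assembly. As a minimal sketch of how those two summary statistics are conventionally defined, the Python below computes them from a list of contig sequences. The function names and the toy contigs are illustrative assumptions; they are not the KBase implementation and not the actual assemblies, and the values in Table 1 were taken from the KBase reports rather than recomputed this way.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length of the contig at which the running total, taken over
    contigs sorted from longest to shortest, first reaches half of the
    total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")


def gc_percent(contigs):
    """Return the percentage of G and C bases across all contigs."""
    total = sum(len(seq) for seq in contigs)
    if total == 0:
        raise ValueError("no sequence supplied")
    gc = sum(seq.upper().count("G") + seq.upper().count("C") for seq in contigs)
    return 100.0 * gc / total


# Hypothetical toy contigs, for illustration only.
toy_contigs = ["ATGCGC", "ATAT", "GG"]
toy_n50 = n50(len(seq) for seq in toy_contigs)
toy_gc = gc_percent(toy_contigs)
print(f"N50 = {toy_n50} bp, GC = {toy_gc:.2f}%")
```

Because N50 depends only on the distribution of contig lengths, a single very long contig can dominate it, which is consistent with SC120 having an N50 far larger than the other four assemblies despite having the most contigs.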
Mandic-Mulec I, Stefanic P, van Elsas JD. 2015. Ecology of Bacillaceae. Microbiol Spectr 3:TBS-0017-2013.
McLoon AL, Awad TT, Bogardus MF, Buono MG, Devine KA, Draper RM, Femenella B, Gallagher HM, Morelock LA, Razi M, Rennick JR, Sheridan AK, Thibault RJ, Touchette KL, Zuchowski GE. 2022. Draft Genome Sequences for 6 Isolates of Endospore-Forming Class Bacilli Species Isolated from Soil from a Suburban, Wooded, Developed Space. Microbiol Resour Announc 11:e0087422.
Allen B, Drake M, Harris N, Sullivan T. 2017. Using KBase to Assemble and Annotate Prokaryotic Genomes. Curr Protoc Microbiol 46:1E.13.1-1E.13.18.
Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, Dehal P, Ware D, Perez F, Canon S, Sneddon MW, Henderson ML, Riehl WJ, Murphy-Olson D, Chan SY, Kamimura RT, Kumari S, Drake MM, Brettin TS, Glass EM, Chivian D, Gunter D, Weston DJ, Allen BH, Baumohl J, Best AA, Bowen B, Brenner SE, Bun CC, Chandonia J-M, Chia J-M, Colasanti R, Conrad N, Davis JJ, Davison BH, DeJongh M, Devoid S, Dietrich E, Dubchak I, Edirisinghe JN, Fang G, Faria JP, Frybarger PM, Gerlach W, Gerstein M, Greiner A, Gurtowski J, Haun HL, He F, Jain R, Joachimiak MP, Keegan KP, Kondo S, Kumar V, Land ML, Meyer F, Mills M, Novichkov PS, Oh T, Olsen GJ, Olson R, Parrello B, Pasternak S, Pearson E, Poon SS, Price GA, Ramakrishnan S, Ranjan P, Ronald PC, Schatz MC, Seaver SMD, Shukla M, Sutormin RA, Syed MH, Thomason J, Tintle NL, Wang D, Xia F, Yoo H, Yoo S, Yu D. 2018. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nat Biotechnol 36:566–569.
Andrews S. 2019. FastQC: a quality control tool for high throughput sequence data (v0.11.9). http://www.bioinformatics.babraham.ac.uk/projects/fastqc.
Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.
Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477.
Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. 2020. Using SPAdes De Novo Assembler. Curr Protoc Bioinformatics 70:e102.
Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K, Gerdes S, Glass EM, Kubal M, Meyer F, Olsen GJ, Olson R, Osterman AL, Overbeek RA, McNeil LK, Paarmann D, Paczian T, Parrello B, Pusch GD, Reich C, Stevens R, Vassieva O, Vonstein V, Wilke A, Zagnitko O. 2008. The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9:75.
Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, Edwards RA, Gerdes S, Parrello B, Shukla M, Vonstein V, Wattam AR, Xia F, Stevens R. 2014. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res 42:D206-214.
Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, Olson R, Overbeek R, Parrello B, Pusch GD, Shukla M, Thomason JA, Stevens R, Vonstein V, Wattam AR, Xia F. 2015. RASTtk: a modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep 5:8365.
Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068–2069.
Meier-Kolthoff JP, Göker M. 2019. TYGS is an automated high-throughput platform for state-of-the-art genome-based taxonomy. Nat Commun 10:2182.
Meier-Kolthoff JP, Carbasse JS, Peinado-Olarte RL, Göker M. 2022. TYGS and LPSN: a database tandem for fast and reliable genome-based classification and nomenclature of prokaryotes. Nucleic Acids Res 50:D801–D807.
Chaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. 2019. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics btz848.
Shaffer M, Borton MA, McGivern BB, Zayed AA, La Rosa SL, Solden LM, Liu P, Narrowe AB, Rodríguez-Ramos J, Bolduc B, Gazitúa MC, Daly RA, Smith GJ, Vik DR, Pope PB, Sullivan MB, Roux S, Wrighton KC. 2020. DRAM for distilling microbial metabolism to automate the curation of microbiome function. Nucleic Acids Res 48:8883–8900.
Parks DH, Chuvochina M, Waite DW, Rinke C, Skarshewski A, Chaumeil P-A, Hugenholtz P. 2018. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36:996–1004.
Parks DH, Chuvochina M, Chaumeil P-A, Rinke C, Mussig AJ, Hugenholtz P. 2020. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol 38:1079–1086.
Price MN, Dehal PS, Arkin AP. 2010. FastTree 2: approximately maximum-likelihood trees for large alignments. PLoS One 5:e9490. doi:10.1371/journal.pone.0009490.