Generated July 8, 2020

Complete Genome Sequence of Starkeya sp. Strain ORNL1, a Soil Alphaproteobacterium Isolated from the Rhizosphere of Populus deltoides

Introduction

This Narrative documents the analysis behind the complete genome sequence of a Starkeya species, an alphaproteobacterium. Other members of the genus have been identified in soil samples and rice straw (1-4). The publication reports the genome of a strain isolated from the rhizosphere of a Populus deltoides tree in Oak Ridge, TN.

The publication by Mircea Podar, Joel Turner, Leah H. Burdick, and Dale A. Pelletier can be found here: https://mra.asm.org/content/9/27/e00644-20

Table of Contents

  1. Prior Methods
  2. Import and Annotation
  3. Compute ANI with FastANI
  4. Metabolic Modeling
  5. References

Narrative created by Mircea Podar, edited by Zachary Crockett

Prior Methods

Sample Collection and Isolation

A rhizosphere soil sample was collected from an Eastern cottonwood tree (Populus deltoides) in Oak Ridge, Tennessee. Flow cytometry was used to isolate single cells, which were grown at 28°C. Colonies were identified using small-subunit rRNA gene sequencing, and close relatives were identified with MegaBLAST (5). The colony described here had 98% sequence identity with Starkeya novella and Starkeya koreensis.

Sequencing

The strain was grown for 2 days at 30°C. Genomic DNA was isolated and sequenced on a PacBio instrument, and the reads were filtered by quality and assembled using the PacBio SMRTLink v7.0 pipeline (6). The assembly was imported into KBase for analysis, as shown below.

Import and Annotation

The assembly was imported into KBase and annotated using RASTtk through the Annotate Microbial Assembly App.

An assembly and GenBank genome of the close relative, S. novella, were also imported for comparison.

CheckM was run on the assembly to assess genome quality.
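Before annotation, it can help to sanity-check an assembly FASTA for contig count and total length, the same figures the annotation summary reports for this genome. The following is a minimal sketch, not part of the KBase import app: the function name `assembly_stats` and the toy record are hypothetical and stand in for the real assembly file.

```python
from io import StringIO


def assembly_stats(handle):
    """Return (contig_count, total_length, gc_fraction) for a FASTA stream."""
    contigs = 0
    total = 0
    gc = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Each header line starts a new contig.
            contigs += 1
        else:
            seq = line.upper()
            total += len(seq)
            gc += seq.count("G") + seq.count("C")
    gc_fraction = gc / total if total else 0.0
    return contigs, total, gc_fraction


# Toy record (hypothetical data), not the Starkeya assembly itself.
toy_fasta = StringIO(">contig_1\nACGTGGCC\nAATT\n")
stats = assembly_stats(toy_fasta)
print(stats)
```

To run it on a real assembly, pass an open file handle in place of the `StringIO` object.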

Import a FASTA file from your staging area into your Narrative as an Assembly data object
This app completed without errors in 50s.
Objects
Created Object Name | Type | Description
Starkeya.fasta_assembly | Assembly | Imported Assembly
Annotate a bacterial or archaeal assembly using components from the RAST (Rapid Annotations using Subsystems Technology) toolkit (RASTtk).
This app completed without errors in 7m 1s.
Objects
Created Object Name | Type | Description
Starkeya_rhizosphaerae | Genome | Annotated genome
Summary
The RAST algorithm was applied to annotating a genome sequence comprised of 1 contigs containing 6286191 nucleotides. 
No initial gene calls were provided.
Standard features were called using: glimmer3; prodigal.
A scan was conducted for the following additional feature types: rRNA; tRNA; selenoproteins; pyrrolysoproteins; repeat regions; crispr.
The genome features were functionally annotated using the following algorithm(s): Kmers V2; Kmers V1; protein similarity.
In addition to the remaining original 0 coding features and 0 non-coding features, 6254 new features were called, of which 199 are non-coding.
Output genome has the following feature types:
	Coding gene                     6055 
	Non-coding repeat                146 
	Non-coding rna                    53 
Overall, the genes have 2752 distinct functions. 
The genes include 2719 genes with a SEED annotation ontology across 1316 distinct SEED functions.
The number of distinct functions can exceed the number of genes because some genes have multiple functions.
Output from Annotate Microbial Assembly
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/55377
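The feature counts in the annotation summary can be cross-checked with simple arithmetic. The sketch below copies the numbers from the summary and confirms that the per-type counts add up to the reported totals. The coding-gene density is an extra derived figure that does not appear in the app output.

```python
# Counts copied from the RASTtk summary in this Narrative.
feature_counts = {
    "Coding gene": 6055,
    "Non-coding repeat": 146,
    "Non-coding rna": 53,
}
total_called = 6254          # "6254 new features were called"
non_coding_reported = 199    # "of which 199 are non-coding"
genome_length = 6286191      # nucleotides in the single contig

# Sum the non-coding feature types and all feature types.
non_coding = sum(
    n for name, n in feature_counts.items() if name.startswith("Non-coding")
)
total = sum(feature_counts.values())

assert total == total_called
assert non_coding == non_coding_reported

# Derived figure (not reported by the app): coding genes per kilobase.
genes_per_kb = feature_counts["Coding gene"] / (genome_length / 1000)
print(f"{total} features, {non_coding} non-coding, "
      f"{genes_per_kb:.2f} coding genes per kb")
```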
Import a GenBank file from your staging area into your Narrative as a Genome data object
This app completed without errors in 2m 14s.
Objects
Created Object Name | Type | Description
Starkeya_novella_DSM_506_-_NC_014217.1.gb_genome | Genome | Imported Genome
Output from Import GenBank File as Genome from Staging Area
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/55377
Runs the CheckM lineage workflow to assess the genome quality of isolates, single cells, or genome bins from metagenome assemblies through comparison to an existing database of genomes.
This app completed without errors in 10m 17s.
Files
These are only available in the live Narrative: https://narrative.kbase.us/narrative/55377
  • CheckM_summary_table.tsv.zip - TSV Summary Table from CheckM
  • full_output.zip - Full output of CheckM
  • plots.zip - Output plots from CheckM
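Once the summary table has been downloaded from the live Narrative, completeness and contamination can be screened programmatically. The sketch below is illustrative only. The column names (`Bin Name`, `Completeness`, `Contamination`) are assumptions about the TSV layout, and the placeholder row is hypothetical, not a result for strain ORNL1. The 90% completeness and 5% contamination cutoffs are commonly used conventions, not values taken from this Narrative.

```python
import csv
from io import StringIO

# Hypothetical placeholder table: column names and values are assumptions,
# not CheckM results for strain ORNL1.
toy_tsv = (
    "Bin Name\tCompleteness\tContamination\n"
    "example_bin\t99.0\t1.0\n"
)


def passes_quality(row, min_completeness=90.0, max_contamination=5.0):
    """Return True when a row meets both (assumed) quality cutoffs."""
    return (
        float(row["Completeness"]) >= min_completeness
        and float(row["Contamination"]) <= max_contamination
    )


rows = list(csv.DictReader(StringIO(toy_tsv), delimiter="\t"))
for row in rows:
    print(row["Bin Name"], passes_quality(row))
```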

Compute ANI with FastANI

FastANI was used to compare the isolated strain with its close relative identified through rRNA gene sequencing. The comparison returned only 83% identity, well below the roughly 95% ANI commonly used as a species boundary, suggesting that the strain may be a novel species, provisionally referred to as Starkeya rhizosphaerae ORNL1.
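The species call reduces to comparing the reported ANI against a cutoff. The sketch below parses one line in the tab-separated layout that the FastANI command-line tool writes (query, reference, ANI, matched fragments, total query fragments) and applies a 95% threshold. The threshold is an assumption based on common practice. The file names and fragment counts in the example line are hypothetical, and only the 83% figure comes from this Narrative.

```python
from typing import NamedTuple

# Assumption: ~95% ANI is a commonly used species boundary.
SPECIES_ANI_THRESHOLD = 95.0


class AniHit(NamedTuple):
    query: str
    reference: str
    ani: float
    matched_fragments: int
    total_fragments: int


def parse_fastani_line(line):
    """Parse one tab-separated FastANI result line into an AniHit."""
    query, reference, ani, matched, total = line.rstrip("\n").split("\t")
    return AniHit(query, reference, float(ani), int(matched), int(total))


def same_species(hit, threshold=SPECIES_ANI_THRESHOLD):
    """Return True when the ANI meets or exceeds the species threshold."""
    return hit.ani >= threshold


# File names and fragment counts are hypothetical; 83.0 is the reported ANI.
example = "query.fasta\treference.fasta\t83.0\t1000\t2000"
hit = parse_fastani_line(example)
print(hit.ani, same_species(hit))
```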

Allows users to compute fast whole-genome Average Nucleotide Identity (ANI) estimation.
This app completed without errors in 1m 37s.

Metabolic Modeling

A metabolic model based on the annotated genome object was generated using the Build Metabolic Model App, shown below.

Generate a draft metabolic model based on an annotated genome.
This app completed without errors in 2m 21s.
Objects
Created Object Name | Type | Description
Starkeya_rhizosphaerae_model | FBAModel | FBAModel-12 Starkeya_model
Starkeya_model.gf.0 | FBA | FBA-13 Starkeya_model.gf.0
Output from Build Metabolic Model
The viewer for the output created by this App is available at the original Narrative here: https://narrative.kbase.us/narrative/55377

References

  1. Kelly DP, McDonald IR, Wood AP. 2000. Proposal for the reclassification of Thiobacillus novellus as Starkeya novella gen. nov., comb. nov., in the alpha-subclass of the Proteobacteria. Int J Syst Evol Microbiol 50:1797–1802. doi:10.1099/00207713-50-5-1797.
  2. Kappler U, Davenport K, Beatson S, Lucas S, Lapidus A, Copeland A, Berry KW, Glavina Del Rio T, Hammon N, Dalin E, Tice H, Pitluck S, Richardson P, Bruce D, Goodwin LA, Han C, Tapia R, Detter JC, Chang Y-J, Jeffries CD, Land M, Hauser L, Kyrpides NC, Goker M, Ivanova N, Klenk H-P, Woyke T. 2012. Complete genome sequence of the facultatively chemolithoautotrophic and methylotrophic alpha proteobacterium Starkeya novella type strain (ATCC 8093T). Stand Genomic Sci 7:44–58. doi:10.4056/sigs.3006378.
  3. Starkey RL. 1934. Isolation of some bacteria which oxidize thiosulfate. Soil Sci 39:197–220.
  4. Im W-T, Aslam Z, Lee M, Ten LN, Yang D-C, Lee S-T. 2006. Starkeya koreensis sp. nov., isolated from rice straw. Int J Syst Evol Microbiol 56:2409–2414. doi:10.1099/ijs.0.64093-0.
  5. Morgulis A, Coulouris G, Raytselis Y, Madden TL, Agarwala R, Schaffer AA. 2008. Database indexing for production MegaBLAST searches. Bioinformatics 24:1757–1764. doi:10.1093/bioinformatics/btn322.
  6. Tatusova T, DiCuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, Lomsadze A, Pruitt KD, Borodovsky M, Ostell J. 2016. NCBI Prokaryotic Genome Annotation Pipeline. Nucleic Acids Res 44:6614–6624. doi:10.1093/nar/gkw569.

Released Apps

  1. Annotate Microbial Assembly
    • Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
    • Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: a modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci Rep. 2015;5: 8365. doi:10.1038/srep08365
    • Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-214. doi:10.1093/nar/gkt1226
  2. Assess Genome Quality with CheckM - v1.0.18
    • Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25: 1043–1055. doi:10.1101/gr.186072.114
  3. Build Metabolic Model
    • [1] Henry CS, DeJongh M, Best AA, Frybarger PM, Linsay B, Stevens RL. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat Biotechnol. 2010;28: 977–982. doi:10.1038/nbt.1672
    • [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206–D214. doi:10.1093/nar/gkt1226
    • [3] Latendresse M. Efficiently gap-filling reaction networks. BMC Bioinformatics. 2014;15: 225. doi:10.1186/1471-2105-15-225
    • [4] Dreyfuss JM, Zucker JD, Hood HM, Ocasio LR, Sachs MS, Galagan JE. Reconstruction and Validation of a Genome-Scale Metabolic Model for the Filamentous Fungus Neurospora crassa Using FARM. PLOS Computational Biology. 2013;9: e1003126. doi:10.1371/journal.pcbi.1003126
    • [5] Mahadevan R, Schilling CH. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab Eng. 2003;5: 264–276.
  4. Import FASTA File as Assembly from Staging Area
    no citations
  5. Import GenBank File as Genome from Staging Area
    no citations

Apps in Beta

  1. Compute ANI with FastANI
    • [1] Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High-throughput ANI Analysis of 90K Prokaryotic Genomes Reveals Clear Species Boundaries. 2017; doi:10.1101/225342
    • [2] Goris J, Konstantinidis KT, Klappenbach JA, Coenye T, Vandamme P, Tiedje JM. DNA-DNA hybridization values and their relationship to whole-genome sequence similarities. Int J Syst Evol Microbiol. 2007;57: 81–91. doi:10.1099/ijs.0.64483-0