Generate a draft metabolic model based on an annotated genome.
The Model SEED pipeline was implemented within KBase to enable users to build genome-scale metabolic models (GEMs) using data they have imported or generated with other tools in the system. This overview of the Model SEED pipeline details the steps for automated reconstruction of GEMs using KBase.
Step 1 - Re-annotating Imported Genomes
Genomes imported into KBase must be re-annotated using the RAST functional ontology (Annotate Microbial Genome) before users can build a draft metabolic model for an organism. This step is necessary because the SEED functional annotations generated by RAST are linked directly to the biochemical reactions in the ModelSEED biochemistry database, which is used by KBase for metabolic modeling.
Step 2 - Preliminary Reconstruction
Once a genome has been annotated using the RAST functional ontology, it can be fed into the pipeline for preliminary reconstruction, in which the RAST annotations are used to generate a draft metabolic model. A draft metabolic model comprises a reaction network complete with gene-protein-reaction (GPR) associations, predicted Gibbs free energy of reaction values, and a biomass reaction. The biomass reaction is organism-specific: it is built from a biomass reaction template, which uses the SEED subsystems and RAST functional annotations to assign non-universal biomass components (e.g., cofactors, lipids, cell wall components). These components represent biological functions that are either exhibited by a large set of organisms or specific to a small set of organisms. The biomass templates can be found on GitHub.
For an organism-specific biomass component to be added to the biomass reaction, the organism's genome must contain the subsystems and annotations specified in the template. The GPR associations represent the mapping between the biochemical reactions and the standardized functional roles assigned to genes during the RAST annotation. This mapping allows the pipeline to differentiate between cases where protein products from multiple genes form a complex to catalyze a reaction, and cases where protein products from multiple genes can independently catalyze the same reaction. The draft model includes all reactions associated with one or more enzymes that are encoded in the genome and identified in the annotations. Additionally, spontaneous reactions are added during this step.
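The distinction between complexes and isozymes can be illustrated with a minimal Python sketch. The data layout (a list of gene sets, where each set is one enzyme and the sets are alternatives), the function name `gpr_satisfied`, and the gene IDs are all hypothetical and are not the KBase data model. The sketch also assumes that a complex counts as present only when every subunit gene is found in the genome.

```python
def gpr_satisfied(gpr, genome_genes):
    """Return True if at least one alternative enzyme is fully encoded.

    gpr: list of sets. Each set is one enzyme (a complex when it holds
    several genes); the sets in the list are alternatives (isozymes).
    genome_genes: set of gene IDs annotated in the genome.
    """
    return any(enzyme <= genome_genes for enzyme in gpr)


# Hypothetical gene IDs, for illustration only.
genome = {"geneA", "geneB", "geneD"}

complex_rxn = [{"geneA", "geneB"}]      # geneA AND geneB form one complex
isozyme_rxn = [{"geneC"}, {"geneD"}]    # geneC OR geneD, independent enzymes
missing_subunit = [{"geneA", "geneC"}]  # complex with geneC absent

print(gpr_satisfied(complex_rxn, genome))      # True
print(gpr_satisfied(isozyme_rxn, genome))      # True
print(gpr_satisfied(missing_subunit, genome))  # False
```

Representing a GPR as "OR over ANDs" keeps the two cases separate: adding a gene to a set makes the reaction harder to satisfy, while adding a new set makes it easier.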
Step 3 - Initial Gapfilling
This step is optional, but it is recommended and runs by default. A checkbox in the advanced options of the Build Metabolic Model App can be unchecked to allow model reconstruction without gapfilling. To gapfill the draft metabolic model or to perform additional gapfilling analysis, please see the Gapfill Metabolic Model App.
The quality of a draft metabolic model depends on the completeness of the annotated genome used for the preliminary reconstruction. Because most genomes are not completely annotated, draft metabolic models usually contain gaps that prevent the production of some biomass components. In this step, an optimization algorithm identifies the minimal set of reactions that must be added to each model to fill these gaps [3, 4]. The gapfilling algorithm is described in detail here. Reactions used for gapfilling are selected from the Model SEED biochemistry database. This curated database contains mass- and charge-balanced reactions, standardized to aqueous conditions at neutral pH, and integrates biochemistry contained in KEGG, MetaCyc, EcoCyc, Plant BioCyc, Plant Metabolic Networks, and Gramene. This step is conducted to ensure that every model is capable of simulating cell growth.
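The idea of "the minimal set of reactions that fills the gaps" can be shown on a toy network. The sketch below is not the KBase gapfilling algorithm, which is an optimization over a flux model. It swaps in a much simpler technique: brute-force search over candidate subsets combined with a reachability check on compounds. All reaction IDs, compound names, and function names are hypothetical.

```python
from itertools import combinations


def producible(reactions, media):
    """Expand the set of available compounds until nothing new is produced."""
    available = set(media)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def gapfill(model, candidates, media, biomass):
    """Return the smallest set of candidate reaction IDs that makes every
    biomass compound producible, or None if no subset works."""
    for size in range(len(candidates) + 1):
        for added in combinations(sorted(candidates), size):
            rxns = list(model.values()) + [candidates[r] for r in added]
            if biomass <= producible(rxns, media):
                return set(added)
    return None


# Toy draft model: each reaction is (substrates, products).
model = {
    "r1": ({"glc"}, {"g6p"}),
    "r3": ({"pyr"}, {"ala"}),  # cannot run: nothing in the draft makes pyr
}
# Toy stand-in for the reaction database used as a gapfilling source.
candidates = {
    "c1": ({"g6p"}, {"pyr"}),
    "c2": ({"glc"}, {"x"}),
    "c3": ({"x"}, {"y"}),
}

print(gapfill(model, candidates, media={"glc"}, biomass={"ala"}))  # {'c1'}
```

Searching subsets in order of increasing size guarantees that the first solution found is a smallest one, at the cost of exponential running time; this is only workable for a handful of candidates.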
Step 4 - Flux Balance Analysis
Once model reconstruction is complete, Flux Balance Analysis (FBA) can be applied to assess reaction essentiality and the capacity of reactions to carry flux. The Run FBA method uses Flux Variability Analysis (FVA) to classify the reactions in KBase models as essential, active, or blocked. Reactions that must carry flux for growth to occur are classified as essential; reactions that can, but need not, carry flux are classified as active; and reactions that are unable to carry flux are classified as blocked. Genes encoding reactions that are classified as essential are in turn classified as essential, as long as no alternative isozymes exist for these genes. Additionally, FBA is used to iteratively assess which compounds in the in silico media formulation are essential for the metabolic model to be able to produce biomass. These results provide clues for additional manual curation efforts toward completely annotating the genome.
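The classification rules above can be written down as a short Python sketch. It assumes the FVA minimum and maximum flux of each reaction have already been computed with growth required; the function names, the tolerance value, the reaction and gene IDs, and the GPR layout (a list of alternative gene sets per reaction) are hypothetical and do not reflect the Run FBA implementation.

```python
def classify_reaction(min_flux, max_flux, tol=1e-9):
    """Classify a reaction from its FVA flux range."""
    if abs(min_flux) <= tol and abs(max_flux) <= tol:
        return "blocked"    # can never carry flux
    if min_flux > tol or max_flux < -tol:
        return "essential"  # range excludes zero: flux is required
    return "active"         # range includes zero: flux is optional


def essential_genes(classes, gprs):
    """Collect genes whose loss disables an essential reaction.

    gprs maps a reaction ID to a list of gene sets, one set per
    alternative enzyme. A gene is essential only if it appears in
    every alternative, so isozymes rescue each other.
    """
    essential = set()
    for rxn, cls in classes.items():
        alternatives = gprs.get(rxn, [])
        if cls == "essential" and alternatives:
            essential |= set.intersection(*map(set, alternatives))
    return essential


# Hypothetical FVA ranges (min, max) and GPRs.
ranges = {
    "rxn_a": (2.0, 5.0),
    "rxn_b": (0.0, 3.0),
    "rxn_c": (0.0, 0.0),
    "rxn_d": (-4.0, -1.0),
}
gprs = {
    "rxn_a": [{"g1", "g2"}],    # one complex, no isozyme
    "rxn_b": [{"g5"}],
    "rxn_d": [{"g3"}, {"g4"}],  # two isozymes
}

classes = {rxn: classify_reaction(lo, hi) for rxn, (lo, hi) in ranges.items()}
print(classes)
print(sorted(essential_genes(classes, gprs)))  # ['g1', 'g2']
```

In this example `rxn_d` is essential but its genes are not, because either isozyme alone can keep the reaction running.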
Team members who developed and deployed the algorithm in KBase: Chris Henry, Janaka Edirisinghe, Sam Seaver, Neal Conrad. For questions, please contact us.
- [1] Henry CS, DeJongh M, Best AA, Frybarger PM, Linsay B, Stevens RL. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat Biotechnol. 2010;28: 977–982. doi:10.1038/nbt.1672, https://www.ncbi.nlm.nih.gov/pubmed/20802497
- [2] Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206–D214. doi:10.1093/nar/gkt1226, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965101/
- [3] Latendresse M. Efficiently gap-filling reaction networks. BMC Bioinformatics. 2014;15: 225. doi:10.1186/1471-2105-15-225, https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-15-225
- [4] Dreyfuss JM, Zucker JD, Hood HM, Ocasio LR, Sachs MS, Galagan JE. Reconstruction and Validation of a Genome-Scale Metabolic Model for the Filamentous Fungus Neurospora crassa Using FARM. PLOS Computational Biology. 2013;9: e1003126. doi:10.1371/journal.pcbi.1003126, https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003126
- [5] Mahadevan R, Schilling CH. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab Eng. 2003;5: 264–276. https://www.ncbi.nlm.nih.gov/pubmed/14642354
Module Commit: 584206644abfeb5f3184783aaa27b3a0993ca583