Build Metabolic Model


By: chenry


Generate a draft metabolic model based on an annotated genome.

The Model SEED pipeline [1] was implemented within KBase to enable users to build genome-scale metabolic models (GEMs) using data they have imported or generated with other tools in the system. This overview of the Model SEED pipeline details the steps for automated reconstruction of GEMs using KBase.

Step 1 - Re-annotating Imported Genomes

Genomes imported into KBase must be re-annotated using the RAST functional ontology (Annotate Microbial Genome) before users can build a draft metabolic model for an organism. This step is necessary because the SEED functional annotations generated by RAST [2] are linked directly to the biochemical reactions in the ModelSEED biochemistry database, which is used by KBase for metabolic modeling.

Step 2 - Preliminary Reconstruction

Once a genome has been annotated using the RAST functional ontology, it can be fed into the pipeline for preliminary reconstruction, in which the RAST annotations are used to generate a draft metabolic model. A draft metabolic model consists of a reaction network complete with gene-protein-reaction (GPR) associations, predicted Gibbs free energy of reaction values, and a biomass reaction. The biomass reaction is organism-specific and is built from a biomass reaction template. The template uses the SEED subsystems and RAST functional annotations to assign non-universal biomass components (e.g., cofactors, lipids, and cell wall components), which represent biological functions that may be exhibited by a large set of organisms or be specific to a small set of organisms. The biomass templates can be found on GitHub.
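
The template-driven selection of biomass components can be pictured with a small sketch. This is not the Model SEED template format: the `TemplateComponent` class, the compound names, and the role names are all hypothetical, and the rule that every listed role must be present is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateComponent:
    """One entry of a hypothetical biomass template."""
    compound: str
    universal: bool = True
    # Functional roles the genome must carry for a non-universal component.
    required_roles: frozenset = frozenset()


def build_biomass(template, genome_roles):
    """Return the compounds that enter the organism-specific biomass reaction."""
    genome_roles = set(genome_roles)
    selected = []
    for component in template:
        # Universal components are always kept; non-universal ones only
        # when all of their required roles are annotated in the genome.
        if component.universal or component.required_roles <= genome_roles:
            selected.append(component.compound)
    return selected


# Illustrative data only; none of these names come from the real templates.
template = [
    TemplateComponent("ATP"),
    TemplateComponent("L-alanine"),
    TemplateComponent("peptidoglycan", universal=False,
                      required_roles=frozenset({"role A"})),
    TemplateComponent("menaquinone", universal=False,
                      required_roles=frozenset({"role B", "role C"})),
]
genome_roles = {"role A", "role B"}
biomass = build_biomass(template, genome_roles)
```

In this example the genome lacks "role C", so the menaquinone entry is left out while the peptidoglycan entry is kept.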

For an organism-specific biomass component to be added to the biomass reaction, the genome must contain the subsystems and annotations specified for that component in the template. The GPR associations represent the mapping between the biochemical reactions and the standardized functional roles assigned to genes during the RAST annotation. This mapping allows the pipeline to differentiate between cases where protein products from multiple genes form a complex to catalyze a reaction, and cases where protein products from multiple genes can independently catalyze the same reaction. The draft model includes every reaction associated with at least one enzyme that is encoded in the genome and identified in the annotations. Additionally, spontaneous reactions are added during this step.
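
The difference between a complex and a set of isozymes can be made concrete with a minimal sketch. It assumes each GPR is stored as a list of alternative complexes, each complex being a set of gene IDs (an "OR of ANDs" form). The function names, gene IDs, and reaction IDs are hypothetical, and the real pipeline works from functional roles rather than from a precomputed gene-level rule.

```python
def reaction_is_supported(gpr, genome_genes):
    """True if at least one alternative complex is fully encoded in the genome.

    gpr is a list of complexes; each complex is a set of genes that must
    all be present (AND). Separate complexes are isozymes (OR).
    """
    return any(complex_genes <= genome_genes for complex_genes in gpr)


def draft_reactions(gprs, genome_genes, spontaneous):
    """Collect gene-supported reactions plus spontaneous reactions."""
    supported = {
        rxn for rxn, gpr in gprs.items()
        if reaction_is_supported(gpr, genome_genes)
    }
    return supported | set(spontaneous)


# Illustrative data only.
gprs = {
    "rxn_complex": [frozenset({"geneA", "geneB"})],            # A AND B
    "rxn_isozymes": [frozenset({"geneC"}), frozenset({"geneD"})],  # C OR D
    "rxn_missing": [frozenset({"geneE"})],
}
genome_genes = {"geneA", "geneB", "geneD"}
draft = draft_reactions(gprs, genome_genes, {"rxn_spontaneous"})
```

Here `rxn_isozymes` is kept because one of its two alternative genes is present, whereas a complex with only one of its subunits present would not be counted as supported under this simplified rule.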

Step 3 - Initial Gapfilling

This step is optional, but it is recommended and runs by default. A checkbox in the advanced options of the Build Metabolic Model App can be unchecked to allow model reconstruction without gapfilling. To gapfill a draft metabolic model separately or to perform additional gapfilling analysis, please see the Gapfill Metabolic Model App.

The quality of a draft metabolic model depends on the completeness of the annotated genome used for the preliminary reconstruction. Because most genomes are not completely annotated, draft metabolic models usually contain gaps that prevent the production of some biomass components. In this step, an optimization algorithm identifies the minimal set of reactions that must be added to each model to fill these gaps [3, 4]. The gapfilling algorithm is described in detail here. Reactions to be used by gapfilling are selected from the Model SEED biochemistry database. This curated database contains mass- and charge-balanced reactions, standardized to aqueous conditions at neutral pH. The Model SEED reaction database integrates biochemistry contained in KEGG, MetaCyc, EcoCyc, Plant BioCyc, Plant Metabolic Networks, and Gramene. This step is conducted to ensure that every model is capable of simulating cell growth.
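
The idea of adding the fewest reactions needed to restore biomass production can be illustrated with a toy example. This is not the KBase gapfilling algorithm, which is an optimization over fluxes [3, 4]. The sketch swaps in a brute-force search over candidate subsets combined with a simple reachability check, and every reaction and compound name in it is hypothetical.

```python
from itertools import combinations


def producible(reactions, media):
    """Expand the set of reachable compounds until nothing new is produced.

    reactions is an iterable of (substrates, products) pairs of sets.
    """
    available = set(media)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def gapfill(model, candidates, media, biomass):
    """Return a smallest set of candidate reaction names that makes every
    biomass compound reachable, or None if no subset works."""
    for size in range(len(candidates) + 1):
        for names in combinations(sorted(candidates), size):
            network = list(model.values()) + [candidates[n] for n in names]
            if biomass <= producible(network, media):
                return set(names)
    return None


# Illustrative data only.
model = {"r1": ({"glc"}, {"pyr"})}
candidates = {
    "c1": ({"pyr"}, {"ala"}),
    "c2": ({"glc"}, {"x"}),
    "c3": ({"x"}, {"ala"}),
}
solution = gapfill(model, candidates, media={"glc"}, biomass={"pyr", "ala"})
```

The search tries subsets in order of increasing size, so the single reaction `c1` is preferred over the two-reaction route through `c2` and `c3`. Exhaustive search is only workable for a handful of candidates, which is one reason a real gapfilling step is posed as an optimization problem instead.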

Step 4 - Flux Balance Analysis

Once model reconstruction is complete, Flux Balance Analysis (FBA) can be applied to assess reaction essentiality and the capacity of reactions to carry flux. The Run FBA method uses Flux Variability Analysis (FVA) [5] to classify the reactions in KBase models as essential, active, or blocked. Reactions that must carry flux for growth to occur are classified as essential; reactions that can carry flux but are not required to are classified as active; and reactions that are unable to carry flux are classified as blocked. Genes associated with reactions classified as essential are in turn classified as essential, provided that no alternative isozymes exist for those genes. Additionally, FBA is used to iteratively assess which compounds in the in silico media formulation are essential for the metabolic model to produce biomass. These results provide clues for additional manual curation efforts toward completely annotating the genome.
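
The three reaction classes follow directly from the minimum and maximum flux that FVA reports for each reaction. The sketch below assumes those ranges were computed with growth required, and it treats a reaction as essential when its range excludes zero. The function names, the tolerance, the flux values, and the gene and reaction IDs are hypothetical and are not taken from the Run FBA implementation.

```python
def classify_reaction(min_flux, max_flux, tol=1e-9):
    """Classify a reaction from its FVA flux range."""
    if abs(min_flux) <= tol and abs(max_flux) <= tol:
        return "blocked"      # cannot carry flux at all
    if min_flux > tol or max_flux < -tol:
        return "essential"    # range excludes zero: flux is required
    return "active"           # can carry flux, but zero is also feasible


def essential_genes(classes, gprs):
    """Genes whose loss disables an essential reaction.

    gprs maps a reaction to a list of alternative complexes (sets of genes).
    A gene counts as essential only if it appears in every alternative,
    i.e. no isozyme can replace it.
    """
    genes = set()
    for rxn, label in classes.items():
        complexes = gprs.get(rxn)
        if label == "essential" and complexes:
            genes |= set.intersection(*map(set, complexes))
    return genes


# Illustrative FVA output: reaction -> (minimum flux, maximum flux).
ranges = {
    "rxn1": (0.5, 2.0),
    "rxn2": (-1.0, 3.0),
    "rxn3": (0.0, 0.0),
    "rxn4": (-4.0, -0.1),
}
classes = {rxn: classify_reaction(lo, hi) for rxn, (lo, hi) in ranges.items()}
gprs = {
    "rxn1": [frozenset({"g1"}), frozenset({"g2"})],   # isozymes
    "rxn4": [frozenset({"g3", "g4"})],                # one complex
}
genes = essential_genes(classes, gprs)
```

In this example `rxn1` is essential but neither of its genes is, because either isozyme alone can sustain the reaction. Both subunits of the `rxn4` complex are flagged.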

Team members who developed and deployed the algorithm in KBase: Chris Henry, Janaka Edirisinghe, Sam Seaver, Neal Conrad. For questions, please contact us.

Related Publications

App Specification:

Module Commit: 584206644abfeb5f3184783aaa27b3a0993ca583