Reconstruct the metabolic network of a plant based on an annotated genome.
The PlantSEED pipeline [1-3] was implemented within KBase to enable users to reconstruct genome-scale metabolic networks of plant primary metabolism using data they have imported or generated with other tools in the system. This overview of the PlantSEED pipeline details the steps for automated reconstruction of plant primary metabolism using KBase.
Step 1 - Re-annotating Imported Genomes
Genomes imported into KBase, whether uploaded by the user or copied from the Phytozome Genomes in the publicly available data, must be re-annotated using the PlantSEED functional ontology. This step is necessary because the functional annotations curated by the PlantSEED project are linked directly to the biochemical reactions in the ModelSEED biochemistry, which KBase uses for metabolic modeling.
There are two approaches to annotating plant genomes. The first uses a set of signature k-mers that were trained by the PlantSEED project to be unique to each functional annotation. The app for annotating sequences using k-mers is Annotate Plant Transcripts with Metabolic Functions; it takes 5-10 minutes to run. The second app, Annotate Plant Enzymes with OrthoFinder, uses a set of protein families generated by OrthoFinder. It inserts the users' sequences into the families and clusters them according to their functional annotations; it takes 6-8 hours to run, but the resulting annotation has higher precision.
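The idea behind signature k-mers can be illustrated with a minimal sketch. This is not the PlantSEED implementation: the function names, the k-mer length, the `min_hits` threshold, and the toy signature table are all assumptions made for illustration. Each signature k-mer maps to exactly one function, and a sequence is assigned the function whose signatures it matches most often.

```python
from collections import Counter

def kmers(seq, k):
    """Yield every overlapping k-mer in a sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def annotate(seq, signatures, k, min_hits=2):
    """Return the function with the most signature k-mer hits, or None.

    `signatures` maps a k-mer to the single function it is a signature for.
    `min_hits` is a hypothetical threshold guarding against chance matches.
    """
    hits = Counter(signatures[km] for km in kmers(seq, k) if km in signatures)
    if not hits:
        return None
    function, count = hits.most_common(1)[0]
    return function if count >= min_hits else None

# Toy signature table (hypothetical k-mers and function names, k = 4).
signatures = {
    "MKVL": "Hexokinase",
    "KVLA": "Hexokinase",
    "GGSP": "Rubisco small subunit",
}

print(annotate("MKVLAGT", signatures, k=4))   # two Hexokinase signatures match
print(annotate("AAGGSPAA", signatures, k=4))  # one match only, below min_hits
```

Because the lookup is a single pass over the sequence with dictionary access per k-mer, this style of annotation is fast, which is consistent with the short run time quoted above for the k-mer app.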
Step 2 - Reconstruction
Once a genome has been annotated using the PlantSEED functional ontology, it can be fed into this app for the reconstruction of plant primary metabolism, wherein the PlantSEED annotations are used to link the users' sequences to biochemical reactions in the resulting metabolic network, in the form of gene-protein-reaction (GPR) associations. The reconstruction process also adds a plant-specific biomass reaction, curated for the leaf.
The GPR associations, which represent the link between the biochemical reactions and the PlantSEED functional annotations, allow the pipeline to differentiate between cases where protein products from multiple genes form a complex to catalyze a reaction, and cases where protein products from multiple genes can independently catalyze the same reaction. The reconstruction includes all reactions that have been curated as part of plant primary metabolism, even those for which no GPR association has been formed. Notably, the reconstruction does not yet include secondary metabolism. Additionally, spontaneous reactions are added during this step.
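The distinction between complexes and independent enzymes can be sketched as boolean logic over genes. The representation below is an assumption chosen for illustration, not the KBase data model: a GPR is a list of alternative complexes (OR), and each complex is a set of genes that must all be present (AND). The gene identifiers are hypothetical.

```python
def reaction_is_catalyzed(gpr, present_genes):
    """Return True if at least one complex has all of its genes present.

    `gpr` is a list of alternatives (OR); each alternative is a set of
    genes whose products must all be available (AND).
    """
    return any(complex_genes <= present_genes for complex_genes in gpr)

# Products of geneA and geneB form one complex: both are required.
gpr_complex = [frozenset({"geneA", "geneB"})]

# geneC and geneD encode independent enzymes: either one suffices.
gpr_independent = [frozenset({"geneC"}), frozenset({"geneD"})]

print(reaction_is_catalyzed(gpr_complex, {"geneA"}))           # False
print(reaction_is_catalyzed(gpr_complex, {"geneA", "geneB"}))  # True
print(reaction_is_catalyzed(gpr_independent, {"geneC"}))       # True
```

Under this representation, a reaction with an empty GPR is never reported as gene-catalyzed, which mirrors the case described above where a curated reaction is kept in the model without a GPR association.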
Step 3 - Flux Balance Analysis
Once model reconstruction is complete, Flux Balance Analysis (FBA) can be applied to assess reaction essentiality and the capacity of reactions to carry flux. The Run FBA method uses Flux Variability Analysis (FVA) to classify the reactions in KBase models as essential, active, or blocked. Reactions that must carry flux for growth to occur are classified as essential; reactions that can carry flux but are not required to are classified as active; and reactions that cannot carry flux are classified as blocked. Genes catalyzing essential reactions are in turn classified as essential, provided that no alternative isozymes exist for them. In addition, the PlantSEED project provides two publicly available media formulations for use with reconstructions of plant primary metabolism: PlantHeterotrophicMedia and PlantAutotrophicMedia, so named because they use sucrose and carbon dioxide, respectively, as the carbon source.
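The classification rules above can be sketched from FVA output, assuming FVA has been run with growth required and has returned a minimum and maximum flux for each reaction. The function names, the tolerance, the GPR layout (a list of alternative gene sets), and the toy numbers are assumptions for illustration; the Run FBA method may differ in its details.

```python
def classify_reaction(min_flux, max_flux, tol=1e-9):
    """Classify a reaction from its FVA flux range under a growth requirement."""
    if abs(min_flux) <= tol and abs(max_flux) <= tol:
        return "blocked"    # can never carry flux
    if min_flux > tol or max_flux < -tol:
        return "essential"  # range excludes zero: flux is required
    return "active"         # range includes zero: flux is optional

def essential_genes(classes, gprs):
    """Genes required by every alternative of some essential reaction."""
    result = set()
    for rxn, label in classes.items():
        if label != "essential":
            continue
        alternatives = gprs.get(rxn, [])
        for gene in set().union(*alternatives):
            # A gene with an isozyme is absent from at least one alternative.
            if all(gene in alt for alt in alternatives):
                result.add(gene)
    return result

# Toy FVA ranges (hypothetical reaction ids and values).
fva_ranges = {
    "rxn_a": (1.0, 5.0),
    "rxn_b": (0.0, 3.0),
    "rxn_c": (0.0, 0.0),
    "rxn_d": (-4.0, -0.5),
}
classes = {rxn: classify_reaction(lo, hi) for rxn, (lo, hi) in fva_ranges.items()}

# rxn_a needs a two-gene complex; rxn_d has two isozymes.
gprs = {"rxn_a": [{"g1", "g2"}], "rxn_d": [{"g3"}, {"g4"}]}

print(classes)
print(sorted(essential_genes(classes, gprs)))
```

In this toy case both genes of the complex come out as essential, while neither isozyme of `rxn_d` does, because the other enzyme can still catalyze the reaction.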
- Seaver SMD, Lerma-Ortiz C, Conrad N, Mikaili A, Sreedasyam A, Hanson AD, et al. PlantSEED enables automated annotation and reconstruction of plant primary metabolism with improved compartmentalization and comparative consistency. Plant J. 2018;95: 1102-1113. doi:10.1111/tpj.14003, https://www.ncbi.nlm.nih.gov/pubmed/29924895
- Seaver SMD, Gerdes S, Frelin O, Lerma-Ortiz C, Bradbury LMT, Zallot R, et al. High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource. Proc Natl Acad Sci USA. 2014;111: 9645-9650. doi:10.1073/pnas.1401329111, https://www.ncbi.nlm.nih.gov/pubmed/24927599
- GitHub source: https://github.com/ModelSEED/PlantSEED/
- Emms DM, Kelly S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 2015;16. doi:10.1186/s13059-015-0721-2, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4531804/
- Henry CS, DeJongh M, Best AA, Frybarger PM, Linsay B, Stevens RL. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat Biotechnol. 2010;28: 977-982. doi:10.1038/nbt.1672, https://www.ncbi.nlm.nih.gov/pubmed/20802497
- Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206-D214. doi:10.1093/nar/gkt1226, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965101/
- Latendresse M. Efficiently gap-filling reaction networks. BMC Bioinformatics. 2014;15: 225. doi:10.1186/1471-2105-15-225, https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-15-225
- Dreyfuss JM, Zucker JD, Hood HM, Ocasio LR, Sachs MS, Galagan JE. Reconstruction and Validation of a Genome-Scale Metabolic Model for the Filamentous Fungus Neurospora crassa Using FARM. PLOS Computational Biology. 2013;9: e1003126. doi:10.1371/journal.pcbi.1003126, https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003126
- Mahadevan R, Schilling CH. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab Eng. 2003;5: 264-276. https://www.ncbi.nlm.nih.gov/pubmed/14642354
Module Commit: a8031aa0f4320c7cd96f78170d7644f696614f8d