Merge multiple metabolic annotations into a single merged annotation based on thresholds
The Merge Metabolic Annotations app allows the user to merge metabolic annotations from multiple sources present in the genome, using a simple, user-defined weighted-sum scoring scheme to decide which annotations to include. The resulting merged annotations will be added as a separate new annotation event in the genome, which can then be used for metabolic modeling by the Naive Bayes classifier, calculating the probability that gene X has function Y, given all of the annotation sources.
The default settings assign a weight of 1 to each annotation source and set the threshold to 1 as well, which is equivalent to taking the union of all annotations. This yields the largest possible number of genes and reactions, but also tends to include a large number of false positives.
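The weighted-sum scheme described above can be illustrated with a minimal Python sketch. It is not the app's actual implementation: the function name `merge_annotations`, the input layout (a dict of source name to gene-to-terms mappings), and the example source names and reaction IDs are all assumptions made for illustration. Each gene-term pair accumulates the weights of the sources that report it, and the pair is kept when its total score reaches the threshold.

```python
from collections import defaultdict


def merge_annotations(sources, weights=None, threshold=1.0):
    """Hypothetical weighted-sum merge (illustrative only, not the app's code).

    sources:   {source_name: {gene_id: set_of_terms}}
    weights:   {source_name: weight}; sources not listed default to 1.0
    threshold: minimum summed weight for a gene-term pair to be kept
    """
    weights = weights or {}

    # Sum the weights of every source that assigns a term to a gene.
    scores = defaultdict(float)
    for name, annotations in sources.items():
        weight = weights.get(name, 1.0)
        for gene, terms in annotations.items():
            for term in terms:
                scores[(gene, term)] += weight

    # Keep only the gene-term pairs whose score reaches the threshold.
    merged = defaultdict(set)
    for (gene, term), score in scores.items():
        if score >= threshold:
            merged[gene].add(term)
    return dict(merged)


# Made-up example data: two annotation sources for the same genome.
example_sources = {
    "source_A": {"gene1": {"rxn00001"}, "gene2": {"rxn00002"}},
    "source_B": {"gene1": {"rxn00001", "rxn00003"}},
}

# Default weights (1 each) and threshold 1: the union of all annotations.
union = merge_annotations(example_sources)

# Threshold 2: only annotations supported by both sources survive.
consensus = merge_annotations(example_sources, threshold=2.0)
```

Raising the threshold, or lowering the weight of a less trusted source, trades coverage for fewer false positives; with the made-up data above, a threshold of 2 keeps only the annotation that both sources agree on.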
For additional information about metabolic modeling, visit the Metabolic Modeling in KBase FAQ.
Team members who developed and deployed the algorithm in KBase: Jeffrey Kimbrel, Patrik D'haeseleer, Chris Henry. For questions, please contact us.
Related Publications
- [1] Griesemer M, Kimbrel JA, Zhou CE, Navid A, D'haeseleer P. Combining multiple functional annotation tools increases coverage of metabolic annotation. BMC Genomics. 2018 Dec 19;19(1):948. doi: 10.1186/s12864-018-5221-9. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6299973/
- [2] Hanson AD, Pribat A, Waller JC, de Crécy-Lagard V. Unknown proteins and orphan enzymes: the missing half of the engineering parts list - and how to find it. Biochem J. 2010;425:1-11. doi: 10.1042/BJ20091328. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3022307/
- [3] Ijaq J, Chandrasekharan M, Poddar R, Bethi N, Sundararajan VS. Annotation and curation of uncharacterized proteins- challenges. Front Genet. 2015;6:1750. doi: 10.3389/fgene.2015.00119. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4379932/
- [4] Land M, Hauser L, Jun S-R, Nookaew I, Leuze MR, Ahn T-H, et al. Insights from 20 years of bacterial genome sequencing. Funct Integr Genomics. 2015;15:141-161. doi: 10.1007/s10142-015-0433-4. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4361730/
- [5] Seaver SMD, Liu F, Zhang Q, Jeffryes J, Faria JP, Edirisinghe JN, Mundy M, Chia N, Noor E, Beber ME, Best AA, DeJongh M, Kimbrel JA, D'haeseleer P, McCorkle SR, Bolton JR, Pearson E, Canon S, Wood-Charlson EM, Cottingham RW, Arkin AP, Henry CS. The ModelSEED Biochemistry Database for the integration of metabolic annotations and the reconstruction, comparison and analysis of metabolic models for plants, fungi and microbes. Nucleic Acids Res. 2021 Jan 8;49(D1):D1555. doi: 10.1093/nar/gkaa1143. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7778962/
App Specification:
https://github.com/jeffkimbrel/MergeMetabolicAnnotations/tree/ec971d114d57942cef73dc2980c8faf48cea7afe/ui/narrative/methods/merge_metabolic_annotations
Module Commit: ec971d114d57942cef73dc2980c8faf48cea7afe