Generated November 2, 2021

The moderately (d)efficient enzyme†: Catalysis-related damage in vivo and its repair

†This title echoes that of ‘The Moderately Efficient Enzyme’ landmark paper, in tribute to two of its authors, Arren Bar-Even and Dan Tawfik, whose lives were tragically cut short during the past year.


Bathe, U., Leong, B. J., McCarty, D. R., Henry, C. S., Abraham, P. E., Wilson, M. A., and Hanson, A. D. (2021) The moderately (d)efficient enzyme†: Catalysis-related damage in vivo and its repair. Biochemistry.


Ulschan Bathe1, Bryan J. Leong1, Donald R. McCarty1, Christopher S. Henry2, Paul E. Abraham3, Mark A. Wilson4 and Andrew D. Hanson1


1Horticultural Sciences Department, University of Florida, Gainesville, Florida 32611
2Computing, Environment, and Life Sciences Division, Argonne National Laboratory, Lemont, Illinois 60439
3Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830
4Department of Biochemistry and Redox Biology Center, University of Nebraska, Lincoln, Nebraska 68588


Enzymes have in vivo lifespans. Analysis of lifespans – lifetime totals of catalytic turnovers – suggests that non-survivable collateral chemical damage from the very reactions that enzymes catalyze is a common but underdiagnosed cause of enzyme death. Analysis also implies that many enzymes are moderately deficient in that their active-site regions are not naturally as hardened against such collateral damage as they could be, leaving room for improvement by rational design or directed evolution. Enzyme lifespan might also be improved by engineering systems that repair otherwise fatal active-site damage, of which a handful are known and more are inferred to exist. Unfortunately, the data needed to design and execute such improvements is lacking: there are too few measurements of in vivo lifespan, and existing information on the extent, nature, and mechanisms of active-site damage and repair during normal enzyme operation is too scarce, anecdotal, and speculative to act on. Fortunately, advances in proteomics, metabolomics, cheminformatics, comparative genomics, and structural biochemistry now empower a systematic, data-driven approach to identify, predict, and validate instances of active-site damage and its repair. These capabilities would be practically useful in enzyme redesign and improvement of in-use stability, and could change thinking about which enzymes die young in vivo, and why.

Narrative summary

This narrative contains the cheminformatics analysis performed for the above-referenced publication. We loaded structures representing the 20 amino acid residues, each approximating the molecular form the residue takes when polymerized within a protein. Next, we applied our cheminformatics reaction rules to these structures using the PickAxe app in KBase. Finally, we compared the modifications predicted by PickAxe with modifications observed in a defined microbiome, published by the Plant Microbe Interaction SFA at Oak Ridge National Laboratory.

Shrestha, H. K., Appidi, M. R., Villalobos Solis, M. I., Wang, J., Carper, D. L., Burdick, L. H., Pelletier, D. A., Doktycz, M. J., Hettich, R. L., and Abraham, P. E. (2021) Metaproteomics reveals insights into microbial structure, interactions, and dynamic regulation in defined communities as they respond to environmental disturbance. BMC Microbiol.

Jupyter notebooks containing the code used to perform this comparison are available on GitHub.

Jupyter notebook and data for proteomics/cheminformatics data comparison

Narrative steps

Step 1: Import residue structures
Step 2: PickAxe prediction of residue modifications
Step 3: Viewing PickAxe output within a metabolic model
Step 4: Table with agreement between predicted modifications and proteomics-based modification data


  1. Jeffryes, J. G., Colastani, R. L., Elbadawi-Sidhu, M., Kind, T., Niehaus, T. D., Broadbelt, L. J., Hanson, A. D., Fiehn, O., Tyo, K. E., and Henry, C. S. (2015) MINEs: open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics. J. Cheminform. 7, 44, DOI: 10.1186/s13321-015-0087-1
  2. Lerma-Ortiz, C., Jeffryes, J. G., Cooper, A. J., Niehaus, T. D., Thamm, A. M., Frelin, O., Aunins, T., Fiehn, O., de Crécy-Lagard, V., Henry, C. S., and Hanson, A. D. (2016) 'Nothing of chemistry disappears in biology': the Top 30 damage-prone endogenous metabolites. Biochem. Soc. Trans. 44, 961–971, DOI: 10.1042/BST20160073
  3. Danchin, A. (2017) Coping with inevitable accidents in metabolism. Microb. Biotechnol. 10, 57–72, DOI: 10.1111/1751-7915.12461
  4. Lai, Z., Kind, T., and Fiehn, O. (2017) Using accurate mass gas chromatography-mass spectrometry with the MINE database for epimetabolite annotation. Anal. Chem. 89, 10171–10180, DOI: 10.1021/acs.analchem.7b01134
  5. Arkin, A. P., Cottingham, R. W., Henry, C. S., Harris, N. L., Stevens, R. L., Maslov, S., et al. (2018) KBase: The United States Department of Energy Systems Biology Knowledgebase. Nat. Biotechnol. 36, 566–569, DOI: 10.1038/nbt.4163
  6. Wang, J., Carper, D. L., Burdick, L. H., Shrestha, H. K., Appidi, M. R., Abraham, P. E., Timm, C. M., Hettich, R. L., Pelletier, D. A., and Doktycz, M. J. (2021) Formation, characterization and modeling of emergent synthetic microbial communities. Comput. Struct. Biotechnol. J. 19, 1917–1927, DOI: 10.1016/j.csbj.2021.03.034
  7. Shrestha, H. K., Appidi, M. R., Villalobos Solis, M. I., Wang, J., Carper, D. L., Burdick, L. H., Pelletier, D. A., Doktycz, M. J., Hettich, R. L., and Abraham, P. E. (2021) Metaproteomics reveals insights into microbial structure, interactions, and dynamic regulation in defined communities as they respond to environmental disturbance. BMC Microbiol.
  8. Creasy, D. M., and Cottrell, J. S. (2004) Unimod: Protein modifications for mass spectrometry. Proteomics 4, 1534–1536, DOI: 10.1002/pmic.200300744
  9. Song, H., and Naismith, J. H. (2020) Enzymatic methylation of the amide bond. Curr. Opin. Struct. Biol. 65, 79–88, DOI: 10.1016/
  10. Ree, R., Varland, S., and Arnesen, T. (2018) Spotlight on protein N-terminal acetylation. Exp. Mol. Med. 50, 1–13, DOI: 10.1038/s12276-018-0116-z
  11. Luo, M. (2018) Chemical and biochemical perspectives of protein lysine methylation. Chem. Rev. 118, 6656–6705, DOI: 10.1021/acs.chemrev.8b00008

Step 1: Import residue structures

We started by importing 20 amino acid residue structures meant to approximate the molecular form of an amino acid when bound within a protein polypeptide. Below we have an example structure for alanine.

These 20 structures were drawn manually and loaded as a compound set in KBase using the following app.

This method imports a file from the staging area as a CompoundSet
This app completed without errors in 32s.
Created Object Name Type Description
AAResidues CompoundSet Compound Set
Imported AA_residues.tsv as AAResidues
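
As a rough illustration of what such an import file might contain, the sketch below writes a small tab-separated table of residue structures. Everything in it is an assumption made for illustration: the column names (id, name, smiles) are placeholders rather than the KBase CompoundSet template, and the SMILES strings use one possible approximation of a residue in a polypeptide (an acetyl cap on the nitrogen and an N-methylamide cap on the carbonyl), which may differ from the manually drawn structures in AA_residues.tsv.

```python
import csv
import io

# Hypothetical rows: capped-residue SMILES chosen for illustration only.
# The actual AA_residues.tsv used in this narrative may encode residues differently.
RESIDUES = [
    {"id": "res_A", "name": "alanine residue", "smiles": "CC(=O)NC(C)C(=O)NC"},
    {"id": "res_G", "name": "glycine residue", "smiles": "CC(=O)NCC(=O)NC"},
]


def residues_to_tsv(rows):
    """Serialize residue records to tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["id", "name", "smiles"],  # placeholder column names
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


tsv_text = residues_to_tsv(RESIDUES)
print(tsv_text)
```

The resulting text could be saved to a file and uploaded to the staging area; the exact columns the import app expects should be checked against the app's own documentation.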

Step 2: PickAxe prediction of residue modifications

Cheminformatics approaches like those used in the MINEs[1] and the Chemical-Damage-MINE (CD-MINE)[2] have proven utility for predicting potential chemical and enzymatic damage reactions and their small-molecule products,[3],[4] and such predictions have been validated by mass spectral evidence.[4] Similarly, cheminformatics tools could predict damage to an enzyme’s residues by its substrates, products, or intermediates, and mass spectral proteomics could provide validation by identifying and quantifying PTMs to specific amino acids. To explore this concept, we used the PickAxe app in KBase to apply the 148 CD-MINE[2] reaction rules to the previously imported molecular structures representing each residue within a protein. This operation predicted 252 distinct forms of residue damage.

Generate novel compounds based on enzymatic and spontaneous reaction rules
This app completed without errors in 2m 25s.
Created Object Name Type Description
AAResidueDamageProducts FBAModel FBAModel-14 AAResidueDamageProducts
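
The bookkeeping behind a single round of rule application can be sketched in a few lines. This is a toy model, not the PickAxe implementation: PickAxe operates on molecular structures, whereas the sketch below stands in for structure matching with hypothetical functional-group labels and represents each rule only by the change it makes to the molecular formula. The single rule shown (acetylation of side-chain amino, amide, or thiol groups) is illustrative and is not copied from the CD-MINE rule set.

```python
# Residue formulas are standard residue compositions (amino acid minus water).
# The "groups" labels are a hypothetical stand-in for substructure matching.
RESIDUES = {
    "A": {"formula": {"C": 3, "H": 5, "N": 1, "O": 1}, "groups": set()},
    "C": {"formula": {"C": 3, "H": 5, "N": 1, "O": 1, "S": 1}, "groups": {"thiol"}},
    "K": {"formula": {"C": 6, "H": 12, "N": 2, "O": 1}, "groups": {"amine"}},
    "N": {"formula": {"C": 4, "H": 6, "N": 2, "O": 2}, "groups": {"amide"}},
}

# Illustrative rule: acetylation adds C2H2O to a matching side-chain group.
RULES = [
    {
        "name": "acetylation",
        "targets": {"amine", "amide", "thiol"},
        "delta": {"C": 2, "H": 2, "O": 1},
    },
]


def apply_delta(formula, delta):
    """Return a new formula with the rule's element changes added."""
    product = dict(formula)
    for element, count in delta.items():
        product[element] = product.get(element, 0) + count
    return product


def expand_one_generation(residues, rules):
    """Apply every rule to every residue that carries a targeted group."""
    products = []
    for code, residue in sorted(residues.items()):
        for rule in rules:
            if residue["groups"] & rule["targets"]:
                product = apply_delta(residue["formula"], rule["delta"])
                products.append((code, rule["name"], product))
    return products


products = expand_one_generation(RESIDUES, RULES)
for code, rule_name, formula in products:
    print(code, rule_name, formula)
```

In this toy run, alanine yields no product because it carries none of the targeted groups, while cysteine, lysine, and asparagine each yield one acetylated form.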

Step 3: Viewing PickAxe output within a metabolic model

While the report generated by PickAxe above provides detailed information about the compounds and reactions it produced, it lacks a mechanism for viewing the structures. The model viewer below enables you to click on each structure link and view it. Note that the model viewer appears only in the actual narrative and not in the static HTML narrative view.


Step 4: Table with agreement between predicted modifications and proteomics-based modification data

Comparison between amino acid residue modifications predicted from spontaneous reaction rules and those observed in proteomics data. Residues are in single-letter code. Mass Δ is the difference in mass between the modified and unmodified residue. The white numbers in the blue boxes signify the number of unique proteins where the modification was observed. The cyan numbers are the percentage of residues modified in three illustrative cases.

We computed the mass differences between the damaged and undamaged residues, resulting in 36 distinct mass differences predicted across all 20 residues, and compared these differences with PTMs observed in an existing proteomics dataset from a defined microbial community[6,7]. In this dataset, 90 distinct mass differences were observed across all residues, with each observed modification annotated with a predicted chemical mechanism using the PEAKs software[8]. We then aligned the predicted modifications with the observed PTMs based on whether: (i) they resulted in a similar mass difference; (ii) they operated on similar residues; and (iii) the predicted mechanism was similar to that annotated by PEAKs. Many of the predicted damage reactions matched the observed mass differences and modification types (see image below), although there were also many exceptions, i.e., cases where predicted modifications were not observed or observed mass differences had no matching prediction.
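
The first two alignment criteria can be illustrated with a minimal sketch. It computes monoisotopic mass differences from molecular formulas and pairs a predicted modification with an observed one when both fall on the same residue and their mass differences agree within a tolerance. The function names, the 0.01 Da tolerance, and the "observed" entries are assumptions invented for this example; they are not taken from the proteomics dataset or from the notebooks on GitHub, and criterion (iii), mechanism similarity, is not modeled.

```python
# Monoisotopic masses (Da) of the elements needed here.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207100,
}


def monoisotopic_mass(formula):
    """Sum element masses for a formula given as {element: count}."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in formula.items())


def mass_delta(modified, unmodified):
    """Mass difference between a modified and an unmodified residue."""
    return monoisotopic_mass(modified) - monoisotopic_mass(unmodified)


def align(predicted, observed, tol=0.01):
    """Pair entries on the same residue whose mass deltas agree within tol (Da)."""
    matched = [
        (p["label"], o["label"])
        for p in predicted
        for o in observed
        if p["residue"] == o["residue"] and abs(p["delta"] - o["delta"]) <= tol
    ]
    hit_pred = {m[0] for m in matched}
    hit_obs = {m[1] for m in matched}
    unmatched_pred = [p["label"] for p in predicted if p["label"] not in hit_pred]
    unmatched_obs = [o["label"] for o in observed if o["label"] not in hit_obs]
    return matched, unmatched_pred, unmatched_obs


# Residue formulas before and after modification.
LYS = {"C": 6, "H": 12, "N": 2, "O": 1}
LYS_METHYL = {"C": 7, "H": 14, "N": 2, "O": 1}
CYS = {"C": 3, "H": 5, "N": 1, "O": 1, "S": 1}
CYS_ACETYL = {"C": 5, "H": 7, "N": 1, "O": 2, "S": 1}

predicted = [
    {"label": "K methylation", "residue": "K", "delta": mass_delta(LYS_METHYL, LYS)},
    {"label": "C acetylation", "residue": "C", "delta": mass_delta(CYS_ACETYL, CYS)},
]

# Made-up observations for illustration, not values from the dataset.
observed = [
    {"label": "K +14.016", "residue": "K", "delta": 14.016},
    {"label": "S +42.011", "residue": "S", "delta": 42.011},
]

matched, unmatched_pred, unmatched_obs = align(predicted, observed)
print("matched:", matched)
print("predicted but not observed:", unmatched_pred)
print("observed but not predicted:", unmatched_obs)
```

The toy inputs are chosen so that all three outcomes appear: one agreement, one prediction with no observation, and one observation with no prediction. A suitable tolerance in practice would depend on the mass accuracy of the instrument.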

Note that differences between the protein damage predictions and mass spectral observations are to be expected. Firstly, predicted modifications that are not observed could be too rare or too unstable to detect or could reflect chemistry, like peptide bond methylation, that can in principle occur spontaneously but in practice does not.[9] Secondly, the mass spectral data capture enzyme-mediated modifications whereas the cheminformatics rules predict only small molecule-mediated spontaneous chemical modifications. The acetylation row of the table below is a case in point: mass spectral analysis detects acetylation of nearly every residue while the rules predict spontaneous acetylation only of residues with a side-chain amino, amide, or thiol group. However, enzymatic acetylation can – and often does[10] – occur on the free α-amino group of any N-terminal residue.

Overall, this preliminary exploration shows promising agreement between predicted residue damage from cheminformatics and mass spectral PTM observations, warranting further work to improve the specificity of reaction rules (e.g., for peptide bond methylation), to develop new rules to capture more of the observed modifications, and to design experiments and proteomics analyses specifically to test these rules. Regarding new rules, the exploration suggests the value of specifically targeting cases where: (i) the modification is not on a terminal residue; (ii) the modification favors enzymes that catalyze similar metabolic reactions; and (iii) the fraction of modified protein is very low, as in arginine and histidine methylation where it is <<1% – and not as in lysine where it is 74% (Figure 1A). The infrequent methylations of arginine and histidine could well be due to occasional chemical damage to side-chain amine groups of active-site residues; the frequent methylation of lysine likely reflects enzymatic modification of non-active-site residues.[11]


  1. Import CompoundSet from File
    no citations
  2. PickAxe - Generate novel compounds from reaction rules
    • 'J. Jeffryes, R. Colestani, M. El-Badawi, T. Kind... C. Henry MINEs: Open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics J. Cheminformatics 7:44 (2015)'
    • 'C.Lerma-Ortiz, J.Jeffryes, A.Cooper...C.Henry & A.Hanson Nothing of chemistry disappears in biology : The Top 30 damage-prone metabolites Biochem. Soc. Trans. 44, 961-71 (2016)'