Exploration and Evaluation of Machine Learning-Based Models for Predicting Enzymatic Reactions

Cited by: 15
Authors
Watanabe, Naoki [3]
Murata, Masahiro [1]
Ogawa, Teppei [4]
Vavricka, Christopher J. [2]
Kondo, Akihiko [2]
Ogino, Chiaki [3]
Araki, Michihiro [1,2]
Affiliations
[1] Kyoto Univ, Grad Sch Med, Kyoto 6068507, Japan
[2] Kobe Univ, Grad Sch Sci Technol & Innovat, Kobe, Hyogo 6578501, Japan
[3] Kobe Univ, Grad Sch Engn, Dept Chem Sci & Engn, Kobe, Hyogo 6578501, Japan
[4] Mitsui Knowledge Ind Co Ltd MKI, Kita Ku, Osaka 5300005, Japan
Keywords
PHYSICOCHEMICAL FEATURES; WEB SERVER; EC NUMBERS; PROTEINS; PERSPECTIVES; GENERATION; PEPTIDES; ACCURACY; DATABASE; PROFEAT
DOI
10.1021/acs.jcim.9b00877
Chinese Library Classification
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Advances in sequencing are increasing the number of unannotated gene sequences in databases, so computational methods to predict the functions of unannotated genes are needed. The search for novel enzymes for metabolic engineering applications further motivates sequence annotation. Here, enzyme functions are predicted using two general approaches, each including several machine learning algorithms. First, Enzyme-models (E-models) predict Enzyme Commission (EC) numbers from amino acid sequence information. Second, Substrate-Enzyme models (SE-models) are built to predict substrates of enzymatic reactions together with EC numbers, and Substrate-Enzyme-Product models (SEP-models) are built to predict substrates, products, and EC numbers. While the accuracy of E-models is not optimal, SE-models and SEP-models predict EC numbers and reactions with high accuracy using all tested machine learning-based methods. For example, a single Random Forests-based SEP-model predicts EC first digits with an average AUC score of over 0.94. Various metrics indicate that the current strategy of combining sequence and chemical structure information is effective at improving enzyme reaction prediction.
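The idea of combining sequence and chemical structure information, and of scoring EC first-digit prediction by an average AUC, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: amino acid composition stands in for the sequence descriptors, the substrate and product inputs are hypothetical binary fingerprint bits, and the average is a macro-average of one-vs-rest AUCs (the abstract does not say how the average was computed). The function names are hypothetical as well.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(sequence):
    """Fraction of each standard amino acid (assumed stand-in sequence feature)."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def combine_features(sequence, substrate_bits, product_bits=()):
    """Concatenate sequence features with substrate (and optional product) bits.

    Substrate bits only mimics an SE-style input; adding product bits
    mimics an SEP-style input. Both bit vectors are hypothetical.
    """
    features = aa_composition(sequence)
    features += [float(b) for b in substrate_bits]
    features += [float(b) for b in product_bits]
    return features


def binary_auc(labels, scores):
    """AUC as the chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_auc(true_classes, class_scores):
    """Macro-average of one-vs-rest AUCs (assumed averaging scheme).

    class_scores maps each class label to its per-sample scores.
    """
    aucs = []
    for cls, scores in class_scores.items():
        labels = [1 if t == cls else 0 for t in true_classes]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)
```

In this sketch, the vector returned by `combine_features` would be passed to any classifier (for instance a random forest), and `average_auc` would be applied to that classifier's per-class scores for the EC first digit.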
Pages: 1833-1843
Number of pages: 11