Exploration and Evaluation of Machine Learning-Based Models for Predicting Enzymatic Reactions

Cited by: 15
Authors
Watanabe, Naoki [3]
Murata, Masahiro [1]
Ogawa, Teppei [4]
Vavricka, Christopher J. [2]
Kondo, Akihiko [2]
Ogino, Chiaki [3]
Araki, Michihiro [1, 2]
Affiliations
[1] Kyoto Univ, Grad Sch Med, Kyoto 6068507, Japan
[2] Kobe Univ, Grad Sch Sci Technol & Innovat, Kobe, Hyogo 6578501, Japan
[3] Kobe Univ, Grad Sch Engn, Dept Chem Sci & Engn, Kobe, Hyogo 6578501, Japan
[4] Mitsui Knowledge Ind Co Ltd MKI, Kita Ku, Osaka 5300005, Japan
Keywords
PHYSICOCHEMICAL FEATURES; WEB SERVER; EC NUMBERS; PROTEINS; PERSPECTIVES; GENERATION; PEPTIDES; ACCURACY; DATABASE; PROFEAT;
DOI
10.1021/acs.jcim.9b00877
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Unannotated gene sequences in databases are increasing due to sequencing advances. Therefore, computational methods to predict functions of unannotated genes are needed. Moreover, novel enzyme discovery for metabolic engineering applications further encourages annotation of sequences. Here, enzyme functions are predicted using two general approaches, each including several machine learning algorithms. First, Enzyme-models (E-models) predict Enzyme Commission (EC) numbers from amino acid sequence information. Second, Substrate-Enzyme models (SE-models) are built to predict substrates of enzymatic reactions together with EC numbers, and Substrate-Enzyme-Product models (SEP-models) are built to predict substrates, products, and EC numbers. While accuracy of E-models is not optimal, SE-models and SEP-models predict EC numbers and reactions with high accuracy using all tested machine learning-based methods. For example, a single Random Forests-based SEP-model predicts EC first digits with an average AUC score of over 0.94. Various metrics indicate that the current strategy of combining sequence and chemical structure information is effective at improving enzyme reaction prediction.
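The idea of combining sequence and chemical structure information, and of scoring a multi-class EC predictor by an averaged AUC, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' pipeline: amino acid composition stands in for the sequence descriptors, `substrate_bits` and `product_bits` are hypothetical pre-computed structure bit vectors, no classifier is trained, and the averaging is assumed to be a macro-average of one-vs-rest AUCs (the abstract does not say how the average is taken).

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(sequence):
    """Fraction of each standard amino acid (a stand-in sequence descriptor)."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def sep_features(sequence, substrate_bits, product_bits):
    """Concatenate sequence features with substrate and product bit vectors.

    The bit vectors are hypothetical inputs; any fixed-length structure
    fingerprint could be used in their place.
    """
    return (aa_composition(sequence)
            + [float(b) for b in substrate_bits]
            + [float(b) for b in product_bits])


def binary_auc(scores, labels):
    """AUC as the chance a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_auc(score_rows, true_classes, classes):
    """Macro-average of one-vs-rest AUCs (assumed averaging scheme)."""
    aucs = []
    for k, c in enumerate(classes):
        scores = [row[k] for row in score_rows]
        labels = [t == c for t in true_classes]
        aucs.append(binary_auc(scores, labels))
    return sum(aucs) / len(aucs)
```

In use, each row of `score_rows` would hold a trained model's per-class scores for one enzyme (one column per EC first digit), and `true_classes` the known first digits; the feature vector from `sep_features` is what such a model would be trained on.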
Pages: 1833-1843
Page count: 11