Exploration and Evaluation of Machine Learning-Based Models for Predicting Enzymatic Reactions

Cited by: 15
Authors
Watanabe, Naoki [3]
Murata, Masahiro [1]
Ogawa, Teppei [4]
Vavricka, Christopher J. [2]
Kondo, Akihiko [2]
Ogino, Chiaki [3]
Araki, Michihiro [1, 2]
Affiliations
[1] Kyoto Univ, Grad Sch Med, Kyoto 6068507, Japan
[2] Kobe Univ, Grad Sch Sci Technol & Innovat, Kobe, Hyogo 6578501, Japan
[3] Kobe Univ, Grad Sch Engn, Dept Chem Sci & Engn, Kobe, Hyogo 6578501, Japan
[4] Mitsui Knowledge Ind Co Ltd MKI, Kita Ku, Osaka 5300005, Japan
Keywords
PHYSICOCHEMICAL FEATURES; WEB SERVER; EC NUMBERS; PROTEINS; PERSPECTIVES; GENERATION; PEPTIDES; ACCURACY; DATABASE; PROFEAT
DOI
10.1021/acs.jcim.9b00877
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Unannotated gene sequences in databases are increasing due to sequencing advances. Therefore, computational methods to predict functions of unannotated genes are needed. Moreover, novel enzyme discovery for metabolic engineering applications further encourages annotation of sequences. Here, enzyme functions are predicted using two general approaches, each including several machine learning algorithms. First, Enzyme-models (E-models) predict Enzyme Commission (EC) numbers from amino acid sequence information. Second, Substrate-Enzyme models (SE-models) are built to predict substrates of enzymatic reactions together with EC numbers, and Substrate-Enzyme-Product models (SEP-models) are built to predict substrates, products, and EC numbers. While accuracy of E-models is not optimal, SE-models and SEP-models predict EC numbers and reactions with high accuracy using all tested machine learning-based methods. For example, a single Random Forests-based SEP-model predicts EC first digits with an average AUC score of over 0.94. Various metrics indicate that the current strategy of combining sequence and chemical structure information is effective at improving enzyme reaction prediction.
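The abstract reports an average AUC over EC first digits without stating how the average is formed. As a rough illustration only, the sketch below assumes a macro-average of one-vs-rest AUCs, computed with the rank-sum formulation. The function names `binary_auc` and `average_auc` and the toy scores are hypothetical and are not taken from the paper.

```python
def binary_auc(labels, scores):
    """AUC for one class via the rank-sum (Mann-Whitney U) formulation.

    labels: list of 0/1 values; scores: list of classifier scores.
    Tied scores receive the average of their ranks.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover the whole group of tied scores.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")
    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


def average_auc(true_classes, class_scores):
    """Macro-average of one-vs-rest AUCs (an assumed averaging scheme).

    true_classes: list of true class labels, e.g. EC first digits.
    class_scores: dict mapping each class to its per-sample scores.
    """
    aucs = []
    for cls, scores in class_scores.items():
        labels = [1 if t == cls else 0 for t in true_classes]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Toy data (hypothetical): six samples, three EC first digits.
toy_true = [1, 1, 2, 2, 3, 3]
toy_scores = {
    1: [0.9, 0.8, 0.1, 0.3, 0.2, 0.1],
    2: [0.05, 0.1, 0.7, 0.4, 0.5, 0.2],
    3: [0.05, 0.1, 0.2, 0.3, 0.3, 0.7],
}
toy_average = average_auc(toy_true, toy_scores)
```

Under this assumed scheme every EC class contributes equally to the average regardless of how many enzymes it contains; a prevalence-weighted average would give a different number for imbalanced data.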
Pages: 1833-1843
Number of pages: 11