Multiclass Synthetic Accessibility Prediction

Authors
Li, Xinqi [1]
Walsh, Ryan [2,3]
Abbas, Waseem [1]
Pascual-Diaz, Sergio [1]
Hand, Calum [1]
Garland, Rory [1]
Khan, Faiz Mohammad [1]
Das, Nikhil Mohan [1]
Desai, Vedant [1]
Abouzleikha, Mohamed [1]
Clark, Matthew A. [3]
Affiliations
[1] X Chem UK, Altrincham WA14 2DT, Cheshire, England
[2] X Chem Canada, Montreal, PQ H4R 2P1, Canada
[3] X Chem Global HQ, Waltham, MA 02453 USA
DOI
10.1021/acs.jcim.4c01663
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Evaluating synthetic accessibility of in silico molecules is an integral component of the drug discovery process. While the application of machine learning models to predict whether small molecules are easy or hard to synthesize has gained attention recently, predetermined thresholds and data set imbalances present challenges for these binary classification approaches. In this study, we introduce a novel multiclass fold-ensembled classification approach to predict the minimum number of steps needed to synthesize a small molecule. By ensembling the base models trained on multiple stratified subsampled folds, this approach effectively mitigates the impact of class imbalance through probability aggregation or voting aggregation strategies. Additionally, we propose fuzzy evaluation metrics that account for practical tolerances in predictions, providing a more flexible and realistic assessment of model performance. Through experimentation on two reaction benchmark data sets, we demonstrate the effectiveness of our model in a multiclass synthetic accessibility prediction task and the superiority of our proposed method over six existing models in binary synthetic accessibility prediction tasks.
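The two aggregation strategies and the tolerance-based evaluation named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy probabilities, and the definition of the fuzzy metric (a prediction counts as correct when it lies within a fixed number of steps of the true class) are assumptions made here for illustration, and class indices stand in for step counts.

```python
import numpy as np

def aggregate_probabilities(fold_probs):
    """Probability aggregation: average class probabilities over folds, then argmax."""
    mean_probs = np.mean(fold_probs, axis=0)  # shape (n_samples, n_classes)
    return np.argmax(mean_probs, axis=1)

def aggregate_votes(fold_probs):
    """Voting aggregation: each fold votes for its argmax class; the majority wins."""
    votes = np.argmax(fold_probs, axis=2)  # shape (n_folds, n_samples)
    n_classes = fold_probs.shape[2]
    counts = np.stack(
        [(votes == c).sum(axis=0) for c in range(n_classes)], axis=1
    )  # shape (n_samples, n_classes)
    return np.argmax(counts, axis=1)  # ties resolve to the lowest class index

def fuzzy_accuracy(y_true, y_pred, tolerance=1):
    """Assumed fuzzy metric: fraction of predictions within `tolerance` classes of the truth."""
    diff = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return float(np.mean(diff <= tolerance))

# Hypothetical outputs of three base models (folds) for two molecules and three classes.
fold_probs = np.array([
    [[0.9, 0.1, 0.0], [0.2, 0.5, 0.3]],
    [[0.4, 0.6, 0.0], [0.1, 0.3, 0.6]],
    [[0.4, 0.6, 0.0], [0.1, 0.2, 0.7]],
])

prob_pred = aggregate_probabilities(fold_probs)
vote_pred = aggregate_votes(fold_probs)
print(prob_pred, vote_pred)
print(fuzzy_accuracy([1, 0], prob_pred, tolerance=1))
```

In this toy data the two strategies disagree on the first molecule: one confident fold dominates the averaged probabilities, while the other two folds outvote it under majority voting.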
Pages: 1155-1165
Number of pages: 11