Evaluation and Optimization Methods for Applicability Domain Methods and Their Hyperparameters, Considering the Prediction Performance of Machine Learning Models

Cited by: 3
Authors
Kaneko, Hiromasa [1]
Affiliations
[1] Meiji Univ, Sch Sci & Technol, Dept Appl Chem, Kawasaki, Kanagawa 2148571, Japan
Source
ACS OMEGA | 2024 / Vol. 9 / Issue 10
Keywords
CRITICAL-TEMPERATURE; DATA SET; QSAR; POINT
DOI
10.1021/acsomega.3c08036
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
In molecular, material, and process design and control, the applicability domain (AD) of a mathematical model y = f(x) relating properties or activities y to features x is constructed. As there are multiple AD methods, each with its own set of hyperparameters, it is necessary to select an appropriate AD method and hyperparameters for each data set and mathematical model. However, there is no method for optimizing the AD model. This study proposes a method for evaluating and optimizing the AD model for each data set and mathematical model. Using the predictions of double cross-validation with all samples, the relationship between coverage and root-mean-squared error (RMSE) was calculated for all combinations of AD methods and their hyperparameters, and the area under the coverage and RMSE curve (AUCR) was calculated. The AD model with the lowest AUCR value was selected as the optimal fit for the mathematical model. The proposed method was validated using eight data sets, including molecules, materials, and spectra, demonstrating that the proposed method could generate optimal AD models for all data sets. The Python code for the proposed method is available at https://github.com/hkaneko1985/dcekit.
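The coverage-RMSE idea in the abstract can be illustrated with a minimal sketch. This is not the code from the dcekit repository; the function names (`coverage_rmse_curve`, `aucr`, `select_best_ad`) are hypothetical, and several details are assumptions: a larger AD index is taken to mean "deeper inside the AD", the curve is evaluated at every sample count from 1 to n, and the area is computed with the trapezoidal rule. The predictions are assumed to come from double cross-validation performed elsewhere.

```python
import numpy as np


def coverage_rmse_curve(y_true, y_pred, ad_index):
    """Return (coverage, rmse) arrays for one AD model.

    Assumption: a larger ad_index means the sample lies deeper inside the AD,
    so samples are added to the covered set in descending order of ad_index.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    order = np.argsort(-np.asarray(ad_index, dtype=float), kind="stable")
    squared_error = (y_true[order] - y_pred[order]) ** 2
    counts = np.arange(1, len(squared_error) + 1)
    rmse = np.sqrt(np.cumsum(squared_error) / counts)  # RMSE of the covered samples
    coverage = counts / len(squared_error)             # fraction of samples covered
    return coverage, rmse


def aucr(y_true, y_pred, ad_index):
    """Area under the coverage-RMSE curve (trapezoidal rule, an assumption)."""
    coverage, rmse = coverage_rmse_curve(y_true, y_pred, ad_index)
    widths = coverage[1:] - coverage[:-1]
    heights = (rmse[1:] + rmse[:-1]) / 2.0
    return float(np.sum(widths * heights))


def select_best_ad(y_true, y_pred, candidates):
    """Pick the candidate AD model (name -> ad_index array) with the lowest AUCR."""
    return min(candidates, key=lambda name: aucr(y_true, y_pred, candidates[name]))


# Toy data: prediction errors of 0.1 to 0.4 on four samples (illustrative only).
y_true = np.zeros(4)
y_pred = np.array([0.1, 0.2, 0.3, 0.4])
candidates = {
    "informative": np.array([4.0, 3.0, 2.0, 1.0]),    # ranks low-error samples first
    "misleading": np.array([1.0, 2.0, 3.0, 4.0]),     # ranks high-error samples first
}
best = select_best_ad(y_true, y_pred, candidates)
```

In this toy case the index that ranks low-error samples first keeps the RMSE low at small coverage, so it yields the smaller area and is the one selected. In practice each entry of `candidates` would correspond to one combination of AD method and hyperparameter values.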
Pages: 11453-11458
Number of pages: 6