Evaluation and Optimization Methods for Applicability Domain Methods and Their Hyperparameters, Considering the Prediction Performance of Machine Learning Models

Cited by: 3

Authors:

Kaneko, Hiromasa [1]

Affiliations:

[1] Meiji Univ, Sch Sci & Technol, Dept Appl Chem, Kawasaki, Kanagawa 2148571, Japan

Source:

ACS OMEGA | 2024 / Vol. 9 / Issue 10

Keywords:

CRITICAL-TEMPERATURE; DATA SET; QSAR; POINT;

DOI:

10.1021/acsomega.3c08036

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Subject classification code:

0703;

Abstract:

In molecular, material, and process design and control, the applicability domain (AD) of a mathematical model y = f(x) relating properties or activities y to features x is constructed. Because there are multiple AD methods, each with its own set of hyperparameters, an appropriate AD method and hyperparameters must be selected for each data set and mathematical model. However, no method exists for optimizing the AD model. This study proposes a method for evaluating and optimizing the AD model for each data set and mathematical model. Using the predictions of double cross-validation with all samples, the relationship between coverage and root-mean-squared error (RMSE) was calculated for all combinations of AD methods and their hyperparameters, and the area under the coverage and RMSE curve (AUCR) was calculated. The AD model with the lowest AUCR value was selected as the optimal fit for the mathematical model. The proposed method was validated using eight data sets, including molecules, materials, and spectra, demonstrating that it could generate optimal AD models for all data sets. The Python code for the proposed method is available at https://github.com/hkaneko1985/dcekit.
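
The selection procedure summarized in the abstract can be illustrated with a minimal sketch. This is not the dcekit implementation, and the function names (`coverage_rmse_curve`, `aucr`, `select_ad_model`) are hypothetical. It assumes that each candidate AD model yields one index per sample in which a larger value means the sample lies deeper inside the AD, that `y_pred` already holds the double cross-validation predictions, and that the area is taken by the trapezoidal rule; the article may define these details differently.

```python
import numpy as np


def coverage_rmse_curve(y_true, y_pred, ad_index):
    """Coverage and RMSE as samples are added in order of decreasing AD index.

    Assumption: a larger ad_index means the sample is more clearly inside the AD.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ad_index = np.asarray(ad_index, dtype=float)

    order = np.argsort(-ad_index, kind="stable")  # most reliable samples first
    squared_errors = (y_true[order] - y_pred[order]) ** 2
    counts = np.arange(1, len(y_true) + 1)

    coverage = counts / len(y_true)                     # fraction of samples included
    rmse = np.sqrt(np.cumsum(squared_errors) / counts)  # RMSE over included samples
    return coverage, rmse


def aucr(coverage, rmse):
    """Trapezoidal area under the coverage and RMSE curve."""
    widths = np.diff(coverage)
    mean_heights = (rmse[1:] + rmse[:-1]) / 2.0
    return float(np.sum(widths * mean_heights))


def select_ad_model(y_true, y_pred, candidates):
    """Return the candidate name with the lowest AUCR, plus all scores.

    candidates: dict mapping a (hypothetical) AD model label to its AD index array.
    """
    scores = {
        name: aucr(*coverage_rmse_curve(y_true, y_pred, index))
        for name, index in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Toy data: the two largest errors belong to the last two samples.
y_true = [0.0, 0.0, 0.0, 0.0]
y_pred = [0.0, 0.0, 1.0, 1.0]
candidates = {
    "ranks_errors_last": [4.0, 3.0, 2.0, 1.0],   # trusts the accurate samples most
    "ranks_errors_first": [1.0, 2.0, 3.0, 4.0],  # trusts the inaccurate samples most
}
best, scores = select_ad_model(y_true, y_pred, candidates)
```

In this toy case both candidates reach the same RMSE at full coverage, so only the shape of the curve separates them: an AD index that admits the accurate samples first keeps the RMSE low at small coverage and therefore yields the smaller area.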


Pages: 11453-11458

Page count: 6
