Evaluation and Optimization Methods for Applicability Domain Methods and Their Hyperparameters, Considering the Prediction Performance of Machine Learning Models

Cited by: 3
Authors
Kaneko, Hiromasa [1]
Affiliation
[1] Meiji Univ, Sch Sci & Technol, Dept Appl Chem, Kawasaki, Kanagawa 2148571, Japan
Source
ACS OMEGA | 2024 / Vol. 9 / Issue 10
Keywords
CRITICAL-TEMPERATURE; DATA SET; QSAR; POINT
DOI
10.1021/acsomega.3c08036
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
In molecular, material, and process design and control, the applicability domain (AD) of a mathematical model y = f(x) between properties or activities y and features x is constructed. As there are multiple AD methods, each with its own set of hyperparameters, it is necessary to select an appropriate AD method and hyperparameters for each data set and mathematical model. However, there is no method for optimizing the AD model. This study proposes a method for evaluating and optimizing the AD model for each data set and mathematical model. Using the predictions of double cross-validation with all samples, the relationship between coverage and root-mean-squared error (RMSE) was calculated for all combinations of AD methods and their hyperparameters, and the area under the coverage and RMSE curve (AUCR) was calculated. The AD model with the lowest AUCR value was selected as the optimal fit for the mathematical model. The proposed method was validated using eight data sets, including molecules, materials, and spectra, demonstrating that the proposed method could generate optimal AD models for all data sets. The Python code for the proposed method is available at https://github.com/hkaneko1985/dcekit.
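The coverage-RMSE relationship and AUCR described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dcekit implementation: the function names `coverage_rmse_curve` and `aucr` are hypothetical, a larger AD index is assumed to mean "more inside the AD", the area is approximated with the trapezoidal rule, and the predictions are assumed to have been obtained beforehand by double cross-validation. The exact definitions used in the paper may differ.

```python
import numpy as np


def coverage_rmse_curve(y_true, y_pred, ad_index):
    """Return (coverage, rmse) arrays for one AD model.

    Assumption: a larger ad_index means the sample lies deeper inside the AD.
    Samples are added from most to least reliable, and the RMSE of the
    samples included so far is computed at each coverage level.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ad_index = np.asarray(ad_index, dtype=float)

    order = np.argsort(-ad_index, kind="stable")  # most reliable first
    squared_errors = (y_true[order] - y_pred[order]) ** 2
    counts = np.arange(1, squared_errors.size + 1)

    coverage = counts / squared_errors.size
    rmse = np.sqrt(np.cumsum(squared_errors) / counts)
    return coverage, rmse


def aucr(coverage, rmse):
    """Area under the coverage-RMSE curve (trapezoidal rule, an assumption)."""
    widths = np.diff(coverage)
    heights = (rmse[1:] + rmse[:-1]) / 2.0
    return float(np.sum(widths * heights))


# Toy data standing in for double cross-validation predictions (hypothetical).
y_true = np.array([0.0, 0.0, 0.0, 0.0])
y_pred = np.array([0.0, 1.0, 2.0, 3.0])

# Two hypothetical AD models: one ranks low-error samples as most reliable,
# the other ranks them as least reliable.
candidates = {
    "good_ad": np.array([4.0, 3.0, 2.0, 1.0]),
    "bad_ad": np.array([1.0, 2.0, 3.0, 4.0]),
}

scores = {
    name: aucr(*coverage_rmse_curve(y_true, y_pred, index))
    for name, index in candidates.items()
}
best = min(scores, key=scores.get)  # the lowest AUCR is selected
print(scores, best)
```

In this sketch each entry of `candidates` plays the role of one combination of AD method and hyperparameters. An AD index that orders samples from small to large prediction error keeps the RMSE low at small coverage, which lowers the area and leads to that candidate being selected.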
Pages: 11453-11458
Number of pages: 6