Hyper-parameter optimization of multiple machine learning algorithms for molecular property prediction using hyperopt library

Authors
Jun Zhang [1]
Qin Wang [2]
Weifeng Shen [1, 3]
Affiliations
[1] School of Chemistry and Chemical Engineering, Chongqing University
[2] School of Chemistry and Chemical Engineering, Chongqing University of Science & Technology
[3] Chongqing Key Laboratory of Theoretical and Computational Chemistry
DOI
Not available
Chinese Library Classification number
TP181 [Automated reasoning, machine learning]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Owing to their outstanding performance in cheminformatics, machine learning algorithms are increasingly used to mine molecular properties and biomedical big data. The performance of machine learning models is known to depend critically on the choice of hyper-parameter configuration. However, many studies either search for optimal hyper-parameters by grid search or use arbitrarily selected hyper-parameters, which can easily lead to a suboptimal configuration. In this study, the Hyperopt library, which embeds Bayesian optimization, is employed to find optimal hyper-parameters for different machine learning algorithms. Six drug discovery datasets, covering solubility, probe-likeness, hERG, Chagas disease, tuberculosis, and malaria, are used to compare different machine learning algorithms with ECFP6 fingerprints. This contribution evaluates whether Bernoulli Naïve Bayes, logistic linear regression, AdaBoost decision tree, random forest, support vector machine, and deep neural network algorithms with optimized hyper-parameters offer any improvement in testing over the referenced models, as assessed by an array of metrics including AUC, F1-score, Cohen's kappa, Matthews correlation coefficient, recall, precision, and accuracy. Based on the rank normalized score approach, the Hyperopt models achieve better or comparable performance on 33 out of 36 models across the drug discovery datasets, showing the significant improvement achieved by employing the Hyperopt library. The open-source code for all six machine learning frameworks implemented with the Hyperopt Python package is provided to make this approach accessible to more scientists who are not familiar with writing code.
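Apart from AUC, which needs ranked prediction scores, the metrics named in the abstract can all be derived from the four confusion-matrix counts of a binary classifier. The following is a minimal sketch under that assumption and is not the authors' code. The function names `confusion_counts` and `classification_metrics` and the toy labels are hypothetical, chosen only for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, F1, Cohen's kappa, and MCC."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical toy labels: 4 actives and 6 inactives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
for name, value in classification_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.3f}")
```

Cohen's kappa and the Matthews correlation coefficient both correct for chance agreement, which is presumably why they sit alongside accuracy here: on imbalanced bioactivity data, accuracy alone can look high for a model that mostly predicts the majority class.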
Pages: 115-125
Number of pages: 11
Source
Chinese Journal of Chemical Engineering, 2022, 52: 115-125