Improving Genomic Prediction with Machine Learning Incorporating TPE for Hyperparameters Optimization

Cited by: 11
Authors
Liang, Mang [1]
An, Bingxing [1]
Li, Keanning [1]
Du, Lili [1]
Deng, Tianyu [1]
Cao, Sheng [1]
Du, Yueying [1]
Xu, Lingyang [1]
Gao, Xue [1]
Zhang, Lupei [1]
Li, Junya [1]
Gao, Huijiang [1]
Affiliations
[1] Chinese Acad Agr Sci, Inst Anim Sci, Beijing 100193, Peoples R China
Source
BIOLOGY-BASEL | 2022 / Volume 11 / Issue 11
Keywords
hyperparameters optimization; tree-structured Parzen estimator; genomic prediction; machine learning; SELECTION; ACCURACY; WHEAT; TOOL;
DOI
10.3390/biology11111647
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Simple Summary: Machine learning has become a crucial tool for genomic prediction. However, the complicated process of tuning hyperparameters has greatly hindered its application in actual breeding programs, especially for people without experience in hyperparameter tuning. In this study, we applied a tree-structured Parzen estimator (TPE) to tune the hyperparameters of machine learning methods. Overall, kernel ridge regression (KRR) combined with TPE achieved the highest prediction accuracy in the simulation and real datasets.

Owing to its excellent prediction ability, machine learning has been considered the most powerful tool for analyzing high-throughput sequencing genome data. However, the sophisticated process of tuning hyperparameters greatly impedes the wider application of machine learning in animal and plant breeding programs. Therefore, we integrated an automatic hyperparameter tuning algorithm, the tree-structured Parzen estimator (TPE), with machine learning to simplify its use for genomic prediction (GP). In this study, we applied TPE to optimize the hyperparameters of kernel ridge regression (KRR) and support vector regression (SVR). To evaluate the performance of TPE, we compared the prediction accuracy of KRR-TPE and SVR-TPE with that of genomic best linear unbiased prediction (GBLUP) and of KRR-RS, KRR-Grid, SVR-RS, and SVR-Grid, which tune the hyperparameters of KRR and SVR by random search (RS) and grid search (Grid), in a simulation dataset and in real datasets. The results indicated that KRR-TPE achieved the highest prediction ability across all populations and was the most convenient to use. In particular, for the Chinese Simmental beef cattle and Loblolly pine populations, the prediction accuracy of KRR-TPE improved on GBLUP by an average of 8.73% and 6.08%, respectively. Our study will greatly promote the application of machine learning in GP and further accelerate breeding progress.
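The abstract does not give implementation details, so the following is only a minimal, standard-library Python sketch of the general idea of tuning the two KRR hyperparameters (ridge penalty `alpha` and RBF kernel width `gamma`) with a TPE-style search. Everything in it is an assumption made for illustration, not the authors' method: the toy genotype data, the single hold-out split, Pearson correlation as the accuracy measure, the fixed Parzen bandwidth `bw`, and the helper names `solve`, `pearson`, `make_objective`, `kde`, and `tpe_search`. The search itself is a simplified variant of TPE: after a few random trials it splits past trials into a "good" and a "bad" group, fits a Parzen (Gaussian kernel) density to each, and evaluates the candidate with the largest density ratio.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sa = math.sqrt(sum((u - ma) ** 2 for u in a))
    sb = math.sqrt(sum((v - mb) ** 2 for v in b))
    if sa == 0 or sb == 0:
        return 0.0
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (sa * sb)


def make_objective(Xtr, ytr, Xva, yva):
    """Build f([log_alpha, log_gamma]) -> hold-out accuracy of RBF KRR."""
    sq = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    Dtr = [[sq(a, b) for b in Xtr] for a in Xtr]  # squared distances, computed once
    Dva = [[sq(a, b) for b in Xtr] for a in Xva]
    mu = sum(ytr) / len(ytr)
    yc = [v - mu for v in ytr]  # center phenotypes (the model has no intercept)

    def objective(x):
        alpha, gamma = math.exp(x[0]), math.exp(x[1])
        # Dual solution of kernel ridge regression: (K + alpha * I) c = y
        K = [[math.exp(-gamma * d) + (alpha if i == j else 0.0)
              for j, d in enumerate(row)] for i, row in enumerate(Dtr)]
        coef = solve(K, yc)
        pred = [sum(c * math.exp(-gamma * d) for c, d in zip(coef, row)) for row in Dva]
        return pearson(pred, yva)

    return objective


def kde(x, points, bw):
    """Unnormalized Parzen density at x (the constant cancels in the ratio)."""
    if not points:
        return 1.0
    s = sum(math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * bw ** 2))
            for p in points)
    return s / len(points) + 1e-12


def tpe_search(objective, bounds, n_trials=30, n_startup=10,
               top_frac=0.25, n_cand=50, bw=1.0, seed=0):
    """Simplified TPE: maximize objective over a box; returns (best_score, best_x)."""
    rng = random.Random(seed)
    history = []
    for t in range(n_trials):
        if t < n_startup:  # random warm-up trials
            x = [rng.uniform(lo, hi) for lo, hi in bounds]
        else:
            ranked = sorted(history, key=lambda h: h[0], reverse=True)
            n_good = max(1, int(top_frac * len(ranked)))
            good = [p for _, p in ranked[:n_good]]
            bad = [p for _, p in ranked[n_good:]]
            # Sample candidates around good points, keep the best density ratio.
            cands = [[min(hi, max(lo, rng.gauss(ci, bw)))
                      for ci, (lo, hi) in zip(rng.choice(good), bounds)]
                     for _ in range(n_cand)]
            x = max(cands, key=lambda c: kde(c, good, bw) / kde(c, bad, bw))
        history.append((objective(x), x))
    return max(history, key=lambda h: h[0])


# Hypothetical toy data: 60 individuals, 20 SNPs coded 0/1/2, 5 causal loci.
rng = random.Random(1)
X = [[rng.choice((0, 1, 2)) for _ in range(20)] for _ in range(60)]
beta = [rng.gauss(0, 1) if j < 5 else 0.0 for j in range(20)]
y = [sum(b * g for b, g in zip(beta, row)) + rng.gauss(0, 1) for row in X]
objective = make_objective(X[:45], y[:45], X[45:], y[45:])
bounds = [(math.log(1e-3), math.log(1e2)), (math.log(1e-4), math.log(1.0))]
best_score, best_x = tpe_search(objective, bounds)
print("validation r = %.3f, alpha = %.4g, gamma = %.4g"
      % (best_score, math.exp(best_x[0]), math.exp(best_x[1])))
```

Both hyperparameters are searched on a log scale because plausible values span several orders of magnitude. In practice one would more likely rely on an established TPE implementation and on cross-validation rather than a single split; this sketch only shows the mechanics.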
Pages: 13
Related papers
50 in total
  • [21] Exploring and Improving Deep Learning Methods for Genomic Prediction
    Kelly, C.
    Gaffney, A.
    McLaughlin, R.
    HUMAN HEREDITY, 2020, 84 (4-5) : 213 - 213
  • [22] Tunability: Importance of Hyperparameters of Machine Learning Algorithms
    Probst, Philipp
    Boulesteix, Anne-Laure
    Bischl, Bernd
    JOURNAL OF MACHINE LEARNING RESEARCH, 2019, 20
  • [23] Review of Optimization in Improving Extreme Learning Machine
    Rathod N.
    Wankhade S.
    EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, 2021, 8 (28) : 1 - 13
  • [24] Quantitative basis of machine learning models for genomic prediction
    Syrowatka, Christine
    Machnik, Nick
    Robinson, Matthew
    EUROPEAN JOURNAL OF HUMAN GENETICS, 2023, 31 : 290 - 290
  • [25] Integrating Bioinformatics and Machine Learning for Genomic Prediction in Chickens
    Li, Xiaochang
    Chen, Xiaoman
    Wang, Qiulian
    Yang, Ning
    Sun, Congjiao
    GENES, 2024, 15 (06)
  • [26] Evolutionary Optimization of Hyperparameters in Deep Learning Models
    Kim, Jin-Young
    Cho, Sung-Bae
    2019 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2019, : 831 - 837
  • [27] Improving multi-trait genomic prediction by incorporating local genetic correlations
    Teng, Jun
    Zhai, Tingting
    Zhang, Xinyi
    Zhao, Changheng
    Wang, Wenwen
    Tang, Hui
    Ning, Chao
    Shang, Yingli
    Wang, Dan
    Zhang, Qin
    COMMUNICATIONS BIOLOGY, 2025, 8 (01)
  • [28] An advanced framework for net electricity consumption prediction: Incorporating novel machine learning models and optimization algorithms
    Li, Xuetao
    Wang, Ziwei
    Yang, Chengying
    Bozkurt, Ayhan
    ENERGY, 2024, 296
  • [29] Enhancing lung cancer detection through hybrid features and machine learning hyperparameters optimization techniques
    Li, Liangyu
    Yang, Jing
    Por, Lip Yee
    Khan, Mohammad Shahbaz
    Hamdaoui, Rim
    Hussain, Lal
    Iqbal, Zahoor
    Rotaru, Ionela Magdalena
    Dobrota, Dan
    Aldrdery, Moutaz
    Omar, Abdulfattah
    HELIYON, 2024, 10 (04)
  • [30] Incorporating domain knowledge in machine learning for soccer outcome prediction
    Daniel Berrar
    Philippe Lopes
    Werner Dubitzky
    Machine Learning, 2019, 108 : 97 - 126