Faster Convergence with Lexicase Selection in Tree-Based Automated Machine Learning

Cited by: 0
Authors
Matsumoto, Nicholas [1]
Saini, Anil Kumar [1]
Ribeiro, Pedro [1]
Choi, Hyunjun [1]
Orlenko, Alena [1]
Lyytikainen, Leo-Pekka [2]
Laurikka, Jari O. [3]
Lehtimaki, Terho [2]
Batista, Sandra [1]
Moore, Jason H. [1]
Affiliations
[1] Cedars Sinai Med Ctr, Los Angeles, CA 90048 USA
[2] Tampere Univ, Tampere, Finland
[3] Sydansairaala Hosp, Tampere, Finland
Source
Keywords
Parent Selection; NSGA-II; Lexicase; Convergence; Trie;
DOI
10.1007/978-3-031-29573-7_11
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In many evolutionary computation systems, parent selection methods can affect, among other things, convergence to a solution. In this paper, we present a study comparing the role of two commonly used parent selection methods in evolving machine learning pipelines in an automated machine learning system called Tree-based Pipeline Optimization Tool (TPOT). Specifically, we demonstrate, using experiments on multiple datasets, that lexicase selection leads to significantly faster convergence as compared to NSGA-II in TPOT. We also compare the exploration of parts of the search space by these selection methods using a trie data structure that contains information about the pipelines explored in a particular run.
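For readers unfamiliar with lexicase selection, the following is a minimal sketch of the generic algorithm, not the implementation used in TPOT or in this paper. The function name `lexicase_select`, the error-matrix layout, and the choice of what counts as a "case" are all assumptions made for illustration. Each call picks one parent by shuffling the cases and keeping, case by case, only the candidates with the best error on that case.

```python
import random
from typing import Optional, Sequence


def lexicase_select(
    errors: Sequence[Sequence[float]],
    rng: Optional[random.Random] = None,
) -> int:
    """Return the index of one parent chosen by lexicase selection.

    errors[i][j] is the error of candidate i on case j (lower is better).
    This layout is a hypothetical convention for this sketch.
    """
    rng = rng or random.Random()
    candidates = list(range(len(errors)))
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)  # a fresh random case ordering per selection event

    for case in cases:
        # Keep only the candidates that are best on the current case.
        best = min(errors[i][case] for i in candidates)
        candidates = [i for i in candidates if errors[i][case] == best]
        if len(candidates) == 1:
            break

    # Break any remaining ties uniformly at random.
    return rng.choice(candidates)
```

In this sketch a candidate that is best on at least one case can be selected even if its aggregate error is poor, while a candidate that is never best on any case ordering is never selected.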
Pages: 165-181
Page count: 17
Related papers
  • [1] Tree-Based Machine Learning Techniques for Automated Human Sleep Stage Classification
    Arslan, Recep Sinan
    Ulutas, Hasan
    Koksal, Ahmet Sertol
    Bakir, Mehmet
    Ciftci, Bulent
    [J]. TRAITEMENT DU SIGNAL, 2023, 40 (04) : 1385 - 1400
  • [2] Fundamental error in tree-based machine learning model selection for reservoir characterisation
    Daniel Asante Otchere
    [J]. Energy Geoscience, 2024, 5 (02) - 228
  • [3] Fundamental error in tree-based machine learning model selection for reservoir characterisation
    Otchere, Daniel Asante
    [J]. ENERGY GEOSCIENCE, 2024, 5 (02):
  • [4] Genetic Analysis of Coronary Artery Disease Using Tree-Based Automated Machine Learning Informed By Biology-Based Feature Selection
    Manduchi, Elisabetta
    Le, Trang T.
    Fu, Weixuan
    Moore, Jason H.
    [J]. IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2022, 19 (03) : 1379 - 1386
  • [5] Embedding covariate adjustments in tree-based automated machine learning for biomedical big data analyses
    Manduchi, Elisabetta
    Fu, Weixuan
    Romano, Joseph D.
    Ruberto, Stefano
    Moore, Jason H.
    [J]. BMC BIOINFORMATICS, 2020, 21 (01)
  • [6] Embedding covariate adjustments in tree-based automated machine learning for biomedical big data analyses
    Elisabetta Manduchi
    Weixuan Fu
    Joseph D. Romano
    Stefano Ruberto
    Jason H. Moore
    [J]. BMC Bioinformatics, 21
  • [7] TPOT-NN: augmenting tree-based automated machine learning with neural network estimators
    Romano, Joseph D.
    Le, Trang T.
    Fu, Weixuan
    Moore, Jason H.
    [J]. GENETIC PROGRAMMING AND EVOLVABLE MACHINES, 2021, 22 (02) : 207 - 227
  • [8] Scaling tree-based automated machine learning to biomedical big data with a feature set selector
    Le, Trang T.
    Fu, Weixuan
    Moore, Jason H.
    [J]. BIOINFORMATICS, 2020, 36 (01) : 250 - 256
  • [9] Protein pKa Prediction by Tree-Based Machine Learning
    Chen, Ada Y.
    Lee, Juyong
    Damjanovic, Ana
    Brooks, Bernard R.
    [J]. JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2022, 18 (04) : 2673 - 2686
  • [10] Runtime Optimizations for Tree-based Machine Learning Models
    Asadi, Nima
    Lin, Jimmy
    de Vries, Arjen P.
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (09) : 2281 - 2292