Regression tree-based active learning

Cited by: 0
Authors
Ashna Jose
João Paulo Almeida de Mendonça
Emilie Devijver
Noël Jakse
Valérie Monbet
Roberta Poloni
Affiliations
[1] Univ. Grenoble Alpes, CNRS, Grenoble INP, SIMaP
[2] Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG
[3] Univ. Rennes and Inria, CNRS, IRMAR-UMR 6625
Source: Data Mining and Knowledge Discovery
Keywords
Active learning; Non-parametric regression; Standard regression trees; Query-based learning
DOI: not available
Abstract
Machine learning algorithms often require large training sets to perform well, but labeling such large amounts of data is not always feasible: in many applications, substantial human effort and material cost are needed. Finding effective ways to reduce the size of training sets while maintaining the same performance is therefore crucial: one wants to choose the best fixed-size sample to be labeled among a given population, aiming at an accurate prediction of the response. This challenge has been studied in detail for classification, but much less so for regression, which is known to be a more difficult task for active learning despite its practical importance. The few model-free active learning methods that select new samples to label using only unlabeled data ignore the conditional distribution of the response given the features. In this paper, we propose a standard regression tree-based active learning method for regression that improves significantly upon existing active learning approaches. It provides strong results for both small and large training sets, with appreciably low variance across several runs. We also adapt model-free approaches to our algorithm so that maximum information is exploited. Through experiments on numerous benchmark datasets, we demonstrate that our framework improves upon existing methods and is effective in learning a regression model from a very limited labeled dataset, reducing the sample size required for a fixed level of performance, even when many features are present.
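As a rough illustration of the general idea described in the abstract (a pool-based loop in which a standard regression tree fitted on the currently labeled data guides which unlabeled point to query next), the sketch below uses scikit-learn's DecisionTreeRegressor. The query rule shown here (draw the next point from the leaf whose labeled responses have the highest variance) and the names tree_based_active_learning, oracle, and X_pool are illustrative assumptions, not the criterion proposed in the paper.

```python
# Minimal sketch of tree-guided pool-based active learning for regression.
# NOT the authors' algorithm: the leaf-variance query rule is an assumption
# used only to illustrate how a regression tree can drive sample selection.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def tree_based_active_learning(X_pool, oracle, n_init=10, budget=50, seed=0):
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X_pool), size=n_init, replace=False))
    y = {i: oracle(X_pool[i]) for i in labeled}            # initial labels

    while len(labeled) < budget:
        # Fit a standard regression tree on the labeled data only.
        tree = DecisionTreeRegressor(min_samples_leaf=5, random_state=seed)
        tree.fit(X_pool[labeled], [y[i] for i in labeled])

        # Leaf membership for labeled and unlabeled points.
        leaf_of_labeled = tree.apply(X_pool[labeled])
        unlabeled = [i for i in range(len(X_pool)) if i not in y]
        leaf_of_unlabeled = tree.apply(X_pool[unlabeled])

        # Variance of the labeled responses within each leaf.
        leaf_var = {}
        for leaf in np.unique(leaf_of_labeled):
            vals = [y[i] for i, l in zip(labeled, leaf_of_labeled) if l == leaf]
            leaf_var[leaf] = np.var(vals)

        # Query an unlabeled point from the most "uncertain" leaf;
        # fall back to a random unlabeled point if that leaf is empty.
        target_leaf = max(leaf_var, key=leaf_var.get)
        candidates = [i for i, l in zip(unlabeled, leaf_of_unlabeled)
                      if l == target_leaf]
        new_idx = int(rng.choice(candidates if candidates else unlabeled))

        y[new_idx] = oracle(X_pool[new_idx])               # label the query
        labeled.append(new_idx)

    return labeled, y
```

In this sketch the tree is refit after every query; the labeling function oracle stands in for the (possibly expensive) human or experimental labeling step that active learning tries to minimize.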
Pages: 420-460 (40 pages)
Related papers (50 in total)
  • [1] Regression tree-based active learning
    Jose, Ashna
    de Mendonca, Joao Paulo Almeida
    Devijver, Emilie
    Jakse, Noel
    Monbet, Valerie
    Poloni, Roberta
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2024, 38 (02) : 420 - 460
  • [2] Inductive learning of tree-based regression models
    Torgo, L
    [J]. AI COMMUNICATIONS, 2000, 13 (02) : 137 - 138
  • [3] Tree-based classification and regression Part 3: Tree-based procedures
    Gunter, B
    [J]. QUALITY PROGRESS, 1998, 31 (02) : 121 - 123
  • [4] Tree-based regression for a circular response
    Lund, UJ
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2002, 31 (09) : 1549 - 1560
  • [5] Tree-Based Ensemble Multi-Task Learning Method for Classification and Regression
    Simm, Jaak
    Magrans De Abril, Ildefons
    Sugiyama, Masashi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (06) : 1677 - 1681
  • [6] A Tree-Based Solution to Nonlinear Regression Problem
    Demir, Oguzhan
    Mohaghegh, Mohammadreza N.
    Delibalta, Ibrahim
    Kozat, Suleyman S.
    [J]. 2016 24TH SIGNAL PROCESSING AND COMMUNICATION APPLICATION CONFERENCE (SIU), 2016, : 1233 - 1236
  • [7] Tree-based model checking for logistic regression
    Su, Xiaogang
    [J]. STATISTICS IN MEDICINE, 2007, 26 (10) : 2154 - 2169
  • [8] Comparison of tree-based ensemble models for regression
    Park, Sangho
    Kim, Chanmin
    [J]. COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, 2022, 29 (05) : 561 - 590
  • [9] A note on the interpretation of tree-based regression models
    Gottard, Anna
    Vannucci, Giulia
    Marchetti, Giovanni Maria
    [J]. BIOMETRICAL JOURNAL, 2020, 62 (06) : 1564 - 1573
  • [10] Polya tree-based nearest neighborhood regression
    Zhuang, Haoxin
    Diao, Liqun
    Yi, Grace
    [J]. STATISTICS AND COMPUTING, 2022, 32