Regression tree-based active learning

Cited by: 0
Authors
Ashna Jose
João Paulo Almeida de Mendonça
Emilie Devijver
Noël Jakse
Valérie Monbet
Roberta Poloni
Affiliations
[1] Univ. Grenoble Alpes
[2] CNRS
[3] Grenoble INP
[4] SIMaP
[5] Univ. Grenoble Alpes
[6] CNRS
[7] Grenoble INP
[8] LIG
[9] Univ. Rennes and Inria
[10] CNRS
[11] IRMAR-UMR 6625
Keywords
Active learning; Non-parametric regression; Standard regression trees; Query-based learning
DOI
Not available
Abstract
Machine learning algorithms often require large training sets to perform well, but labeling such large amounts of data is not always feasible, because in many applications labeling demands substantial human effort and material cost. Finding effective ways to reduce the size of training sets while maintaining the same performance is therefore crucial: the goal is to choose the best sample of fixed size to be labeled from a given population, aiming at an accurate prediction of the response. This challenge has been studied in detail for classification, but less thoroughly for regression, which is known to be a more difficult task for active learning despite its practical importance. A few model-free active learning methods have been proposed that select the new samples to be labeled using only unlabeled data, but they do not use information about the conditional distribution of the response given the features. In this paper, we propose a standard regression tree-based active learning method for regression that improves significantly upon existing active learning approaches. It provides impressive results for both small and large training sets, with appreciably low variance across several runs. We also exploit model-free approaches and adapt them to our algorithm so as to use as much of the available information as possible. Through experiments on numerous benchmark datasets, we demonstrate that our framework improves on existing methods and is effective in learning a regression model from a very limited labeled dataset, reducing the sample size needed for a fixed level of performance, even with many features.
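The abstract gives no algorithmic detail, so the following is only a minimal sketch of one way a regression tree could guide query selection, not the authors' method. It rests on several assumptions: a small CART-style tree is fit on the labeled points, each leaf is weighted by the square root of (its share of the unlabeled pool times the variance of its labeled responses), and queries are then drawn at random within leaves according to those weights. The weighting rule and all names (`fit_tree`, `find_leaf`, `select_queries`, `oracle`) are hypothetical choices made for illustration.

```python
import random
from statistics import mean

def sse(ys):
    """Sum of squared deviations of ys around their mean."""
    m = mean(ys)
    return sum((v - m) ** 2 for v in ys)

def fit_tree(X, y, min_leaf=3, depth=3):
    """Fit a small CART-style regression tree; returns nested dicts."""
    node = {"leaf": True, "ys": list(y)}
    if depth == 0 or len(y) < 2 * min_leaf:
        return node
    best = None
    for j in range(len(X[0])):
        order = sorted(range(len(y)), key=lambda i: X[i][j])
        for cut in range(min_leaf, len(y) - min_leaf + 1):
            lo, hi = order[:cut], order[cut:]
            if X[lo[-1]][j] == X[hi[0]][j]:
                continue  # no threshold separates equal feature values
            cost = sse([y[i] for i in lo]) + sse([y[i] for i in hi])
            if best is None or cost < best[0]:
                thr = (X[lo[-1]][j] + X[hi[0]][j]) / 2
                best = (cost, j, thr, lo, hi)
    if best is None:
        return node
    _, j, thr, lo, hi = best
    sub = lambda idx: fit_tree([X[i] for i in idx], [y[i] for i in idx],
                               min_leaf, depth - 1)
    return {"leaf": False, "feat": j, "thr": thr,
            "left": sub(lo), "right": sub(hi)}

def find_leaf(tree, x):
    """Walk from the root to the leaf that contains x."""
    while not tree["leaf"]:
        tree = tree["left"] if x[tree["feat"]] <= tree["thr"] else tree["right"]
    return tree

def select_queries(tree, pool, budget, rng):
    """Pick up to `budget` pool indices, favoring large, high-variance leaves."""
    groups = {}
    for i, x in enumerate(pool):
        leaf = find_leaf(tree, x)
        groups.setdefault(id(leaf), (leaf, []))[1].append(i)
    # Assumed weighting: sqrt(pool share * labeled-response variance).
    weights = {}
    for key, (leaf, idx) in groups.items():
        var = sse(leaf["ys"]) / len(leaf["ys"])
        weights[key] = (len(idx) / len(pool) * var) ** 0.5
    chosen = []
    for _ in range(budget):
        keys = [key for key in groups if groups[key][1]]
        if not keys:
            break
        # Tiny epsilon keeps sampling valid if every variance is zero.
        w = [weights[key] + 1e-12 for key in keys]
        picked_key = rng.choices(keys, weights=w)[0]
        idx = groups[picked_key][1]
        chosen.append(idx.pop(rng.randrange(len(idx))))
    return chosen

# Toy usage: the response is flat on [0, 0.5) and noisy and steep on [0.5, 1].
rng = random.Random(0)
pool = [[rng.uniform(0, 1)] for _ in range(200)]
oracle = lambda x: 0.0 if x[0] < 0.5 else 10 * x[0] + rng.gauss(0, 1)
labeled = list(range(20))
tree = fit_tree([pool[i] for i in labeled], [oracle(pool[i]) for i in labeled])
rest = [i for i in range(200) if i not in labeled]
picked = select_queries(tree, [pool[i] for i in rest], 10, rng)
queries = [rest[i] for i in picked]  # indices into the original pool
```

In this toy setup, leaves covering the noisy half of the input range receive larger weights, so more of the labeling budget tends to go there. A full active learning loop would label `queries`, refit the tree, and repeat until the budget is exhausted.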
Pages: 420-460
Page count: 40