Regression tree-based active learning

Cited by: 0

Authors:

Ashna Jose

João Paulo Almeida de Mendonça

Emilie Devijver

Noël Jakse

Valérie Monbet

Roberta Poloni

Affiliations:

[1] Univ. Grenoble Alpes

[2] CNRS

[3] Grenoble INP

[4] SIMaP

[5] Univ. Grenoble Alpes

[6] CNRS

[7] Grenoble INP

[8] LIG

[9] Univ. Rennes and Inria

[10] CNRS

[11] IRMAR-UMR 6625

Source:

Data Mining and Knowledge Discovery | 2024 / Vol. 38

Keywords:

Active learning; Non-parametric regression; Standard regression trees; Query-based learning

DOI:

Not available

Abstract:

Machine learning algorithms often require large training sets to perform well, but labeling such large amounts of data is not always feasible, since in many applications substantial human effort and material cost are needed. Finding effective ways to reduce the size of training sets while maintaining the same performance is therefore crucial: one wants to choose the best sample of fixed size to be labeled from a given population, aiming at an accurate prediction of the response. This challenge has been studied in detail for classification, but not deeply enough for regression, which is known to be a more difficult task for active learning despite its need in practice. A few model-free active learning methods have been proposed that select the new samples to be labeled using only unlabeled data, but they lack information about the conditional distribution of the response given the features. In this paper, we propose a standard regression tree-based active learning method for regression that improves significantly upon existing active learning approaches. It provides strong results for both small and large training sets, with appreciably low variance across several runs. We also exploit model-free approaches and adapt them to our algorithm to make use of as much information as possible. Through experiments on numerous benchmark datasets, we demonstrate that our framework improves on existing methods and is effective in learning a regression model from a very limited labeled dataset, reducing the sample size needed for a fixed level of performance, even with many features.

Citation

Pages: 420 / 460

Page count: 40
