Semi-supervised trees for multi-target regression

Cited by: 36
Authors
Levatic, Jurica [1 ,2 ]
Kocev, Dragi [1 ,2 ,3 ]
Ceci, Michelangelo [3 ,4 ]
Dzeroski, Saso [1 ,2 ]
Affiliations
[1] Josef Stefan Inst, Dept Knowledge Technol, Ljubljana, Slovenia
[2] Jozef Stefan Int Postgrad Sch, Ljubljana, Slovenia
[3] Univ Bari Aldo Moro, Dept Comp Sci, Bari, Italy
[4] CINI, Rome, Italy
Funding
European Union Horizon 2020;
Keywords
Semi-supervised learning; Multi-target regression; Structured outputs; Predictive clustering trees; Random forests; CLASSIFICATION; MODEL; PREDICTION; INDUCTION; ENSEMBLES; INDEX;
DOI
10.1016/j.ins.2018.03.033
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
The predictive performance of traditional supervised methods heavily depends on the amount of labeled data. However, obtaining labels is a difficult process in many real-life tasks, and only a small amount of labeled data is typically available for model learning. As an answer to this problem, the concept of semi-supervised learning has emerged. Semi-supervised methods use unlabeled data in addition to labeled data to improve the performance of supervised methods. Obtaining labeled data is even more difficult for data mining problems with structured outputs, since several labels need to be determined for each example. Multi-target regression (MTR) is a type of structured output prediction problem, where we need to simultaneously predict multiple continuous variables. Despite the apparent need for semi-supervised methods able to deal with MTR, only a few such methods are available, and even those are difficult to use in practice or their advantages over supervised methods for MTR are not clear. This paper presents an extension of predictive clustering trees for MTR, and of ensembles thereof, towards semi-supervised learning. The proposed method preserves the appealing characteristics of decision trees while enabling the use of unlabeled examples. In particular, the proposed semi-supervised trees for MTR are interpretable, easy to understand, fast to learn, and can handle both numeric and nominal descriptive features. We perform an extensive empirical evaluation in both an inductive and a transductive semi-supervised setting. The results show that the proposed method improves the performance of supervised predictive clustering trees and enhances their interpretability (due to reduced tree size), whereas, in the ensemble learning scenario, it outperforms its supervised counterpart in the transductive setting. The proposed methods have a mechanism for controlling the influence of unlabeled examples, which makes them highly useful in practice: this mechanism can protect them against a degradation of performance relative to their supervised counterparts, an inherent risk of semi-supervised learning. The proposed methods also outperform two existing semi-supervised methods for MTR. (C) 2018 Elsevier Inc. All rights reserved.
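The abstract describes a mechanism for controlling the influence of unlabeled examples on tree construction. The sketch below is a minimal illustration, not the authors' implementation, of how such a trade-off can be expressed: a split-quality score that mixes variance over the targets (computed on labeled examples only) with variance over the descriptive attributes (computed on all examples), weighted by a parameter w. The function names, the lack of per-attribute normalization, and the numeric-only features are simplifying assumptions made here; the paper (see the DOI above) defines the actual heuristic.

```python
# Minimal sketch, NOT the authors' code: a semi-supervised split heuristic for
# multi-target regression trees. Names (mixed_impurity, split_score) and the
# unnormalized variances are illustrative assumptions.
import numpy as np


def mixed_impurity(X, Y, labeled, w):
    """Impurity of a set of examples.

    X       : (n, d) numeric descriptive features (labeled + unlabeled rows)
    Y       : (n, t) targets; rows of unlabeled examples are ignored
    labeled : boolean mask of length n marking labeled examples
    w       : weight of the target part; w=1 is purely supervised,
              w=0 clusters on the descriptive features only
    """
    y_lab = Y[labeled]
    # Target-space variance: labeled examples only.
    target_var = float(y_lab.var(axis=0).mean()) if y_lab.shape[0] > 1 else 0.0
    # Descriptive-space variance: all examples, labeled or not.
    feature_var = float(X.var(axis=0).mean()) if X.shape[0] > 1 else 0.0
    return w * target_var + (1.0 - w) * feature_var


def split_score(X, Y, labeled, go_left, w=0.5):
    """Impurity reduction of a candidate binary split; higher is better."""
    n = X.shape[0]
    parent = mixed_impurity(X, Y, labeled, w)
    l, r = go_left, ~go_left
    left = mixed_impurity(X[l], Y[l], labeled[l], w)
    right = mixed_impurity(X[r], Y[r], labeled[r], w)
    return parent - (l.sum() / n) * left - (r.sum() / n) * right


# Tiny usage example with four labeled and two unlabeled examples.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 3))
    Y = rng.normal(size=(6, 2))
    labeled = np.array([True, True, True, True, False, False])
    go_left = X[:, 0] < 0.0  # candidate split on the first feature
    print(split_score(X, Y, labeled, go_left, w=0.7))
```

Under these assumptions, setting w = 1 ignores the unlabeled examples and reduces the score to a purely supervised variance-reduction criterion, which is one way to read the abstract's safeguard against the performance degradation that semi-supervised learning can otherwise cause.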
Pages: 109-127
Number of pages: 19