Semi-supervised trees for multi-target regression

Cited by: 36
Authors
Levatic, Jurica [1, 2]
Kocev, Dragi [1, 2, 3]
Ceci, Michelangelo [3, 4]
Dzeroski, Saso [1, 2]
Affiliations
[1] Jozef Stefan Inst, Dept Knowledge Technol, Ljubljana, Slovenia
[2] Jozef Stefan Int Postgrad Sch, Ljubljana, Slovenia
[3] Univ Bari Aldo Moro, Dept Comp Sci, Bari, Italy
[4] CINI, Rome, Italy
Funding
EU Horizon 2020;
Keywords
Semi-supervised learning; Multi-target regression; Structured outputs; Predictive clustering trees; Random forests; CLASSIFICATION; MODEL; PREDICTION; INDUCTION; ENSEMBLES; INDEX;
DOI
10.1016/j.ins.2018.03.033
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The predictive performance of traditional supervised methods heavily depends on the amount of labeled data. However, obtaining labels is a difficult process in many real-life tasks, and only a small amount of labeled data is typically available for model learning. As an answer to this problem, the concept of semi-supervised learning has emerged. Semi-supervised methods use unlabeled data in addition to labeled data to improve the performance of supervised methods. It is even more difficult to get labeled data for data mining problems with structured outputs, since several labels need to be determined for each example. Multi-target regression (MTR) is one type of structured output prediction problem, where we need to simultaneously predict multiple continuous variables. Despite the apparent need for semi-supervised methods able to deal with MTR, only a few such methods are available, and even those are difficult to use in practice and/or their advantages over supervised methods for MTR are not clear. This paper presents an extension of predictive clustering trees for MTR, and ensembles thereof, towards semi-supervised learning. The proposed method preserves the appealing characteristics of decision trees while enabling the use of unlabeled examples. In particular, the proposed semi-supervised trees for MTR are interpretable, easy to understand, fast to learn, and can handle both numeric and nominal descriptive features. We perform an extensive empirical evaluation in both an inductive and a transductive semi-supervised setting. The results show that the proposed method improves the performance of supervised predictive clustering trees and enhances their interpretability (due to reduced tree size), whereas, in the ensemble learning scenario, it outperforms its supervised counterpart in the transductive setting. The proposed methods have a mechanism for controlling the influence of unlabeled examples, which makes them highly useful in practice: this mechanism can protect them against a degradation of performance relative to their supervised counterparts, an inherent risk of semi-supervised learning. The proposed methods also outperform two existing semi-supervised methods for MTR. © 2018 Elsevier Inc. All rights reserved.
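
The abstract does not say how the influence of unlabeled examples is controlled. One plausible reading, offered here as an assumption and not as the authors' stated formula, is a split-evaluation impurity that mixes the variance of the targets (computed on labeled examples only) with the variance of the numeric descriptive features (computed on all examples), weighted by a parameter `w`. The function names, the normalization by a global variance, and the parameter `w` in the sketch below are all hypothetical.

```python
from statistics import pvariance


def normalized_variance(values, global_variance):
    """Population variance of `values`, scaled by a reference (global) variance.

    Returns 0.0 when there are too few values or the reference variance is zero.
    """
    if len(values) < 2 or global_variance == 0:
        return 0.0
    return pvariance(values) / global_variance


def ssl_impurity(targets, features, global_target_var, global_feature_var, w):
    """Hypothetical impurity of a node, mixing target and feature variance.

    targets:  one list of values per target variable (labeled examples only)
    features: one list of values per numeric feature (labeled + unlabeled)
    w:        assumed trade-off; 1.0 uses targets only (supervised),
              0.0 uses descriptive features only (ignores labels)
    """
    target_part = sum(
        normalized_variance(t, g) for t, g in zip(targets, global_target_var)
    ) / len(targets)
    feature_part = sum(
        normalized_variance(f, g) for f, g in zip(features, global_feature_var)
    ) / len(features)
    return w * target_part + (1 - w) * feature_part


# Toy node: one target seen on 2 labeled examples, one feature seen on 3 examples.
node_targets = [[1.0, 3.0]]
node_features = [[0.0, 2.0, 4.0]]
print(ssl_impurity(node_targets, node_features, [2.0], [8 / 3], w=0.5))
```

Under this assumption, setting `w` to 1.0 falls back to the purely supervised criterion, which is one way such a mechanism could guard against unlabeled data degrading performance.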
Pages: 109 - 127
Number of pages: 19
Related papers
50 in total
  • [1] Incremental predictive clustering trees for online semi-supervised multi-target regression
    Osojnik, Aljaz
    Panov, Pance
    Dzeroski, Saso
    MACHINE LEARNING, 2020, 109 (11) : 2121 - 2139
  • [2] Incremental predictive clustering trees for online semi-supervised multi-target regression
    Aljaž Osojnik
    Panče Panov
    Sašo Džeroski
    Machine Learning, 2020, 109 : 2121 - 2139
  • [3] Feature selection for semi-supervised multi-target regression using genetic algorithm
    Syed, Farrukh Hasan
    Tahir, Muhammad Atif
    Rafi, Muhammad
    Shahab, Mir Danish
    APPLIED INTELLIGENCE, 2021, 51 (12) : 8961 - 8984
  • [4] Feature selection for semi-supervised multi-target regression using genetic algorithm
    Farrukh Hasan Syed
    Muhammad Atif Tahir
    Muhammad Rafi
    Mir Danish Shahab
    Applied Intelligence, 2021, 51 : 8961 - 8984
  • [5] An ensemble-based semi-supervised feature ranking for multi-target regression problems
    Adiyeke, Esra
    Baydogan, Mustafa Gokce
    PATTERN RECOGNITION LETTERS, 2021, 148 : 36 - 42
  • [6] Online Semi-supervised Learning for Multi-target Regression in Data Streams Using AMRules
    Sousa, Ricardo
    Gama, Joao
    ADVANCES IN INTELLIGENT DATA ANALYSIS XV, 2016, 9897 : 123 - 133
  • [7] Survival analysis as semi-supervised multi-target regression for time-to-employment prediction using oblique predictive clustering trees
    Andonovikj, Viktor
    Boskoski, Pavle
    Dzeroski, Saso
    Boshkoska, Biljana Mileva
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 235
  • [8] A Unified Adversarial Learning Framework for Semi-supervised Multi-target Domain Adaptation
    Wu, Xinle
    Wang, Lei
    Wang, Shuo
    Meng, Xiaofeng
    Li, Linfeng
    Huang, Haitao
    Zhang, Xiaohong
    Yan, Jun
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2020), PT I, 2020, 12112 : 419 - 434
  • [9] Semi-supervised regression trees with application to QSAR modelling
    Levatic, Jurica
    Ceci, Michelangelo
    Stepisnik, Tomaz
    Dzeroski, Saso
    Kocev, Dragi
    EXPERT SYSTEMS WITH APPLICATIONS, 2020, 158 (158)
  • [10] Detection of Review Abuse via Semi-Supervised Binary Multi-Target Tensor Decomposition
    Yelundur, Anil R.
    Chaoji, Vineet
    Mishra, Bamdev
    KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019 : 2134 - 2144