A multiclass boosting algorithm to labeled and unlabeled data

Cited: 8
Authors
Tanha, Jafar [1 ,2 ]
Affiliations
[1] Univ Tabriz, Elect & Comp Engn Dept, Bahman 29 St, Tabriz 193954697, Iran
[2] Inst Res Fundamental Sci IPM, Sch Comp Sci, POB 19395-5746, Tehran, Iran
Keywords
Multiclass classification; Semi-supervised learning; Similarity function; Boosting; Graph; Classification; AdaBoost
DOI
10.1007/s13042-019-00951-4
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this article we focus on semi-supervised learning, that is, learning from both labeled and unlabeled data, and in particular on the multiclass semi-supervised classification problem. To solve this problem we propose a new multiclass loss function based on new codewords. The proposed loss function combines the classifier predictions, learned from the labeled data, with the pairwise similarity between labeled and unlabeled examples; its main goal is to minimize the inconsistency between the classifier predictions and the pairwise similarity. The loss function consists of two terms: the first is the multiclass margin cost on the labeled data, and the second is a regularization term on the unlabeled data that minimizes the cost of the pseudo-margin. From the resulting risk function we derive a new multiclass boosting algorithm, called GMSB, which also uses a set of optimal similarity functions for a given dataset. The results of our experiments on a number of UCI and real-world biological, text, and image datasets show that GMSB outperforms state-of-the-art boosting methods for multiclass semi-supervised learning.
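The abstract describes a two-term risk: a multiclass margin cost on the labeled examples plus a similarity-weighted pseudo-margin regularizer coupling labeled and unlabeled examples. The paper's exact codeword-based formulation is not given here, so the sketch below is only illustrative, assuming exponential costs for both terms; the function names, the pseudo-margin definition, and the weighting `lam` are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def multiclass_margin_cost(F_l, y_l):
    """Exponential multiclass margin cost on labeled data (assumed form).

    F_l: (n_l, K) classifier scores; y_l: (n_l,) true labels.
    Penalizes small margins between the true-class score and each
    wrong-class score.
    """
    n, K = F_l.shape
    margins = F_l[np.arange(n), y_l][:, None] - F_l  # margin to every class
    mask = np.ones_like(F_l, dtype=bool)
    mask[np.arange(n), y_l] = False                  # drop the true class
    return np.exp(-margins[mask]).sum()

def pairwise_consistency_penalty(y_l, F_u, S, lam=1.0):
    """Similarity-weighted pseudo-margin regularizer on unlabeled data
    (assumed form).

    y_l: (n_l,) labels; F_u: (n_u, K) scores on unlabeled examples;
    S: (n_l, n_u) pairwise similarities. A pair costs a lot when a
    labeled point and a similar unlabeled point disagree, i.e. when the
    unlabeled point's pseudo-margin toward the labeled class is small.
    """
    # pm[i, j]: score of unlabeled example j on the class of labeled
    # example i, minus j's best score (a pseudo-margin, <= 0).
    pm = F_u[:, y_l].T - F_u.max(axis=1)[None, :]
    return lam * (S * np.exp(-pm)).sum()

def semi_supervised_risk(F_l, y_l, F_u, S, lam=1.0):
    """Combined two-term risk in the spirit of the abstract."""
    return multiclass_margin_cost(F_l, y_l) + \
           pairwise_consistency_penalty(y_l, F_u, S, lam)
```

Minimizing the second term pushes the classifier to score similar labeled/unlabeled pairs consistently, which is the inconsistency-minimization goal stated in the abstract; a boosting round would fit a weak learner against the gradient of this combined risk.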
Pages: 3647-3665 (19 pages)