A multiclass boosting algorithm to labeled and unlabeled data

Cited by: 8
Authors
Tanha, Jafar [1 ,2 ]
Affiliations
[1] Univ Tabriz, Elect & Comp Engn Dept, Bahman 29 St, Tabriz 193954697, Iran
[2] Inst Res Fundamental Sci IPM, Sch Comp Sci, POB 19395-5746, Tehran, Iran
Keywords
Multiclass classification; Semi-supervised learning; Similarity function; Boosting; Graph; Classification; AdaBoost
DOI
10.1007/s13042-019-00951-4
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article we focus on semi-supervised learning, the task of learning from both labeled and unlabeled data. We consider, in particular, the multiclass semi-supervised classification problem. To solve it, we propose a new multiclass loss function based on new codewords. The proposed loss function combines the classifier predictions, derived from the labeled data, with the pairwise similarity between labeled and unlabeled examples; its main goal is to minimize the inconsistency between the classifier predictions and the pairwise similarity. The loss function consists of two terms: the first is the multiclass margin cost of the labeled data, and the second is a regularization term on the unlabeled data that minimizes the cost of the pseudo-margin on unlabeled examples. From the resulting risk function we derive a new multiclass boosting algorithm, called GMSB. The derived algorithm also uses a set of optimal similarity functions for a given dataset. The results of our experiments on a number of UCI and real-world biological, text, and image datasets show that GMSB outperforms state-of-the-art boosting methods for multiclass semi-supervised learning.
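The abstract describes the two-term structure of the loss (a multiclass margin cost on labeled data plus a similarity-weighted pseudo-margin regularizer on unlabeled data) but this record does not reproduce its exact form. As a rough illustration only, here is a hypothetical NumPy sketch of such a loss; the function name, the exponential cost, and the array shapes are assumptions for illustration, not the paper's GMSB definitions (which rely on specific codewords):

```python
import numpy as np

def semi_supervised_multiclass_loss(F_l, y_l, F_u, S, lam=1.0):
    """Illustrative two-term loss in the spirit of the abstract (NOT GMSB itself).

    F_l: (n_l, K) classifier scores for labeled examples
    y_l: (n_l,)   integer labels in {0, ..., K-1}
    F_u: (n_u, K) classifier scores for unlabeled examples
    S:   (n_l, n_u) pairwise similarities between labeled and unlabeled points
    lam: weight of the regularization term on unlabeled data
    """
    n_l, K = F_l.shape
    # Term 1: exponential multiclass margin cost on labeled data
    # (margin = true-class score minus the best competing score).
    true_scores = F_l[np.arange(n_l), y_l]
    masked = F_l.astype(float).copy()
    masked[np.arange(n_l), y_l] = -np.inf
    margins = true_scores - masked.max(axis=1)
    labeled_cost = np.exp(-margins).sum()
    # Term 2: similarity-weighted pseudo-margin penalty — an unlabeled point
    # pays a cost when it scores low on the class of a labeled point it
    # resembles, penalizing inconsistency between predictions and similarity.
    unlabeled_cost = (S * np.exp(-F_u[:, y_l].T)).sum()
    return labeled_cost + lam * unlabeled_cost

# Toy check: one unlabeled point similar only to the first labeled point.
F_l = np.array([[2.0, 0.0], [0.0, 2.0]])
y_l = np.array([0, 1])
S = np.array([[1.0], [0.0]])
consistent = semi_supervised_multiclass_loss(F_l, y_l, np.array([[1.5, 0.0]]), S)
inconsistent = semi_supervised_multiclass_loss(F_l, y_l, np.array([[0.0, 1.5]]), S)
```

In this sketch the loss is smaller when the unlabeled prediction agrees with the similarity structure (`consistent < inconsistent`), mirroring the abstract's stated goal of minimizing the inconsistency between classifier predictions and pairwise similarity.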
Pages: 3647-3665
Page count: 19
Related Papers
50 items in total
  • [1] A multiclass boosting algorithm to labeled and unlabeled data
    Jafar Tanha
    [J]. International Journal of Machine Learning and Cybernetics, 2019, 10: 3647-3665
  • [2] Graph-based boosting algorithm to learn labeled and unlabeled data
    Liu, Zheng
    Jin, Wei
    Mu, Ying
    [J]. Pattern Recognition, 2020, 106
  • [3] A Boosting Algorithm for Training from Only Unlabeled Data
    Zhao, Yawen
    Yue, Lin
    Xu, Miao
    [J]. Advanced Data Mining and Applications, ADMA 2022, Pt II, 2022, 13726: 459-473
  • [4] An Adaptive Multiclass Boosting Algorithm for Classification
    Wang, Shixun
    Pan, Peng
    Lu, Yansheng
    [J]. Proceedings of the 2014 International Joint Conference on Neural Networks (IJCNN), 2014: 1159-1166
  • [5] Relevance feedback algorithm based on learning from labeled and unlabeled data
    Singh, R
    Kothari, R
    [J]. 2003 International Conference on Multimedia and Expo, Vol I, Proceedings, 2003: 433-436
  • [6] Learning from labeled and unlabeled data
    Kothari, R
    Jain, V
    [J]. Proceedings of the 2002 International Joint Conference on Neural Networks, Vols 1-3, 2002: 2803-2808
  • [7] Labeled and unlabeled data in text categorization
    Silva, C
    Ribeiro, B
    [J]. 2004 IEEE International Joint Conference on Neural Networks, Vols 1-4, Proceedings, 2004: 2971-2976
  • [8] Feature extractions using labeled and unlabeled data
    Kuo, BC
    Shen, TW
    Chang, CH
    Hung, CC
    [J]. IGARSS 2005: IEEE International Geoscience and Remote Sensing Symposium, Vols 1-8, Proceedings, 2005: 1257-1260
  • [9] Combining labeled and unlabeled data with graph embedding
    Zhao, Haitao
    [J]. Neurocomputing, 2006, 69(16-18): 2385-2389
  • [10] Combining labeled and unlabeled data for spam classification
    Yang, Zhen
    Wang, Jian
    Xu, Weiran
    Guo, Jun
    [J]. Dynamics of Continuous, Discrete and Impulsive Systems, Series B: Applications & Algorithms, 2007, 14: 1476-1479