A multiclass boosting algorithm to labeled and unlabeled data

Cited by: 8
Authors
Tanha, Jafar [1 ,2 ]
Affiliations
[1] Univ Tabriz, Elect & Comp Engn Dept, Bahman 29 St, Tabriz 193954697, Iran
[2] Inst Res Fundamental Sci IPM, Sch Comp Sci, POB 19395-5746, Tehran, Iran
Keywords
Multiclass classification; Semi-supervised learning; Similarity function; Boosting; Graph; Classification; AdaBoost
DOI
10.1007/s13042-019-00951-4
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article we focus on semi-supervised learning, the task of learning from both labeled and unlabeled data. We consider, in particular, the multiclass semi-supervised classification problem. To solve it, we propose a new multiclass loss function based on new codewords. The proposed loss function combines the classifier predictions, derived from the labeled data, with the pairwise similarity between labeled and unlabeled examples; its main goal is to minimize the inconsistency between the classifier predictions and the pairwise similarity. The loss function consists of two terms: the first is the multiclass margin cost of the labeled data, and the second is a regularization term on the unlabeled data that minimizes the cost of the pseudo-margin on unlabeled examples. From the resulting risk function we derive a new multiclass boosting algorithm, called GMSB. The derived algorithm also uses a set of optimal similarity functions for a given dataset. The results of our experiments on a number of UCI and real-world biological, text, and image datasets show that GMSB outperforms state-of-the-art boosting methods for multiclass semi-supervised learning.
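The abstract describes the two-term structure of the loss (a multiclass margin cost on labeled data plus a similarity-weighted pseudo-margin regularizer on unlabeled data) but this record does not reproduce its exact form. As a rough illustration only, here is a hypothetical NumPy sketch of such a loss; the function name, the exponential cost, and the array shapes are assumptions for illustration, not the paper's GMSB definitions (which rely on specific codewords):

```python
import numpy as np

def semi_supervised_multiclass_loss(F_l, y_l, F_u, S, lam=1.0):
    """Illustrative two-term loss in the spirit of the abstract (NOT GMSB itself).

    F_l: (n_l, K) classifier scores for labeled examples
    y_l: (n_l,)   integer labels in {0, ..., K-1}
    F_u: (n_u, K) classifier scores for unlabeled examples
    S:   (n_l, n_u) pairwise similarities between labeled and unlabeled points
    lam: weight of the regularization term on unlabeled data
    """
    n_l, K = F_l.shape
    # Term 1: exponential multiclass margin cost on labeled data
    # (margin = true-class score minus the best competing score).
    true_scores = F_l[np.arange(n_l), y_l]
    masked = F_l.astype(float).copy()
    masked[np.arange(n_l), y_l] = -np.inf
    margins = true_scores - masked.max(axis=1)
    labeled_cost = np.exp(-margins).sum()
    # Term 2: similarity-weighted pseudo-margin penalty — an unlabeled point
    # pays a cost when it scores low on the class of a labeled point it
    # resembles, penalizing inconsistency between predictions and similarity.
    unlabeled_cost = (S * np.exp(-F_u[:, y_l].T)).sum()
    return labeled_cost + lam * unlabeled_cost

# Toy check: one unlabeled point similar only to the first labeled point.
F_l = np.array([[2.0, 0.0], [0.0, 2.0]])
y_l = np.array([0, 1])
S = np.array([[1.0], [0.0]])
consistent = semi_supervised_multiclass_loss(F_l, y_l, np.array([[1.5, 0.0]]), S)
inconsistent = semi_supervised_multiclass_loss(F_l, y_l, np.array([[0.0, 1.5]]), S)
```

In this sketch the loss is smaller when the unlabeled prediction agrees with the similarity structure (`consistent < inconsistent`), mirroring the abstract's stated goal of minimizing the inconsistency between classifier predictions and pairwise similarity.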
Pages: 3647-3665
Page count: 19
Related Papers
50 items in total
  • [1] A multiclass boosting algorithm to labeled and unlabeled data
    Jafar Tanha
    [J]. International Journal of Machine Learning and Cybernetics, 2019, 10: 3647-3665
  • [2] Graph-based boosting algorithm to learn labeled and unlabeled data
    Liu, Zheng
    Jin, Wei
    Mu, Ying
    [J]. Pattern Recognition, 2020, 106
  • [3] A Boosting Algorithm for Training from Only Unlabeled Data
    Zhao, Yawen
    Yue, Lin
    Xu, Miao
    [J]. Advanced Data Mining and Applications, ADMA 2022, Pt II, 2022, 13726: 459-473
  • [4] An Adaptive Multiclass Boosting Algorithm for Classification
    Wang, Shixun
    Pan, Peng
    Lu, Yansheng
    [J]. Proceedings of the 2014 International Joint Conference on Neural Networks (IJCNN), 2014: 1159-1166
  • [5] Relevance feedback algorithm based on learning from labeled and unlabeled data
    Singh, R
    Kothari, R
    [J]. 2003 International Conference on Multimedia and Expo, Vol I, Proceedings, 2003: 433-436
  • [6] Learning from labeled and unlabeled data
    Kothari, R
    Jain, V
    [J]. Proceedings of the 2002 International Joint Conference on Neural Networks, Vols 1-3, 2002: 2803-2808
  • [7] Labeled and unlabeled data in text categorization
    Silva, C
    Ribeiro, B
    [J]. 2004 IEEE International Joint Conference on Neural Networks, Vols 1-4, Proceedings, 2004: 2971-2976
  • [8] Feature extractions using labeled and unlabeled data
    Kuo, BC
    Shen, TW
    Chang, CH
    Hung, CC
    [J]. IGARSS 2005: IEEE International Geoscience and Remote Sensing Symposium, Vols 1-8, Proceedings, 2005: 1257-1260
  • [9] Combining labeled and unlabeled data with graph embedding
    Zhao, Haitao
    [J]. Neurocomputing, 2006, 69(16-18): 2385-2389
  • [10] Combining labeled and unlabeled data for spam classification
    Yang, Zhen
    Wang, Jian
    Xu, Weiran
    Guo, Jun
    [J]. Dynamics of Continuous, Discrete and Impulsive Systems, Series B: Applications & Algorithms, 2007, 14: 1476-1479