Graph-based boosting algorithm to learn labeled and unlabeled data

Cited by: 8
Authors
Liu, Zheng [1]
Jin, Wei [1]
Mu, Ying [1]
Affiliations
[1] Zhejiang Univ, Res Ctr Analyt Instrumentat, Inst Cyber Syst & Control, State Key Lab Ind Control Technol, Hangzhou 310027, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Graph; Boosting; Semi-supervised learning; Imbalance learning; SEMI-SUPERVISED CLASSIFICATION; MACHINE; PREDICTION; SELECTION; ADABOOST;
DOI
10.1016/j.patcog.2020.107417
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Ensemble learning is an effective technique for learning from data by combining multiple models. The combined models, however, are usually supervised learning algorithms that need a large amount of labeled data to tune their parameters. Some ensemble learning algorithms have been proposed to exploit the information in unlabeled data, but because labeled data are scarce, these methods have to learn from samples with pseudo-labels, which inevitably introduce wrong information during training. In this paper, we propose a novel graph-based boosting (GBB) algorithm to learn from labeled and unlabeled data. GBB is a framework that combines many models linearly, and no pseudo-labels are used during training. At each iteration, GBB assigns a new weighting vector for the labeled samples and a transformed similarity matrix for all samples to train the combined model. We also extend GBB to learn from imbalanced data by adding a weighting vector for the labeled data; the extension is termed weighted GBB (WGBB). Finally, 14 relatively balanced datasets and 22 imbalanced datasets are used to validate the performance of GBB and WGBB, respectively. Experimental results show that, compared with other related algorithms, GBB achieves competitive performance and WGBB has a clear advantage in classifying imbalanced data. (C) 2020 Elsevier Ltd. All rights reserved.
Pages: 11
Related papers
50 in total
  • [2] A multiclass boosting algorithm to labeled and unlabeled data
    Jafar Tanha
    [J]. International Journal of Machine Learning and Cybernetics, 2019, 10 : 3647 - 3665
  • [3] Combining labeled and unlabeled data with graph embedding
    Zhao, Haitao
    [J]. NEUROCOMPUTING, 2006, 69 (16-18) : 2385 - 2389
  • [4] A graph-based approach for positive and unlabeled learning
    Carnevali, Julio César
    Geraldeli Rossi, Rafael
    Milios, Evangelos
    de Andrade Lopes, Alneu
    [J]. Information Sciences, 2021, 580 : 655 - 672
  • [5] A graph-based approach for positive and unlabeled learning
    Carnevali, Julio Cesar
    Rossi, Rafael Geraldeli
    Milios, Evangelos
    Lopes, Alneu de Andrade
    [J]. INFORMATION SCIENCES, 2021, 580 : 655 - 672
  • [6] A Graph-Based Approach to Learn Semantic Descriptions of Data Sources
    Taheriyan, Mohsen
    Knoblock, Craig A.
    Szekely, Pedro
    Ambite, Jose Luis
    [J]. SEMANTIC WEB - ISWC 2013, PART I, 2013, 8218 : 607 - 623
  • [7] Graph-based data mining algorithm research
    Hu, Zuoting
    Dong, Lanfang
    Wang, Xun
    [J]. Jisuanji Gongcheng/Computer Engineering, 2006, 32 (03) : 76 - 78
  • [8] Relevance feedback algorithm based on learning from labeled and unlabeled data
    Singh, R
    Kothari, R
    [J]. 2003 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I, PROCEEDINGS, 2003 : 433 - 436
  • [9] Knowledge Graph-based Algorithm for Text Data Mining
    Zhao, Yu-Feng
    He, Jie
    [J]. Journal of Network Intelligence, 2024, 9 (03) : 1892 - 1906
  • [10] A Boosting Algorithm for Training from Only Unlabeled Data
    Zhao, Yawen
    Yue, Lin
    Xu, Miao
    [J]. ADVANCED DATA MINING AND APPLICATIONS, ADMA 2022, PT II, 2022, 13726 : 459 - 473