Combination of Information in Labeled and Unlabeled Data via Evidence Theory

Cited by: 4
Authors: Huang L. [1]
Affiliation: [1] Northwestern Polytechnical University, School of Automation, Xi'an
Keywords: Belief functions; evidence theory (ET); evidential reasoning; fuzzy c-means (FCM) clustering; information fusion; pattern classification; semisupervised learning (SSL); two-views co-training
DOI: 10.1109/TAI.2023.3316194
Abstract
For classification with few labeled and massive unlabeled patterns, co-training, which uses information in both labeled and unlabeled data to classify query patterns, is often employed to train classifiers in two distinct views. The classifiers teach each other by adding high-confidence unlabeled patterns to the training dataset of the other view. However, adding such patterns directly often harms the retrained classifiers, because some patterns with wrong predictions enter the training dataset. These wrong predictions must be accounted for to improve performance. To this end, we present a method called Combination of Information in Labeled and Unlabeled (CILU) data, based on evidence theory, to effectively extract and fuse complementary knowledge from labeled and unlabeled data. In CILU, patterns are characterized by two distinct views, and unlabeled patterns with high-confidence predictions are first added to the other view. In each view, we then train two classifiers from the few labeled training data and the high-confidence unlabeled patterns. The classifiers are fused by evidence theory, and their weights, which aim to reduce the harmful influence of wrong predictions, are learned by constructing an objective function on the labeled data. Because complementary information exists between the two distinct views, the fused classifiers of the two views are also combined. To extract more useful information from the unlabeled data, a semi-supervised Fuzzy C-means clustering paradigm is also employed to yield clustering results. For a query pattern, the classification results and clustering results obtained by the combined classifiers and the clustering partition are integrated to make the final class decision. © 2023 IEEE.
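The abstract's core fusion step rests on evidence theory. As a minimal sketch of that idea, the following implements Dempster's classical rule of combination for two mass functions, such as the belief outputs of two view-specific classifiers. This is a generic illustration only: the paper's learned classifier weights and discounting scheme are not reproduced here, and the mass-function representation (dicts keyed by `frozenset` focal elements) is an assumption for the example.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    m1, m2: dicts mapping frozenset focal elements to belief mass
    (each dict's masses should sum to 1). Returns the normalized
    combined mass function; raises on total conflict.
    """
    combined = {}
    conflict = 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            # Intersecting focal elements reinforce each other.
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            # Disjoint focal elements contribute to the conflict mass K.
            conflict += mb * mc
    if conflict >= 1.0:
        raise ValueError("total conflict: masses cannot be combined")
    # Normalize by 1 - K (Dempster's normalization).
    return {a: v / (1.0 - conflict) for a, v in combined.items()}

# Hypothetical beliefs of two "view" classifiers over classes {a, b};
# frozenset("ab") is the ignorance mass assigned to the whole frame.
m_view1 = {frozenset("a"): 0.7, frozenset("ab"): 0.3}
m_view2 = {frozenset("a"): 0.6, frozenset("b"): 0.2, frozenset("ab"): 0.2}
fused = dempster_combine(m_view1, m_view2)
# Agreement on class "a" is reinforced; the fused masses sum to 1.
```

Note the design choice of `frozenset` keys: intersections of focal elements fall out of the built-in `&` operator, which keeps the combination loop a direct transcription of the rule.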
Pages: 2179-2192 (13 pages)