Semi-supervised learning using multiple clusterings with limited labeled data

Cited by: 36
Authors
Forestier, Germain [1]
Wemmert, Cedric [2]
Affiliations
[1] Univ Haute Alsace, MIPS, Mulhouse, France
[2] Univ Strasbourg, ICube, Strasbourg, France
Keywords
Semi-supervised learning; Classification; Pattern recognition; Remote sensing; UNLABELED DATA; CLASSIFICATION; FRAMEWORK; DIVERSITY; ENSEMBLE;
DOI
10.1016/j.ins.2016.04.040
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Supervised classification consists of learning a predictive model from a set of labeled samples. It is generally accepted that a predictive model's accuracy usually increases as more labeled samples become available. Labeled samples are generally difficult to obtain, as the labeling step is often performed manually. In contrast, unlabeled samples are easily available. As the labeling task is tedious and time-consuming, users generally provide a very limited number of labeled objects. However, designing approaches able to work efficiently with a very limited number of labeled samples is highly challenging. In this context, semi-supervised approaches have been proposed to leverage both labeled and unlabeled data. In this paper, we focus on cases where the number of labeled samples is very limited. We review and formalize eight semi-supervised learning algorithms and introduce a new method that combines supervised and unsupervised learning in order to use both labeled and unlabeled data. The main idea of this method is to produce new features derived from a first step of data clustering. These features are then used to enrich the description of the input data, leading to a better use of the data distribution. The efficiency of all the methods is compared on various artificial and UCI datasets, and on the classification of a very high resolution remote sensing image. The experiments reveal that our method shows good results, especially when the number of labeled samples is very limited. They also confirm that combining labeled and unlabeled data is very useful in pattern recognition. (C) 2016 Elsevier Inc. All rights reserved.
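The core idea in the abstract, enriching each sample with features derived from clustering before training on the few labeled samples, might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which clustering algorithm, feature encoding, or classifier the paper uses, so plain k-means, one-hot cluster memberships, and a 1-nearest-neighbor classifier are stand-ins, and all function names (`kmeans`, `enrich`, `predict_1nn`) are hypothetical.

```python
import random

def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed, iters=20):
    """Plain Lloyd's k-means (assumed stand-in); returns a cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from the data
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        assign = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def enrich(points, ks=(2, 3, 4), seed=0):
    """Append one-hot cluster memberships from several clusterings (assumed encoding)."""
    enriched = [list(p) for p in points]
    for i, k in enumerate(ks):
        assign = kmeans(points, k, seed + i)  # clustering uses ALL points, labeled or not
        for row, a in zip(enriched, assign):
            row.extend(1.0 if a == c else 0.0 for c in range(k))
    return enriched

def predict_1nn(train_x, train_y, x):
    """1-nearest-neighbor prediction (assumed stand-in classifier)."""
    best = min(range(len(train_x)), key=lambda i: sqdist(train_x[i], x))
    return train_y[best]

# Toy data: two Gaussian blobs, with only ONE labeled sample per class.
rng = random.Random(1)
blob_a = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(30)]
blob_b = [[rng.gauss(4, 0.5), rng.gauss(4, 0.5)] for _ in range(30)]
points = blob_a + blob_b

features = enrich(points)            # original 2 features + 2 + 3 + 4 cluster features
labeled_idx = [0, 30]                # the only samples whose labels are known
train_x = [features[i] for i in labeled_idx]
train_y = ["a", "b"]
predictions = [predict_1nn(train_x, train_y, f) for f in features]
```

The unlabeled samples influence the classifier only through the clustering step: samples that fall into the same clusters share the appended features, so they sit closer together in the enriched space than their raw coordinates alone would place them.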
Pages: 48-65
Number of pages: 18
Related papers
50 in total
  • [1] Semi-supervised classification using multiple clusterings
    Yu G.X.
    Feng L.
    Yao G.J.
    Wang J.
    [J]. Wang, J. (kingjun@swu.edu.cn), 1600, Izdatel'stvo Nauka (26): : 681 - 687
  • [2] SEMI-SUPERVISED HIERARCHY LEARNING USING MULTIPLE-LABELED DATA
    Javadi, Ailar
    Gray, Alexander
    Anderson, David
    Berisha, Visar
    [J]. 2011 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2011,
  • [3] Identification of coal structures by semi-supervised learning based on limited labeled logging data
    Shi, Jinxiong
    Zhao, Xiangyuan
    Zeng, Lianbo
    Zhang, Yunzhao
    Dong, Shaoqun
    [J]. FUEL, 2023, 337
  • [4] Sentiment analysis using semi-supervised learning with few labeled data
    Pan, Yuhao
    Chen, Zhiqun
    Suzuki, Yoshimi
    Fukumoto, Fumiyo
    Nishizaki, Hiromitsu
    [J]. 2020 INTERNATIONAL CONFERENCE ON CYBERWORLDS (CW 2020), 2020, : 231 - 234
  • [5] A semi-supervised clustering approach using labeled data
    Taghizabet, A.
    Tanha, J.
    Amini, A.
    Mohammadzadeh, J.
    [J]. SCIENTIA IRANICA, 2023, 30 (01) : 104 - 115
  • [6] Craniomaxillofacial landmarks detection in CT scans with limited labeled data via semi-supervised learning
    Tao, Leran
    Zhang, Xu
    Yang, Yang
    Cheng, Mengjia
    Zhang, Rongbin
    Qian, Hongjun
    Wen, Yaofeng
    Yu, Hongbo
    [J]. HELIYON, 2024, 10 (14)
  • [7] Semi-supervised Learning for Sentiment Classification using Small Number of Labeled Data
    Lee, Vivian Lay Shan
    Gan, Keng Hoon
    Tan, Tien Ping
    Abdullah, Rosni
    [J]. FIFTH INFORMATION SYSTEMS INTERNATIONAL CONFERENCE, 2019, 161 : 577 - 584
  • [8] A Semi-Supervised Learning Method to Remedy the Lack of Labeled Data
    Nhut-Quang Nguyen
    Thanh-Sach Le
    [J]. 2021 15TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND APPLICATIONS (ACOMP 2021), 2021, : 78 - 84
  • [9] Semi-supervised learning from unbalanced labeled data: An improvement
    Huang, Te-Ming
    Kecman, Vojislav
    [J]. INTERNATIONAL JOURNAL OF KNOWLEDGE-BASED AND INTELLIGENT ENGINEERING SYSTEMS, 2006, 10 (01) : 21 - 27
  • [10] Semi-supervised learning from unbalanced labeled data - An improvement
    Huang, TM
    Kecman, V
    [J]. KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 3, PROCEEDINGS, 2004, 3215 : 802 - 808