Semi-supervised learning using multiple clusterings with limited labeled data

Cited by: 36
Authors
Forestier, Germain [1]
Wemmert, Cedric [2]
Affiliations
[1] Univ Haute Alsace, MIPS, Mulhouse, France
[2] Univ Strasbourg, ICube, Strasbourg, France
Keywords
Semi-supervised learning; Classification; Pattern recognition; Remote sensing; UNLABELED DATA; CLASSIFICATION; FRAMEWORK; DIVERSITY; ENSEMBLE;
DOI
10.1016/j.ins.2016.04.040
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Supervised classification consists of learning a predictive model from a set of labeled samples. It is accepted that the accuracy of predictive models usually increases as more labeled samples become available. Labeled samples are generally difficult to obtain, as the labeling step is often performed manually. In contrast, unlabeled samples are easily available. As the labeling task is tedious and time-consuming, users generally provide a very limited number of labeled objects. However, designing approaches able to work efficiently with a very limited number of labeled samples is highly challenging. In this context, semi-supervised approaches have been proposed to leverage both labeled and unlabeled data. In this paper, we focus on cases where the number of labeled samples is very limited. We review and formalize eight semi-supervised learning algorithms and introduce a new method that combines supervised and unsupervised learning in order to use both labeled and unlabeled data. The main idea of this method is to produce new features derived from a first step of data clustering. These features are then used to enrich the description of the input data, leading to a better use of the data distribution. The efficiency of all the methods is compared on various artificial and UCI datasets, and on the classification of a very high resolution remote sensing image. The experiments reveal that our method shows good results, especially when the number of labeled samples is very limited. They also confirm that combining labeled and unlabeled data is very useful in pattern recognition. (C) 2016 Elsevier Inc. All rights reserved.
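The idea of enriching the input description with clustering-derived features can be illustrated with a minimal sketch. This is not the authors' algorithm: the plain k-means routine, the choice of cluster counts (2, 3, 4), the one-hot encoding of cluster memberships, the 1-nearest-neighbor classifier, and all function names below are assumptions made purely for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed, iters=20):
    """Plain Lloyd's k-means (illustrative); returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct points as initial centers
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: each non-empty cluster's center moves to its mean.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def enrich(points, ks=(2, 3, 4), seed=0):
    """Append one-hot cluster memberships from several clusterings to each point.

    Clustering uses ALL points (labeled and unlabeled), so the new features
    carry information about the overall data distribution.
    """
    enriched = [list(p) for p in points]
    for i, k in enumerate(ks):
        assign = kmeans(points, k, seed + i)
        for row, a in zip(enriched, assign):
            row.extend(1.0 if a == c else 0.0 for c in range(k))
    return enriched

def predict_1nn(train_x, train_y, x):
    """Label of the nearest labeled sample (1-nearest-neighbor)."""
    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], x))
    return train_y[best]

# Toy data (hypothetical): two Gaussian blobs, only one labeled sample per class.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)] + \
    [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(20)]
y_true = [0] * 20 + [1] * 20
labeled = [0, 20]  # indices of the two labeled samples

X_new = enrich(X)  # original 2 features + 2 + 3 + 4 membership features
train_x = [X_new[i] for i in labeled]
train_y = [y_true[i] for i in labeled]
preds = [predict_1nn(train_x, train_y, x) for x in X_new]
accuracy = sum(p == t for p, t in zip(preds, y_true)) / len(X)
print(accuracy)
```

In this sketch the membership features are binary while the original features keep their raw scale, so the original features dominate the distance; in practice some rescaling or weighting would probably be needed, and that choice is left open here.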
Pages: 48-65
Page count: 18
Related papers
50 in total
  • [41] Learning And Predicting Diabetes Data Sets Using Semi-Supervised Learning
    Tayal, Radhika
    Shankar, Achyut
    [J]. PROCEEDINGS OF THE CONFLUENCE 2020: 10TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING, 2020, : 385 - 389
  • [42] Fraud Detection in Big Data using Supervised and Semi-supervised Learning Techniques
    Melo-Acosta, German E.
    Duitama-Munoz, Freddy
    Arias-Londono, Julian D.
    [J]. 2017 IEEE COLOMBIAN CONFERENCE ON COMMUNICATIONS AND COMPUTING (COLCOM), 2017,
  • [43] Semi-supervised Learning over Streaming Data using MOA
    Le Nguyen, Minh Huong
    Gomes, Heitor Murilo
    Bifet, Albert
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 553 - 562
  • [44] Big data analytics using semi-supervised learning methods
    Frumosu, Flavia D.
    Kulahci, Murat
    [J]. QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, 2018, 34 (07) : 1413 - 1423
  • [45] Improved Semi-Supervised Learning with Multiple Graphs
    Viswanathan, Krishnamurthy
    Sachdeva, Sushant
    Tomkins, Andrew
    Ravi, Sujith
    [J]. 22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [46] On semi-supervised multiple representation behavior learning
    Lu, Ruqian
    Hou, Shengluan
    [J]. JOURNAL OF COMPUTATIONAL SCIENCE, 2020, 46
  • [47] FlatMatch: Bridging Labeled Data and Unlabeled Data with Cross-Sharpness for Semi-Supervised Learning
    Huang, Zhuo
    Shen, Li
    Yu, Jun
    Han, Bo
    Liu, Tongliang
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [48] Improving Supervised Learning with Multiple Clusterings
    Wemmert, Cedric
    Forestier, Germain
    Derivaux, Sebastien
    [J]. APPLICATIONS OF SUPERVISED AND UNSUPERVISED ENSEMBLE METHODS, 2009, 245 : 135 - 149
  • [49] Barely-Supervised Learning: Semi-supervised Learning with Very Few Labeled Images
    Lucas, Thomas
    Weinzaepfel, Philippe
    Rogez, Gregory
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 1881 - 1889
  • [50] SEMI-SUPERVISED HANDWRITTEN DIGIT RECOGNITION USING VERY FEW LABELED DATA
    Van Vaerenbergh, Steven
    Santamaria, Ignacio
    Barbano, Paolo Emilio
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 2136 - 2139