A survey on semi-supervised learning

Cited by: 1277
Authors
Van Engelen, Jesper E. [1]
Hoos, Holger H. [1, 2]
Affiliations
[1] Leiden Univ, Leiden Inst Adv Comp Sci, Leiden, Netherlands
[2] Univ British Columbia, Dept Comp Sci, Vancouver, BC, Canada
Keywords
Semi-supervised learning; Machine learning; Classification; UNLABELED DATA; RANDOM FOREST; MANIFOLD REGULARIZATION; ROBUST; CLASSIFICATION; ALGORITHM; MACHINE; SOFTWARE; DROPOUT; GRAPH
DOI
10.1007/s10994-019-05855-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Semi-supervised learning is the branch of machine learning concerned with using labelled as well as unlabelled data to perform certain learning tasks. Conceptually situated between supervised and unsupervised learning, it permits harnessing the large amounts of unlabelled data available in many use cases in combination with typically smaller sets of labelled data. In recent years, research in this area has followed the general trends observed in machine learning, with much attention directed at neural network-based models and generative learning. The literature on the topic has also expanded in volume and scope, now encompassing a broad spectrum of theory, algorithms and applications. However, no recent surveys exist to collect and organize this knowledge, impeding the ability of researchers and engineers alike to utilize it. Filling this void, we present an up-to-date overview of semi-supervised learning methods, covering earlier work as well as more recent advances. We focus primarily on semi-supervised classification, where the large majority of semi-supervised learning research takes place. Our survey aims to provide researchers and practitioners new to the field as well as more advanced readers with a solid understanding of the main approaches and algorithms developed over the past two decades, with an emphasis on the most prominent and currently relevant work. Furthermore, we propose a new taxonomy of semi-supervised classification algorithms, which sheds light on the different conceptual and methodological approaches for incorporating unlabelled data into the training process. Lastly, we show how the fundamental assumptions underlying most semi-supervised learning algorithms are closely connected to each other, and how they relate to the well-known semi-supervised clustering assumption.
Pages: 373-440
Page count: 68