Data preprocessing in semi-supervised SVM classification

Cited by: 16
Authors
Astorino, A. [2]
Gorgone, E. [1]
Gaudioso, M. [1]
Pallaschke, D. [3]
Affiliations
[1] Univ Calabria, Dipartimento Elettron Informat & Sistemist, I-87036 Arcavacata Di Rende, CS, Italy
[2] CNR, Ist Calcolo & Reti Ad Alte Prestaz, I-87036 Arcavacata Di Rende, CS, Italy
[3] Univ Karlsruhe, Inst Operat Res, D-76128 Karlsruhe, Germany
Keywords
data classification; semi-supervised learning; SVM; nonsmooth optimization; OPTIMIZATION TECHNIQUES;
DOI
10.1080/02331931003692557
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research];
Discipline classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
The literature on semi-supervised binary classification has demonstrated that useful information can be gathered not only from samples whose class membership is known in advance, but also from unlabelled ones. In fact, in semi-supervised support vector machine (SVM) models, both labelled and unlabelled samples contribute to the definition of an appropriate optimization model for finding a good-quality separating hyperplane. In particular, the optimization approaches devised in this context are basically of two types: a mixed integer linear programming problem, and a continuous optimization problem characterized by an objective function which is nonsmooth and nonconvex. Both problems become hard to solve as the number of unlabelled points increases. In this article, we present a data preprocessing technique aimed at reducing the number of unlabelled points that enter the computational model, without excessively worsening the classification performance of the overall process. The approach is based on the concept of separating sets and can be implemented with a reasonable computational effort. The results of numerical experiments on several benchmark datasets are also reported.
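To make the "nonsmooth and nonconvex" objective mentioned in the abstract concrete, the following is a minimal sketch of the continuous formulation commonly used in the semi-supervised SVM literature, not necessarily the exact model of this paper. The function name `s3vm_objective` and the penalty weights `C_lab` and `C_unl` are assumptions introduced for illustration. Labelled points contribute the usual hinge loss, while each unlabelled point contributes max(0, 1 - |w·x + b|), which penalizes points falling inside the margin on either side.

```python
import numpy as np

def s3vm_objective(w, b, X_lab, y_lab, X_unl, C_lab=1.0, C_unl=1.0):
    """Evaluate a standard semi-supervised SVM objective (illustrative only).

    C_lab and C_unl are hypothetical penalty weights, not taken from the paper.
    """
    # Hinge loss on labelled points: max(0, 1 - y * (w.x + b)).
    hinge_lab = np.maximum(0.0, 1.0 - y_lab * (X_lab @ w + b))
    # Symmetric hinge on unlabelled points: max(0, 1 - |w.x + b|).
    # The absolute value makes this term nonsmooth and nonconvex in (w, b).
    hinge_unl = np.maximum(0.0, 1.0 - np.abs(X_unl @ w + b))
    return 0.5 * np.dot(w, w) + C_lab * hinge_lab.sum() + C_unl * hinge_unl.sum()

# Tiny example: two labelled points and two unlabelled points in the plane.
w = np.array([1.0, 0.0])
X_lab = np.array([[2.0, 0.0], [-2.0, 0.0]])
y_lab = np.array([1.0, -1.0])
X_unl = np.array([[0.5, 0.0], [3.0, 0.0]])
value = s3vm_objective(w, 0.0, X_lab, y_lab, X_unl)
```

The objective has one loss term per unlabelled point, so the size of the model grows with the unlabelled set, which is the difficulty that motivates reducing the number of unlabelled points before optimization.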
Pages: 143-151
Page count: 9
Related papers
50 in total
  • [41] Semi-Supervised SVM With Extended Hidden Features
    Dong, Aimei
    Chung, Fu-Lai
    Deng, Zhaohong
    Wang, Shitong
    IEEE TRANSACTIONS ON CYBERNETICS, 2016, 46 (12) : 2924 - 2937
  • [42] An Efficient Semi-Supervised SVM for Anomaly Detection
    Kim, Junae
    Montague, Paul
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 2843 - 2850
  • [43] Learning semi-supervised SVM with genetic algorithm
    Adankon, Mathias M.
    Cheriet, Mohamed
    2007 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6, 2007, : 1825 - 1830
  • [44] A Semi-supervised SVM framework for Character Recognition
    Arora, Amit
    Namboodiri, Anoop M.
    11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 1105 - 1109
  • [45] Semi-supervised cloud screening with Laplacian SVM
    Gomez-Chova, Luis
    Camps-Valls, Gustavo
    Munoz-Mari, Jordi
    Calpe, Javier
    IGARSS: 2007 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, VOLS 1-12: SENSING AND UNDERSTANDING OUR PLANET, 2007, : 1521 - 1524
  • [46] Semi-supervised Learning for SVM-KNN
    Li, Kunlun
    Luo, Xuerong
    Jin, Ming
    JOURNAL OF COMPUTERS, 2010, 5 (05) : 671 - 678
  • [47] On Semi-supervised Learning with Sparse Data Handling for Educational Data Classification
    Vo Thi Ngoc Chau
    Nguyen Hua Phung
    FUTURE DATA AND SECURITY ENGINEERING, 2017, 10646 : 154 - 167
  • [48] Review of ensemble classification over data streams based on supervised and semi-supervised
    Han, Meng
    Li, Xiaojuan
    Wang, Le
    Zhang, Ni
    Cheng, Haodong
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (03) : 3859 - 3878
  • [49] A novel semi-supervised classification approach for evolving data streams
    Liao, Guobo
    Zhang, Peng
    Yin, Hongpeng
    Deng, Xuanhong
    Li, Yanxia
    Zhou, Han
    Zhao, Dandan
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 215
  • [50] Semi-supervised Learning for Multi-component Data Classification
    Fujino, Akinori
    Ueda, Naonori
    Saito, Kazumi
    20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 2754 - 2759