Data preprocessing in semi-supervised SVM classification

Cited by: 16
Authors
Astorino, A. [2]
Gorgone, E. [1]
Gaudioso, M. [1]
Pallaschke, D. [3]
Affiliations
[1] Univ Calabria, Dipartimento Elettron Informat & Sistemist, I-87036 Arcavacata Di Rende, CS, Italy
[2] CNR, Ist Calcolo & Reti Ad Alte Prestaz, I-87036 Arcavacata Di Rende, CS, Italy
[3] Univ Karlsruhe, Inst Operat Res, D-76128 Karlsruhe, Germany
Keywords
data classification; semi-supervised learning; SVM; nonsmooth optimization; optimization techniques
DOI
10.1080/02331931003692557
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
The literature on semi-supervised binary classification has shown that useful information can be gathered not only from samples whose class membership is known in advance, but also from unlabelled ones. In semi-supervised support vector machine (SVM) models, both labelled and unlabelled samples contribute to the definition of an optimization model for finding a good-quality separating hyperplane. The optimization approaches devised in this context are essentially of two types: a mixed integer linear programming problem, and a continuous optimization problem whose objective function is nonsmooth and nonconvex. Both problems become hard to solve as the number of unlabelled points increases. In this article, we present a data preprocessing technique that aims to reduce the number of unlabelled points entering the computational model without degrading the classification performance of the overall process too much. The approach is based on the concept of separating sets and can be implemented with reasonable computational effort. The results of numerical experiments on several benchmark datasets are also reported.
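To make the "nonsmooth and nonconvex" continuous model mentioned in the abstract concrete, the following is a minimal sketch of the objective commonly used for semi-supervised (transductive) SVMs in the literature. It is an illustration under stated assumptions, not necessarily the exact model of this article: the function name `s3vm_objective` and the weights `C` and `C_star` are hypothetical, and a linear classifier is assumed.

```python
import numpy as np


def s3vm_objective(w, b, X_lab, y_lab, X_unl, C=1.0, C_star=1.0):
    """Evaluate a commonly used semi-supervised SVM objective (illustrative).

    Assumption: this is the textbook form
        0.5 * ||w||^2
        + C      * sum_i max(0, 1 - y_i * (w.x_i + b))   (labelled points)
        + C_star * sum_j max(0, 1 - |w.x_j + b|)         (unlabelled points)
    and may differ from the model used in the article.
    """
    w = np.asarray(w, dtype=float)
    X_lab = np.asarray(X_lab, dtype=float)
    y_lab = np.asarray(y_lab, dtype=float)
    X_unl = np.asarray(X_unl, dtype=float)

    # Hinge loss on labelled samples: convex but nonsmooth at margin 1.
    hinge_lab = np.maximum(0.0, 1.0 - y_lab * (X_lab @ w + b)).sum()

    # "Hat" loss on unlabelled samples: penalises points inside the margin
    # regardless of side; the absolute value makes this term nonconvex.
    hat_unl = np.maximum(0.0, 1.0 - np.abs(X_unl @ w + b)).sum()

    return 0.5 * float(w @ w) + C * hinge_lab + C_star * hat_unl
```

In this form each unlabelled point contributes one nonconvex term to the sum, so discarding unlabelled points before optimization directly shrinks the hard part of the problem, which is the motivation for preprocessing stated in the abstract.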
Pages: 143-151
Page count: 9
Related papers
50 in total
  • [31] Watersheds for Semi-Supervised Classification
    Challa, Aditya
    Danda, Sravan
    Sagar, B. S. Daya
    Najman, Laurent
    IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (05) : 720 - 724
  • [32] Semi-supervised classification trees
    Jurica Levatić
    Michelangelo Ceci
    Dragi Kocev
    Sašo Džeroski
    Journal of Intelligent Information Systems, 2017, 49 : 461 - 486
  • [33] A Semi-supervised Anomaly Detection Method for Wind Farm Power Data Preprocessing
    Zhou, Yifan
    Hu, Wei
    Min, Yong
    Zheng, Le
    Liu, Baisi
    Yu, Rui
    Dong, Yu
    2017 IEEE POWER & ENERGY SOCIETY GENERAL MEETING, 2017,
  • [34] Simple nonparallel laplacian SVM for semi-supervised learning on binary classification problem
    Zhao, Xi
    Bai, Qiyu
    Bai, Shiguo
    INTELLIGENT DATA ANALYSIS, 2016, 20 (06) : 1307 - 1328
  • [35] Optimized time-frequency features and semi-supervised SVM to heartbeat classification
    Lekhal, Redouane
    Zidelmal, Zahia
    Ould-Abdesslam, Djaffar
    SIGNAL IMAGE AND VIDEO PROCESSING, 2020, 14 (07) : 1471 - 1478
  • [36] A New SVM Method for Short Text Classification Based on Semi-Supervised Learning
    Yin, Chunyong
    Xiang, Jun
    Zhang, Hui
    Wang, Jin
    Yin, Zhichao
    Kim, Jeong-Uk
    2015 4TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION TECHNOLOGY AND SENSOR APPLICATION (AITS), 2015, : 100 - 103
  • [37] SEMI-SUPERVISED RADIO TRANSMITTER CLASSIFICATION BASED ON ELASTIC SPARSITY REGULARIZED SVM
    Hu Guyu
    Gong Yong
    Chen Yande
    Pan Zhisong
    Deng Zhantao
    Journal of Electronics(China), 2012, 29 (06) : 501 - 508
  • [38] Manifold based twin parametric-margin SVM for semi-supervised classification
    Chen, W. (wjcper2008@126.com), 1600, Binary Information Press, P.O. Box 162, Bethel, CT 06801-0162, United States (09):
  • [39] Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data
    Sakai, Tomoya
    du Plessis, Marthinus Christoffel
    Niu, Gang
    Sugiyama, Masashi
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [40] Sparsity regularization path for semi-supervised SVM
    Gasso, G.
    Zapien, K.
    Canu, S.
    ICMLA 2007: SIXTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2007, : 25 - +