Data preprocessing in semi-supervised SVM classification

Cited by: 16
Authors
Astorino, A. [2]
Gorgone, E. [1]
Gaudioso, M. [1]
Pallaschke, D. [3]
Affiliations
[1] Univ Calabria, Dipartimento Elettron Informat & Sistemist, I-87036 Arcavacata Di Rende, CS, Italy
[2] CNR, Ist Calcolo & Reti Ad Alte Prestaz, I-87036 Arcavacata Di Rende, CS, Italy
[3] Univ Karlsruhe, Inst Operat Res, D-76128 Karlsruhe, Germany
Keywords
data classification; semi-supervised learning; SVM; nonsmooth optimization; optimization techniques
DOI
10.1080/02331931003692557
Chinese Library Classification
C93 [Management science]; O22 [Operations research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
The literature on semi-supervised binary classification has shown that useful information can be gathered not only from samples whose class membership is known in advance, but also from unlabelled ones. In semi-supervised support vector machine models, both labelled and unlabelled samples contribute to the definition of an appropriate optimization model for finding a good-quality separating hyperplane. The optimization approaches devised in this context are basically of two types: a mixed integer linear programming problem, and a continuous optimization problem characterized by a nonsmooth and nonconvex objective function. Both problems become hard to solve as the number of unlabelled points increases. In this article, we present a data preprocessing technique that aims to reduce the number of unlabelled points entering the computational model without unduly worsening the classification performance of the overall process. The approach is based on the concept of separating sets and can be implemented with reasonable computational effort. The results of numerical experiments on several benchmark datasets are also reported.
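As an illustration of the continuous formulation mentioned in the abstract, the sketch below evaluates an objective of the kind commonly used in the semi-supervised SVM literature: a hinge loss on labelled points plus a "symmetric hinge" on unlabelled points, which is what makes the function nonsmooth and nonconvex. This form is an assumption taken from the general literature and may differ from the exact model in the article. The names `s3vm_objective`, `c_lab` and `c_unl` are hypothetical.

```python
def s3vm_objective(w, b, labelled, unlabelled, c_lab=1.0, c_unl=1.0):
    """Evaluate a common semi-supervised SVM objective (illustrative only).

    w          : sequence of floats, normal vector of the hyperplane
    b          : float, bias term
    labelled   : list of (x, y) pairs with y in {+1, -1}
    unlabelled : list of points x without a label
    c_lab, c_unl : assumed trade-off weights for the two loss terms
    """
    def score(x):
        # Signed (unnormalized) distance of x from the hyperplane w.x + b = 0
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    # Margin (regularization) term
    reg = 0.5 * sum(wi * wi for wi in w)
    # Standard hinge loss on labelled points: convex, nonsmooth
    lab_loss = sum(max(0.0, 1.0 - y * score(x)) for x, y in labelled)
    # Symmetric hinge on unlabelled points: penalizes points inside the
    # margin on either side; the absolute value makes it nonconvex
    unl_loss = sum(max(0.0, 1.0 - abs(score(x))) for x in unlabelled)
    return reg + c_lab * lab_loss + c_unl * unl_loss


# Toy data: two labelled points and two unlabelled points in the plane
labelled = [((2.0, 0.0), 1), ((-2.0, 0.0), -1)]
unlabelled = [(0.5, 0.0), (3.0, 1.0)]
value = s3vm_objective((1.0, 0.0), 0.0, labelled, unlabelled)
```

Each unlabelled point adds one nonconvex term to this sum, which is consistent with the abstract's motivation for reducing the number of unlabelled points before optimization.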
Pages: 143-151
Number of pages: 9
Related papers
50 results in total
  • [1] Classification of hyperspectral data by continuation semi-supervised SVM
    Chi, Mingmin
    Bruzzone, Lorenzo
    IGARSS: 2007 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, VOLS 1-12: SENSING AND UNDERSTANDING OUR PLANET, 2007, : 3794 - +
  • [2] Fuzzy Preprocessing for Semi-supervised Image Classification in Modern Industry
    Hurtik, Petr
    Molek, Vojtech
    ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2019, PT II, 2019, 11507 : 3 - 13
  • [3] Semi-Supervised Classification on Evolutionary Data
    Jia, Yangqing
    Yan, Shuicheng
    Zhang, Changshui
    21ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-09), PROCEEDINGS, 2009, : 1083 - 1088
  • [4] Performance Evaluation of SVM Based Semi-supervised Classification Algorithm
    Chaudhari, Narendra S.
    Tiwari, Aruna
    Thomas, Jaya
    2008 10TH INTERNATIONAL CONFERENCE ON CONTROL AUTOMATION ROBOTICS & VISION: ICARV 2008, VOLS 1-4, 2008, : 1942 - +
  • [5] Subspace Divided Semi-Supervised SVM Classification for Hyperspectral Images
    She, Hong-wei
    Meng, Qing-jie
    Ren, Yue-mei
    INTELLIGENT SCIENCE AND INTELLIGENT DATA ENGINEERING, ISCIDE 2011, 2012, 7202 : 265 - 272
  • [6] Semi-supervised SVM for individual tree crown species classification
    Dalponte, Michele
    Ene, Liviu Theodor
    Marconcini, Mattia
    Gobakken, Terje
    Naesset, Erik
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2015, 110 : 77 - 87
  • [7] A Semi-Supervised Learning Algorithm for Data Classification
    Kuo, Cheng-Chien
    Shieh, Horng-Lin
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2015, 29 (05)
  • [8] A SEMI-SUPERVISED LEARNING ALGORITHM BASED ON SVM FOR IMBALANCED DATA
    Du, Limin
    Xu, Yang
    He, Xingxing
    UNCERTAINTY MODELLING IN KNOWLEDGE ENGINEERING AND DECISION MAKING, 2016, 10 : 194 - 200
  • [9] Semi-supervised kernel based progressive SVM for coal mine gas safety data classification
    Zhao, Z. (zhikaizh@gmail.com), 1771, Binary Information Press, Flat F 8th Floor, Block 3, Tanner Garden, 18 Tanner Road, Hong Kong (09):
  • [10] Classifying Limited Resource Data Using Semi-supervised SVM
    Veeranjaneyulu N.
    Bodapati J.D.
    Buradagunta S.
    Ingenierie des Systemes d'Information, 2020, 25 (03): : 391 - 395