Fuzzy Support Vector Machine for Microarray Imbalanced Data Classification

Cited by: 2
Authors
Ladayya, Faroh [1 ]
Purnami, Santi Wulan [1 ]
Irhamah [1 ]
Affiliations
[1] Inst Teknol Sepuluh Nopember, Dept Stat, Kampus ITS Sukolilo, Surabaya 60111, Indonesia
Keywords
CANCER; PREDICTION;
DOI
10.1063/1.5012168
Chinese Library Classification (CLC) number
O29 [Applied Mathematics];
Discipline classification code
070104 ;
Abstract
DNA microarray data contain gene expression measurements with small sample sizes and a large number of features. Class imbalance is also a common problem in microarray data: it occurs when a dataset is dominated by one class that has significantly more instances than the minority classes. A classification method that can handle both high-dimensional and imbalanced data is therefore needed. Support Vector Machine (SVM) is a classification method capable of handling large or small samples, nonlinearity, high dimensionality, overfitting, and local minimum issues. SVM has been widely applied to DNA microarray data classification, and it has been shown to provide the best performance among machine learning methods. However, imbalanced data remain a problem because SVM treats all samples as equally important, so its results are biased against the minority class. To overcome the imbalance, Fuzzy SVM (FSVM) is proposed. This method applies a fuzzy membership to each input point and reformulates the SVM so that different input points make different contributions to the classifier. Samples from the minority classes are given large fuzzy memberships, so FSVM pays more attention to them. Because DNA microarray data are high-dimensional, with a very large number of features, feature selection is first performed using the Fast Correlation-Based Filter (FCBF). In this study, the data are analyzed with SVM and FSVM, each with and without FCBF feature selection, and the classification performance of the resulting models is compared. Based on the overall results, FSVM on the selected features has the best classification performance compared to SVM.
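The idea of weighting each sample's contribution by a fuzzy membership can be illustrated with a minimal sketch. The abstract does not give the formula or the membership function, so everything below is an assumption: the objective is the commonly cited linear FSVM primal form, 0.5 * ||w||^2 + C * sum_i s_i * max(0, 1 - y_i (w . x_i + b)), and the helper names `class_memberships` and `fsvm_objective` and the class-size-ratio membership rule are hypothetical choices made only for illustration.

```python
from collections import Counter


def class_memberships(labels):
    """Assign each sample a fuzzy membership in (0, 1].

    Assumption (not from the abstract): membership is the size of the
    smallest class divided by the size of the sample's own class, so
    minority samples get 1.0 and majority samples get a smaller value.
    """
    counts = Counter(labels)
    smallest = min(counts.values())
    return [smallest / counts[label] for label in labels]


def fsvm_objective(w, b, X, y, s, C=1.0):
    """Evaluate the linear FSVM primal objective for given (w, b).

    Each hinge loss is multiplied by the sample's membership s_i, so
    errors on high-membership samples cost more than errors on
    low-membership samples. With all s_i = 1 this is the plain
    soft-margin SVM objective.
    """
    margin_term = 0.5 * sum(wj * wj for wj in w)
    slack_term = 0.0
    for xi, yi, si in zip(X, y, s):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        slack_term += si * max(0.0, 1.0 - yi * score)
    return margin_term + C * slack_term


# Toy one-feature data: four majority samples (-1), one minority sample (+1).
X = [[0.0], [0.5], [1.0], [1.5], [3.0]]
y = [-1, -1, -1, -1, 1]
s = class_memberships(y)

# Compare the weighted objective with the unweighted (plain SVM) one
# for the same candidate hyperplane.
fsvm_value = fsvm_objective([1.0], -2.0, X, y, s)
svm_value = fsvm_objective([1.0], -2.0, X, y, [1.0] * len(y))
print(s, fsvm_value, svm_value)
```

In this toy case the only margin violation comes from a majority sample, so the membership-weighted objective penalizes it less than the plain SVM objective does. That is the mechanism by which the classifier is steered toward the minority class.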
Pages: 10
Related papers
50 in total
  • [41] Imbalanced data classification based on scaling kernel-based support vector machine
    Zhang, Yong
    Fu, Panpan
    Liu, Wenzhe
    Chen, Guolong
    NEURAL COMPUTING & APPLICATIONS, 2014, 25 (3-4): : 927 - 935
  • [42] Weighted support vector machine for extremely imbalanced data
    Mun, Jongmin
    Bang, Sungwan
    Kim, Jaeoh
    Computational Statistics and Data Analysis, 2025, 203
  • [43] Huberized multiclass support vector machine for microarray classification
    Li J.-T.
    Jia Y.-M.
    Zidonghua Xuebao/ Acta Automatica Sinica, 2010, 36 (03): : 399 - 405
  • [44] Data Classification with Support Vector Machine and Generalized Support Vector Machine
    Qi, Xiaomin
    Silvestrov, Sergei
    Nazir, Talat
    ICNPAA 2016 WORLD CONGRESS: 11TH INTERNATIONAL CONFERENCE ON MATHEMATICAL PROBLEMS IN ENGINEERING, AEROSPACE AND SCIENCES, 2017, 1798
  • [45] Performance of Support Vector Machine in Imbalanced Data Set
    Novakovic, Jasmina
    Markovic, Suzana
    2020 19TH INTERNATIONAL SYMPOSIUM INFOTEH-JAHORINA (INFOTEH), 2020,
  • [46] An improved Support Vector Machine for the classification of imbalanced biological datasets
    Wang, Haiying
    Zheng, Huiru
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, PROCEEDINGS: WITH ASPECTS OF THEORETICAL AND METHODOLOGICAL ISSUES, 2008, 5226 : 63 - +
  • [47] Iterative fuzzy support vector machine classification
    Shilton, Alistair
    Lai, Daniel T. H.
    2007 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-4, 2007, : 1396 - 1401
  • [48] Affinity and class probability-based fuzzy support vector machine for imbalanced data sets
    Tao, Xinmin
    Li, Qing
    Ren, Chao
    Guo, Wenjie
    He, Qing
    Liu, Rui
    Zou, Junrong
    NEURAL NETWORKS, 2020, 122 (122) : 289 - 307
  • [49] Two-stage gene selection for support vector machine classification of microarray data
    Xia, Xiao-Lei
    Li, Kang
    Irwin, George W.
    INTERNATIONAL JOURNAL OF MODELLING IDENTIFICATION AND CONTROL, 2009, 8 (02) : 164 - 171
  • [50] ESVM: Evolutionary support vector machine for automatic feature selection and classification of microarray data
    Huang, Hui-Ling
    Chang, Fang-Lin
    BIOSYSTEMS, 2007, 90 (02) : 516 - 528