Support vector machine and its bias correction in high-dimension, low-sample-size settings

Cited by: 14
Authors
Nakayama, Yugo [1]
Yata, Kazuyoshi [2]
Aoshima, Makoto [2]
Affiliations
[1] Univ Tsukuba, Grad Sch Pure & Appl Sci, Ibaraki, Japan
[2] Univ Tsukuba, Inst Math, Ibaraki 3058571, Japan
Funding
Japan Society for the Promotion of Science
Keywords
Distance-based classifier; HDLSS; Imbalanced data; Large p small n; Multiclass classification; CLASSIFICATION; CLASSIFIERS; MULTICLASS;
DOI
10.1016/j.jspi.2017.05.005
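Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract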
In this paper, we consider asymptotic properties of the support vector machine (SVM) in high-dimension, low-sample-size (HDLSS) settings. We show that the hard-margin linear SVM holds a consistency property in which misclassification rates tend to zero as the dimension goes to infinity under certain severe conditions. We show that the SVM is very biased in HDLSS settings and its performance is affected by the bias directly. In order to overcome such difficulties, we propose a bias-corrected SVM (BC-SVM). We show that the BC-SVM gives preferable performances in HDLSS settings. We also discuss the SVMs in multiclass HDLSS settings. Finally, we check the performance of the classifiers in actual data analyses. © 2017 Elsevier B.V. All rights reserved.
Pages: 88-100
Page count: 13