Asymptotic properties of distance-weighted discrimination and its bias correction for high-dimension, low-sample-size data

Cited by: 0
Authors
Kento Egashira
Kazuyoshi Yata
Makoto Aoshima
Affiliations
[1] University of Tsukuba, Degree Programs in Pure and Applied Sciences, Graduate School of Science and Technology
[2] University of Tsukuba, Institute of Mathematics
Keywords
Bias-corrected DWD; Discriminant analysis; HDLSS; Large p, small n; Weighted DWD
DOI
Not available
Abstract
Distance-weighted discrimination (DWD) was proposed to improve on the support vector machine in high-dimensional settings, but it is known to be quite sensitive to imbalance between the class sample sizes. In this paper, we study asymptotic properties of the DWD in high-dimension, low-sample-size (HDLSS) settings. We show that the DWD carries a large bias caused by heterogeneity of the covariance matrices as well as by sample imbalance. We propose a bias-corrected DWD (BC-DWD) and show that the BC-DWD can enjoy consistency properties for the misclassification rates. We also consider the weighted DWD (WDWD) and propose an optimal choice of its weights. Finally, we examine the performance of the BC-DWD and of the WDWD with the optimal weights in numerical simulations and real data analyses.
Pages: 821 - 840
Page count: 19
Related papers
50 in total
  • [31] Random forest kernel for high-dimension low sample size classification
    Cavalheiro, Lucca Portes
    Bernard, Simon
    Barddal, Jean Paul
    Heutte, Laurent
    [J]. STATISTICS AND COMPUTING, 2024, 34 (01)
  • [32] Comparison of binary discrimination methods for high dimension low sample size data
    Bolivar-Cime, A.
    Marron, J. S.
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2013, 115 : 108 - 121
  • [33] Structural Classification based Correlation and its Application to Principal Component Analysis for High-Dimension Low-Sample Size Data
    Sato-Ilic, Mika
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2012,
  • [34] Clustering high dimension, low sample size data using the maximal data piling distance
    Ahn, Jeongyoun
    Lee, Myung Hee
    Yoon, Young Joo
    [J]. STATISTICA SINICA, 2012, 22 (02) : 443 - 464
  • [35] Distance-based outlier detection for high dimension, low sample size data
    Ahn, Jeongyoun
    Lee, Myung Hee
    Lee, Jung Ae
    [J]. JOURNAL OF APPLIED STATISTICS, 2019, 46 (01) : 13 - 29
  • [36] Improvement of Classification Performance in High-Dimension Low-Sample-Size Modeling by Sparse Functional Connectivity States in Subjects with Attention Deficit-Hyperactivity Disorder and Healthy Controls
    Zolghadr, Zahra
    Batouli, Seyed Amirhossein
    Tehrani-Doost, Mehdi
    Shafaghi, Lida
    Hadjighassem, Mahmoudreza
    Majd, Hamid Alavi
    Mehrabi, Yadollah
    [J]. ARCHIVES OF NEUROSCIENCE, 2023, 10 (02)
  • [37] On asymptotic normality of cross data matrix-based PCA in high dimension low sample size
    Wang, Shao-Hsuan
    Huang, Su-Yun
    Chen, Ting-Li
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2020, 175
  • [38] On Perfect Clustering of High Dimension, Low Sample Size Data
    Sarkar, Soham
    Ghosh, Anil K.
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (09) : 2257 - 2272
  • [39] Geometric representation of high dimension, low sample size data
    Hall, P
    Marron, JS
    Neeman, A
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2005, 67 : 427 - 444
  • [40] Maximum Projection Distance Classifier for High Dimension and Low Sample Size Problems
    Zhang, Zhiwang
    He, Jing
    Cao, Jie
    Li, Shuqing
    Ji, Yimu
    Qian, Gang
    Li, Xingsen
    Zhang, Kai
    Wang, Pingjiang
    [J]. PROCEEDINGS OF 2021 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY WORKSHOPS AND SPECIAL SESSIONS: (WI-IAT WORKSHOP/SPECIAL SESSION 2021), 2021, : 334 - 339