Effect of finite sample size on feature selection and classification: A simulation study

Cited: 49
Authors
Way, Ted W. [1 ]
Sahiner, Berkman [1 ]
Hadjiiski, Lubomir M. [1 ]
Chan, Heang-Ping [1 ]
Affiliations
[1] Univ Michigan, Dept Radiol, Ann Arbor, MI 48109 USA
Keywords
feature extraction; Gaussian distribution; image classification; medical image processing; principal component analysis; support vector machines; COMPUTER-AIDED DIAGNOSIS; SUPPORT VECTOR MACHINES; NEURAL-NETWORK CLASSIFIERS; PATTERN-RECOGNITION; CHEST RADIOGRAPHS; PERFORMANCE; SCHEMES;
DOI: 10.1118/1.3284974
Chinese Library Classification (CLC): R8 [special medicine]; R445 [diagnostic imaging]
Discipline codes: 1002; 100207; 1009
Abstract
Methods: Three feature selection techniques, stepwise feature selection (SFS), sequential floating forward search (SFFS), and principal component analysis (PCA), and two commonly used classifiers, Fisher's linear discriminant analysis (LDA) and the support vector machine (SVM), were investigated. Samples were drawn from multidimensional feature spaces of multivariate Gaussian distributions with equal or unequal covariance matrices and unequal means, and with equal covariance matrices and unequal means estimated from a clinical data set. Classifier performance was quantified by the area under the receiver operating characteristic curve A(z). The mean A(z) values obtained by resubstitution and hold-out methods were evaluated for training sample sizes ranging from 15 to 100 per class. The number of simulated features available for selection was chosen to be 50, 100, and 200. Results: It was found that the relative performance of the different combinations of classifier and feature selection method depends on the feature space distributions, the dimensionality, and the available training sample sizes. The LDA and the SVM with radial kernel performed similarly for most of the conditions evaluated in this study, although the SVM classifier showed a slightly higher hold-out performance than LDA for some conditions and vice versa for others. PCA was comparable to or better than SFS and SFFS for LDA at small sample sizes, but inferior for the SVM with polynomial kernel. For the class distributions simulated from clinical data, PCA did not show advantages over the other two feature selection methods. Under this condition, the SVM with radial kernel performed better than the LDA when few training samples were available, while LDA performed better when a large number of training samples were available.
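The core of the simulation above, drawing training and test samples from two multivariate Gaussian classes, fitting Fisher's LDA, and comparing resubstitution versus hold-out A(z), can be sketched in a few lines. This is a minimal illustration, not the paper's code: the dimensionality, sample size, and mean separation below are arbitrary placeholder values, and A(z) is computed here via the Mann-Whitney U statistic rather than any particular ROC-fitting package.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulation settings (NOT the paper's exact parameters):
# two Gaussian classes with identity covariance and unequal means.
dim, n_train, n_test, sep = 10, 30, 500, 0.5
mu0, mu1 = np.zeros(dim), np.full(dim, sep)

def draw(mu, n):
    """Draw n samples from N(mu, I)."""
    return rng.normal(loc=mu, size=(n, len(mu)))

def fisher_lda(x0, x1):
    """Fisher's linear discriminant: w = S_w^{-1} (m1 - m0)."""
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    sw = np.cov(x0, rowvar=False) + np.cov(x1, rowvar=False)
    return np.linalg.solve(sw, m1 - m0)

def auc(scores0, scores1):
    """Area under the ROC curve, i.e. A(z), via Mann-Whitney U."""
    s0, s1 = np.asarray(scores0), np.asarray(scores1)
    greater = (s1[:, None] > s0[None, :]).sum()
    ties = (s1[:, None] == s0[None, :]).sum()
    return (greater + 0.5 * ties) / (len(s0) * len(s1))

# Train on a small sample; score both the training data (resubstitution,
# optimistically biased) and an independent test set (hold-out).
tr0, tr1 = draw(mu0, n_train), draw(mu1, n_train)
te0, te1 = draw(mu0, n_test), draw(mu1, n_test)
w = fisher_lda(tr0, tr1)

az_resub = auc(tr0 @ w, tr1 @ w)
az_holdout = auc(te0 @ w, te1 @ w)
print(f"resubstitution Az = {az_resub:.3f}, hold-out Az = {az_holdout:.3f}")
```

Repeating this over many random draws while varying `n_train` and `dim` (and inserting a feature selection step between sampling and classifier training) reproduces the kind of finite-sample bias curves the study examines: resubstitution A(z) is typically optimistic and hold-out A(z) pessimistic at small sample sizes.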
Conclusions: None of the investigated feature selection-classifier combinations provided consistently superior performance under the studied conditions for different sample sizes and feature space distributions. In general, the SFFS method was comparable to the SFS method while PCA may have an advantage for Gaussian feature spaces with unequal covariance matrices. The performance of the SVM with radial kernel was better than, or comparable to, that of the SVM with polynomial kernel under most conditions studied.
Pages: 907-920 (14 pages)