Effect of finite sample size on feature selection and classification: A simulation study

Cited: 49
Authors
Way, Ted W. [1 ]
Sahiner, Berkman [1 ]
Hadjiiski, Lubomir M. [1 ]
Chan, Heang-Ping [1 ]
Affiliations
[1] Univ Michigan, Dept Radiol, Ann Arbor, MI 48109 USA
Keywords
feature extraction; Gaussian distribution; image classification; medical image processing; principal component analysis; support vector machines; COMPUTER-AIDED DIAGNOSIS; SUPPORT VECTOR MACHINES; NEURAL-NETWORK CLASSIFIERS; PATTERN-RECOGNITION; CHEST RADIOGRAPHS; PERFORMANCE; SCHEMES;
DOI: 10.1118/1.3284974
Chinese Library Classification (CLC): R8 [special medicine]; R445 [diagnostic imaging]
Discipline codes: 1002; 100207; 1009
Abstract
Methods: Three feature selection techniques, stepwise feature selection (SFS), sequential floating forward search (SFFS), and principal component analysis (PCA), and two commonly used classifiers, Fisher's linear discriminant analysis (LDA) and the support vector machine (SVM), were investigated. Samples were drawn from multidimensional feature spaces of multivariate Gaussian distributions with equal or unequal covariance matrices and unequal means, and with equal covariance matrices and unequal means estimated from a clinical data set. Classifier performance was quantified by the area under the receiver operating characteristic curve A(z). The mean A(z) values obtained by resubstitution and hold-out methods were evaluated for training sample sizes ranging from 15 to 100 per class. The number of simulated features available for selection was chosen to be 50, 100, and 200. Results: It was found that the relative performance of the different combinations of classifier and feature selection method depends on the feature space distributions, the dimensionality, and the available training sample sizes. The LDA and the SVM with radial kernel performed similarly for most of the conditions evaluated in this study, although the SVM classifier showed a slightly higher hold-out performance than LDA for some conditions and vice versa for others. PCA was comparable to or better than SFS and SFFS for LDA at small sample sizes, but inferior for the SVM with polynomial kernel. For the class distributions simulated from clinical data, PCA did not show advantages over the other two feature selection methods. Under this condition, the SVM with radial kernel performed better than the LDA when few training samples were available, while LDA performed better when a large number of training samples were available.
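The core of the simulation above, drawing training and test samples from two multivariate Gaussian classes, fitting Fisher's LDA, and comparing resubstitution versus hold-out A(z), can be sketched in a few lines. This is a minimal illustration, not the paper's code: the dimensionality, sample size, and mean separation below are arbitrary placeholder values, and A(z) is computed here via the Mann-Whitney U statistic rather than any particular ROC-fitting package.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulation settings (NOT the paper's exact parameters):
# two Gaussian classes with identity covariance and unequal means.
dim, n_train, n_test, sep = 10, 30, 500, 0.5
mu0, mu1 = np.zeros(dim), np.full(dim, sep)

def draw(mu, n):
    """Draw n samples from N(mu, I)."""
    return rng.normal(loc=mu, size=(n, len(mu)))

def fisher_lda(x0, x1):
    """Fisher's linear discriminant: w = S_w^{-1} (m1 - m0)."""
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    sw = np.cov(x0, rowvar=False) + np.cov(x1, rowvar=False)
    return np.linalg.solve(sw, m1 - m0)

def auc(scores0, scores1):
    """Area under the ROC curve, i.e. A(z), via Mann-Whitney U."""
    s0, s1 = np.asarray(scores0), np.asarray(scores1)
    greater = (s1[:, None] > s0[None, :]).sum()
    ties = (s1[:, None] == s0[None, :]).sum()
    return (greater + 0.5 * ties) / (len(s0) * len(s1))

# Train on a small sample; score both the training data (resubstitution,
# optimistically biased) and an independent test set (hold-out).
tr0, tr1 = draw(mu0, n_train), draw(mu1, n_train)
te0, te1 = draw(mu0, n_test), draw(mu1, n_test)
w = fisher_lda(tr0, tr1)

az_resub = auc(tr0 @ w, tr1 @ w)
az_holdout = auc(te0 @ w, te1 @ w)
print(f"resubstitution Az = {az_resub:.3f}, hold-out Az = {az_holdout:.3f}")
```

Repeating this over many random draws while varying `n_train` and `dim` (and inserting a feature selection step between sampling and classifier training) reproduces the kind of finite-sample bias curves the study examines: resubstitution A(z) is typically optimistic and hold-out A(z) pessimistic at small sample sizes.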
Conclusions: None of the investigated feature selection-classifier combinations provided consistently superior performance under the studied conditions for different sample sizes and feature space distributions. In general, the SFFS method was comparable to the SFS method while PCA may have an advantage for Gaussian feature spaces with unequal covariance matrices. The performance of the SVM with radial kernel was better than, or comparable to, that of the SVM with polynomial kernel under most conditions studied.
Pages: 907-920 (14 pages)