FEATURES SELECTION USING PARAMETRIC AND NON-PARAMETRIC METHODS: TAG SNPs SELECTION USING GA-SVM AND GA-KNN

Cited by: 1
Authors
Elatraby, Amr I. A. [1 ]
Wahba, Rashad R. T. [1 ]
Affiliation
[1] Ain Shams Univ, Fac Commerce, Stat Math & Insurance Dept, Cairo, Egypt
Keywords
Single Nucleotide Polymorphisms (SNPs); tag SNPs; Support Vector Machine (SVM); K-Nearest Neighbor (KNN); Genetic Algorithm (GA);
DOI
10.17654/ADASMay2015_105_123
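Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes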
020208 ; 070103 ; 0714 ;
Abstract
The study of genetic variations in the human genome, especially Single Nucleotide Polymorphisms (SNPs), can lead to the discovery of new methods to prevent, diagnose and treat diseases. Examining every SNP in the human genome is too expensive, so a small subset of informative SNPs, called tag SNPs, must be selected. In this study, two methods for the selection of tag SNPs are presented. The first method, GA-SVM, integrates the Support Vector Machine (SVM), as a parametric technique, with the Genetic Algorithm (GA). The second method, GA-KNN, integrates the K-Nearest Neighbor (KNN), as a non-parametric technique, with GA. The two methods are tested on a group of genes that are known to be related to the natural clearance of Hepatitis C Virus (HCV). The SNP data for these genes were extracted from the HapMap site (http://hapmap.org). The prediction accuracy of each method was evaluated using 10-Fold Cross Validation (10-FCV). Our results show that, although GA-SVM achieves higher prediction accuracy than GA-KNN when a very small number of tag SNPs is selected, GA-KNN outperforms GA-SVM in all other cases. In addition, our results indicate that GA-KNN requires more computing time than GA-SVM.
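To make the GA-KNN idea in the abstract concrete, the following is a minimal Python sketch of a genetic-algorithm wrapper whose fitness is the 10-fold cross-validated accuracy of a KNN predictor. It is not the authors' implementation: the abstract does not specify the fitness function, the GA operators, or the distance measure, so the choices here are assumptions made for illustration (Hamming distance on the tag SNPs, predicting every non-tag SNP from the tags, truncation selection, and the hypothetical names `cv_accuracy`, `ga_select` and `toy_haplotypes`). The toy data are synthetic and unrelated to the HapMap data used in the paper.

```python
import random

def cv_accuracy(H, tags, k=3, folds=10):
    """Mean k-fold CV accuracy of KNN predicting every non-tag SNP from `tags`.

    Assumes H is a list of 0/1 haplotype rows and that at least one SNP is not a tag.
    """
    n = len(H)
    others = [j for j in range(len(H[0])) if j not in tags]
    hits = total = 0
    for f in range(folds):
        test = range(f, n, folds)                      # interleaved folds, fixed order
        train = [i for i in range(n) if i % folds != f]
        for t in test:
            # Hamming distance on the tag SNPs only; the row index breaks ties
            near = sorted(
                train,
                key=lambda i: (sum(H[i][c] != H[t][c] for c in tags), i),
            )[:k]
            for j in others:
                vote = sum(H[i][j] for i in near) * 2 > len(near)  # majority allele
                hits += int(vote) == H[t][j]
                total += 1
    return hits / total

def ga_select(H, m, pop_size=12, gens=20, p_mut=0.3, seed=0):
    """Search for m tag SNPs with a simple elitist GA (assumed operators)."""
    rng = random.Random(seed)
    n_snps = len(H[0])
    score = lambda ind: cv_accuracy(H, ind)
    pop = [tuple(sorted(rng.sample(range(n_snps), m))) for _ in range(pop_size)]
    for _ in range(gens):
        elite = sorted(pop, key=score, reverse=True)[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            pool = sorted(set(a) | set(b))
            child = rng.sample(pool, m)                # crossover: mix both parents
            if rng.random() < p_mut:                   # mutation: swap in an unused SNP
                unused = [j for j in range(n_snps) if j not in child]
                child[rng.randrange(m)] = rng.choice(unused)
            children.append(tuple(sorted(child)))
        pop = elite + children
    best = max(pop, key=score)
    return best, score(best)

def toy_haplotypes(n=40, seed=1):
    """Synthetic data: two perfectly linked blocks of four SNPs each."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a, b = rng.randint(0, 1), rng.randint(0, 1)
        rows.append([a, a, a, a, b, b, b, b])
    return rows

H = toy_haplotypes()
best, acc = ga_select(H, m=2)
print(best, round(acc, 3))
```

A GA-SVM variant along the same lines would keep the GA loop unchanged and swap the KNN vote inside `cv_accuracy` for an SVM classifier, which is where the two methods compared in the abstract differ.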
Pages: 105-123
Page count: 19