FEATURES SELECTION USING PARAMETRIC AND NON-PARAMETRIC METHODS: TAG SNPs SELECTION USING GA-SVM AND GA-KNN

Cited by: 1
Authors
Elatraby, Amr I. A. [1 ]
Wahba, Rashad R. T. [1 ]
Affiliations
[1] Ain Shams Univ, Fac Commerce, Stat Math & Insurance Dept, Cairo, Egypt
Keywords
Single Nucleotide Polymorphisms (SNPs); tag SNPs; Support Vector Machine (SVM); K-Nearest Neighbor (KNN); Genetic Algorithm (GA);
DOI
10.17654/ADASMay2015_105_123
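Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code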
020208 ; 070103 ; 0714 ;
Abstract
The study of genetic variation in the human genome, especially Single Nucleotide Polymorphisms (SNPs), can lead to new methods for preventing, diagnosing and treating diseases. Examining every SNP in the human genome is prohibitively expensive, so a small subset of informative SNPs, called tag SNPs, must be selected. This study presents two methods for selecting tag SNPs. The first, GA-SVM, integrates the Support Vector Machine (SVM), a parametric technique, with the Genetic Algorithm (GA). The second, GA-KNN, integrates the K-Nearest Neighbor (KNN) classifier, a non-parametric technique, with GA. The two methods are tested on a group of genes known to be related to the natural clearance of Hepatitis C Virus (HCV). The SNP data for these genes were extracted from the HapMap site (http://hapmap.org). The prediction accuracy of each method was evaluated using 10-Fold Cross Validation (10-FCV). The results show that GA-SVM achieves higher prediction accuracy than GA-KNN when a very small number of tag SNPs is selected, whereas GA-KNN achieves higher prediction accuracy in all other cases. The results also indicate that GA-KNN requires more computing time than GA-SVM.
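The abstract gives no implementation details, so the following is only a minimal sketch of how a GA-KNN style tag SNP search could be wired together, not the authors' method. Everything in it is an assumption made for illustration: the toy genotype matrix, the function names (`knn_predict`, `fitness`, `ga_select`), the Hamming distance, the fitness definition (pooled 10-fold cross-validated accuracy of predicting each non-tag SNP from the tag SNPs), and the GA operators (elitism, tournament selection, union crossover, single-gene mutation). It uses only the Python standard library.

```python
import random

def knn_predict(train_rows, x, tags, target, k=3):
    """Majority vote among the k training rows closest to x on the tag columns."""
    # Hamming distance over the tag SNPs only; sorted() is stable, so ties keep data order.
    nearest = sorted(train_rows, key=lambda r: sum(r[c] != x[c] for c in tags))[:k]
    votes = [r[target] for r in nearest]
    return max(sorted(set(votes)), key=votes.count)

def fitness(rows, tags, n_folds=10, k=3):
    """Pooled n-fold CV accuracy of predicting every non-tag SNP from the tag SNPs."""
    targets = [c for c in range(len(rows[0])) if c not in tags]
    correct = 0
    for fold in range(n_folds):
        # Interleaved folds: row i belongs to fold i % n_folds.
        test = [r for i, r in enumerate(rows) if i % n_folds == fold]
        train = [r for i, r in enumerate(rows) if i % n_folds != fold]
        for x in test:
            correct += sum(knn_predict(train, x, tags, t, k) == x[t] for t in targets)
    return correct / (len(rows) * len(targets))

def ga_select(rows, n_tags, pop_size=12, generations=15, p_mut=0.2, seed=0):
    """Search for a subset of n_tags SNP columns with high fitness (assumed GA design)."""
    rng = random.Random(seed)
    n_snps = len(rows[0])
    cache = {}

    def score(ind):
        # Memoize: the same subset is evaluated many times across generations.
        if ind not in cache:
            cache[ind] = fitness(rows, ind)
        return cache[ind]

    pop = [tuple(sorted(rng.sample(range(n_snps), n_tags))) for _ in range(pop_size)]
    for _ in range(generations):
        next_pop = sorted(pop, key=score, reverse=True)[:2]  # elitism: keep the two best
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            # Crossover: draw the child from the union of both parents' SNPs.
            child = rng.sample(sorted(set(p1) | set(p2)), n_tags)
            # Mutation: occasionally swap one tag for a currently unused SNP.
            if rng.random() < p_mut:
                unused = [c for c in range(n_snps) if c not in child]
                child[rng.randrange(n_tags)] = rng.choice(unused)
            next_pop.append(tuple(sorted(child)))
        pop = next_pop
    best = max(pop, key=score)
    return best, score(best)

# Toy data (hypothetical): 40 individuals, 6 binary SNPs in two perfectly linked
# blocks (columns 0-2 are copies of one SNP, columns 3-5 copies of another).
rows = [[i % 2] * 3 + [(i // 2) % 2] * 3 for i in range(40)]
best_tags, best_acc = ga_select(rows, n_tags=2)
print(best_tags, best_acc)
```

On this toy matrix a tag pair taken from a single block cannot predict the other block, while one tag from each block predicts every remaining SNP, which is the behaviour the fitness function is meant to reward. A GA-SVM variant in the same spirit would keep the GA loop and replace `knn_predict` with a trained SVM classifier.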
Pages: 105-123
Page count: 19