Hybrid genetic algorithm-neural network: Feature extraction for unpreprocessed microarray data

Cited by: 49
Authors
Tong, Dong Ling [1]
Schierz, Amanda C. [2]
Affiliations
[1] Nottingham Trent Univ, Sch Sci & Technol, John van Geest Canc Res Ctr, Nottingham NG11 8NS, England
[2] Bournemouth Univ, Sch Design Engn & Comp, Poole BH12 5BB, Dorset, England
Keywords
Genetic algorithm; Artificial neural network; Feature extraction; Unpreprocessed microarray data; Cancer marker genes; MULTICLASS CANCER CLASSIFICATION; EXPRESSION DATA; FEATURE-SELECTION; MOLECULAR CLASSIFICATION; BIOMARKER DISCOVERY; DNA; PREDICTION; DIAGNOSIS; ENSEMBLE; MACHINE;
DOI
10.1016/j.artmed.2011.06.008
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Objective: Suitable techniques for microarray analysis have been widely researched, particularly for the study of marker genes expressed in a specific type of cancer. Most of the machine learning methods that have been applied to significant gene selection focus on the classification ability rather than the selection ability of the method. These methods also require the microarray data to be preprocessed before analysis takes place. The objective of this study is to develop a hybrid genetic algorithm-neural network (GANN) model that emphasises feature selection and can operate on unpreprocessed microarray data.
Method: The GANN is a hybrid model where the fitness value of the genetic algorithm (GA) is based upon the number of samples correctly labelled by a standard feedforward artificial neural network (ANN). The model is evaluated using two benchmark microarray datasets with different array platforms and differing numbers of classes: a 2-class oligonucleotide microarray dataset for acute leukaemia and a 4-class complementary DNA (cDNA) microarray dataset for small round blue cell tumours (SRBCTs). The underlying concept of the GANN algorithm is to select highly informative genes by co-evolving both the GA fitness function and the ANN weights at the same time.
Results: The novel GANN selected approximately 50% of the same genes as the original studies. This may indicate that these common genes are more biologically significant than other genes in the datasets. The remaining 50% of the significant genes identified were used to build predictive models, and for both datasets the models based on the set of genes extracted by the GANN method produced more accurate results. The results also suggest that the GANN method can not only detect genes that are exclusively associated with a single cancer type but can also explore the genes that are differentially expressed in multiple cancer types.
Conclusions: The results show that the GANN model has successfully extracted statistically significant genes from the unpreprocessed microarray data as well as extracting known biologically significant genes. We also show that assessing the biological significance of genes based on classification accuracy may be misleading. Although the GANN's set of extra genes proves to be more statistically significant than those selected by other methods, a biological assessment of these genes is highly recommended to confirm their functionality. © 2011 Elsevier B.V. All rights reserved.
Pages: 47-56
Page count: 10
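The abstract describes the GANN only at a high level: a GA whose fitness is the number of samples correctly labelled by a feedforward ANN, with gene selection and ANN weights evolved together. The sketch below is one possible reading of that idea, not the authors' implementation. The chromosome encoding (a gene subset plus a flat weight vector), the tournament selection, the crossover and mutation operators, the network size, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for an expression matrix (assumption, not real microarray data):
# 40 samples x 200 genes, 2 classes, with genes 0-4 made informative on purpose.
n_samples, n_genes, n_informative = 40, 200, 5
y = np.repeat([0, 1], n_samples // 2)
X = rng.normal(size=(n_samples, n_genes))
X[:, :n_informative] += 2.0 * y[:, None]

# Hypothetical settings: genes per chromosome, hidden units, population, generations.
SUBSET, HIDDEN, POP, GENERATIONS = 5, 3, 60, 40
N_WEIGHTS = SUBSET * HIDDEN + HIDDEN + HIDDEN + 1  # w1, b1, w2, b2


def random_individual():
    """A chromosome: a set of distinct gene indices plus the ANN weights."""
    genes = rng.choice(n_genes, size=SUBSET, replace=False)
    return genes, rng.normal(size=N_WEIGHTS)


def fitness(individual):
    """Number of samples the feedforward ANN labels correctly."""
    genes, w = individual
    split = SUBSET * HIDDEN
    w1 = w[:split].reshape(SUBSET, HIDDEN)
    b1 = w[split:split + HIDDEN]
    w2 = w[split + HIDDEN:split + 2 * HIDDEN]
    b2 = w[-1]
    hidden = np.tanh(X[:, genes] @ w1 + b1)
    pred = (hidden @ w2 + b2 > 0).astype(int)
    return int((pred == y).sum())


def crossover(a, b):
    """Child genes come from the parents' union; weights are mixed uniformly."""
    pool = np.union1d(a[0], b[0])
    genes = rng.choice(pool, size=SUBSET, replace=False)
    mask = rng.random(N_WEIGHTS) < 0.5
    return genes, np.where(mask, a[1], b[1])


def mutate(individual, rate=0.2):
    """Occasionally swap in an unused gene; always jitter the weights slightly."""
    genes, w = individual[0].copy(), individual[1].copy()
    if rng.random() < rate:
        unused = np.setdiff1d(np.arange(n_genes), genes)
        genes[rng.integers(SUBSET)] = rng.choice(unused)
    return genes, w + rng.normal(scale=0.1, size=N_WEIGHTS)


def tournament(pop, scores, k=3):
    """Return the fittest of k randomly drawn individuals."""
    idx = rng.choice(len(pop), size=k, replace=False)
    return pop[max(idx, key=lambda i: scores[i])]


def run_gann():
    pop = [random_individual() for _ in range(POP)]
    history = []
    for _ in range(GENERATIONS):
        scores = [fitness(ind) for ind in pop]
        history.append(max(scores))
        elite = pop[int(np.argmax(scores))]  # elitism: the best survives unchanged
        children = [
            mutate(crossover(tournament(pop, scores), tournament(pop, scores)))
            for _ in range(POP - 1)
        ]
        pop = [elite] + children
    scores = [fitness(ind) for ind in pop]
    return pop[int(np.argmax(scores))], max(scores), history


best, best_score, history = run_gann()
print("selected genes:", sorted(best[0].tolist()))
print("correctly labelled samples:", best_score, "of", n_samples)
```

Because the fitness is computed on the same samples used for evolution, a sketch like this says nothing about generalisation; a held-out set or cross-validation would be needed before reading anything into the selected genes.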