Cascade of genetic algorithm and decision tree for cancer classification on gene expression data

Authors
Yeh, Jinn-Yi [1]
Wu, Tai-Hsi [2]
Affiliations
[1] Natl Chiayi Univ, Dept Management Informat Syst, Chiayi 600, Taiwan
[2] Natl Taipei Univ, Dept Business Adm, Taipei 237, Taiwan
Keywords
cancer classification; gene expression data; genetic algorithms; decision tree; support vector machines; tumor classification; microarray data; feature selection; cluster analysis; visualization; prediction; discovery; systems; cell
DOI
10.1111/j.1468-0394.2010.00522.x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Cancer classification through gene expression data analysis has produced remarkable results and has indicated that gene expression assays could significantly aid the development of efficient cancer diagnosis and classification platforms. However, cancer classification based on DNA array data remains a difficult problem. The main challenge is the overwhelming number of genes relative to the number of training samples, which implies that a large number of irrelevant genes must be dealt with. Another challenge is the noise inherent in the data set, which makes accurate classification more difficult when the sample size is small. We apply genetic algorithms (GAs) with an initial solution provided by t statistics, called t-GA, to select a group of relevant genes from cancer microarray data. A decision-tree-based cancer classifier is then built on these selected genes. The performance of this approach is evaluated by comparing it with other gene selection methods on publicly available gene expression data sets. Experimental results indicate that t-GA has the best performance among the gene selection methods compared. The Z-score figure also shows that some genes are consistently and preferentially chosen by t-GA in each data set.
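The t-GA idea described in the abstract (seeding a genetic algorithm with genes ranked by t statistics) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the Welch form of the t statistic, the bit-mask encoding, one-point crossover, elitist selection, and all function names (`t_scores`, `seed_mask`, `t_ga`) and parameter values are hypothetical. The abstract does not state the fitness function, so it is left to the caller; in the paper's setting it would presumably score a decision tree trained on the selected genes.

```python
import math
import random
from statistics import mean, variance


def t_scores(X, y):
    """Absolute Welch t statistic per gene (assumed form of the t statistic).

    X is a list of samples (rows) by genes (columns); y holds 0/1 class labels.
    """
    a = [row for row, label in zip(X, y) if label == 0]
    b = [row for row, label in zip(X, y) if label == 1]
    scores = []
    for g in range(len(X[0])):
        ga = [row[g] for row in a]
        gb = [row[g] for row in b]
        se = math.sqrt(variance(ga) / len(ga) + variance(gb) / len(gb))
        scores.append(abs(mean(ga) - mean(gb)) / se if se > 0 else 0.0)
    return scores


def seed_mask(scores, k):
    """Bit mask that selects the k genes with the largest t scores."""
    top = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:k]
    chosen = set(top)
    return [1 if g in chosen else 0 for g in range(len(scores))]


def t_ga(X, y, fitness, k=2, pop_size=10, generations=20, p_mut=0.05, seed=0):
    """Toy GA over gene masks whose first individual is the t-statistic seed.

    `fitness` maps a mask to a number (higher is better); it is supplied by
    the caller because the abstract does not specify it.
    """
    rng = random.Random(seed)
    n = len(X[0])
    population = [seed_mask(t_scores(X, y), k)]
    population += [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # elitism: best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Toy data: gene 0 separates the classes, gene 1 does not, gene 2 is noisy.
X = [[1.0, 5.0, 0.2], [1.2, 5.1, 0.9], [0.9, 4.9, 0.1],
     [3.0, 5.0, 0.8], [3.2, 5.2, 0.3], [2.9, 4.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]


def toy_fitness(mask):
    """Stand-in fitness: reward picking gene 0, penalize large gene subsets."""
    return mask[0] - 0.1 * sum(mask)


best_mask = t_ga(X, y, toy_fitness)
```

Because the best half of each generation is carried over unchanged, the returned mask is never worse under `fitness` than the t-statistic seed. Swapping `toy_fitness` for a cross-validated classifier score is the natural next step, at the cost of a much slower evaluation per individual.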
Pages: 201-218
Page count: 18