Serial filter-wrapper feature selection method with elite guided mutation strategy on cancer gene expression data

Cited by: 0
Authors
Song, Yu-Wei [1]
Wang, Jie-Sheng [1]
Qi, Yu-Liang [1]
Wang, Yu-Cai [1]
Song, Hao-Ming [1]
Shang-Guan, Yi-Peng [1]
Affiliations
[1] School of Electronic and Information Engineering, University of Science and Technology Liaoning, Liaoning, Anshan, China
Keywords
Feature selection; Cancer gene expression; Equilibrium optimizer; Parallel filter methods; Elite guided mutation strategies; Serial hybrid frameworks
DOI
10.1007/s10462-024-11029-1
Abstract
Many researchers now use cancer gene expression data to diagnose cancer subtypes, but such data are often high-dimensional, multi-sample, and multi-class. To address this, a hybrid serial filter-wrapper feature selection (FS) method based on an elite guided mutation strategy is proposed for cancer gene expression data. The method is divided into a preliminary screening stage and a combined modeling stage. In the preliminary screening stage, the thresholds of seven filter methods are determined by leave-one-out cross-validation, and the features selected by these seven filters are combined into two subsets using the logical operations "AND" and "OR". The union of these two subsets is the subset retained from the preliminary screening stage and is passed to the equilibrium optimizer (EO) in the subsequent combined modeling stage. The resulting hybrid framework, which connects the parallel filter method designed in the first stage with an improved EO in the second stage, is named SFEMEO. To demonstrate the effectiveness and generalization ability of SFEMEO, it is compared with nine other basic algorithms on ten UCI data sets: its classification accuracy is higher by 0.56% to 20.19%, and its optimal fitness is also greatly improved. On ten cancer gene expression data sets, compared with nine other intelligent optimization algorithms, SFEMEO improves accuracy by 3.73% to 18.13% over most algorithms and achieves relatively superior optimal fitness. In addition, the Wilcoxon rank-sum test was used to evaluate the results of SFEMEO and the other intelligent optimization algorithms, confirming the effectiveness of the proposed hybrid framework and its superiority in solving the FS problem for high-dimensional cancer gene expression data. © The Author(s) 2025.
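The preliminary screening stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the filter names, scores, and thresholds below are hypothetical toy values, the thresholds are simply given rather than tuned by cross-validation, and the abstract's "And" and "Or" are read literally as set intersection and set union.

```python
from typing import Dict, List, Set


def select_by_threshold(scores: List[float], threshold: float) -> Set[int]:
    """Return the indices of features whose filter score reaches the threshold."""
    return {i for i, s in enumerate(scores) if s >= threshold}


def combine_filters(filter_scores: Dict[str, List[float]],
                    thresholds: Dict[str, float]) -> Dict[str, Set[int]]:
    """Combine per-filter selections with logical AND and OR (assumed reading).

    Requires at least one filter; raises TypeError on an empty dictionary.
    """
    selections = [select_by_threshold(filter_scores[name], thresholds[name])
                  for name in filter_scores]
    and_subset = set.intersection(*selections)  # chosen by every filter
    or_subset = set.union(*selections)          # chosen by at least one filter
    reserved = and_subset | or_subset           # union of the two subsets
    return {"and": and_subset, "or": or_subset, "reserved": reserved}


# Hypothetical toy scores for five features from three stand-in filters
# (the paper uses seven filter methods, which the abstract does not name).
toy_scores = {
    "filter_a": [0.9, 0.1, 0.8, 0.2, 0.7],
    "filter_b": [0.6, 0.7, 0.9, 0.1, 0.2],
    "filter_c": [0.8, 0.2, 0.7, 0.3, 0.1],
}
toy_thresholds = {"filter_a": 0.5, "filter_b": 0.5, "filter_c": 0.5}
result = combine_filters(toy_scores, toy_thresholds)
print(result)
```

Under this literal reading the reserved subset coincides with the "OR" subset, because an intersection is always contained in the corresponding union. The paper may define the two subsets differently, so the sketch only conveys the general idea of merging several filter rankings before the wrapper stage.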
Related papers
50 in total
  • [31] Null space based feature selection method for gene expression data
    Sharma, Alok
    Imoto, Seiya
    Miyano, Satoru
    Sharma, Vandana
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2012, 3 (04) : 269 - 276
  • [32] Mixture feature selection strategy applied in cancer classification from gene expression
    Jin, Xing
    Deng, Yufeng
    Zhong, Yixin
    2005 27TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7, 2005, : 4807 - 4809
  • [33] CGUFS: A clustering-guided unsupervised feature selection algorithm for gene expression data
    Xu, Zhaozhao
    Yang, Fangyuan
    Wang, Hong
    Sun, Junding
    Zhu, Hengde
    Wang, Shuihua
    Zhang, Yudong
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35 (09)
  • [34] Filter vs. Wrapper approach for optimum gene selection of high dimensional gene expression dataset: An analysis with cancer datasets
    Srivastava, Bhavna
    Jangid, Mahesh
    Srivastava, Rajeev
    2014 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND APPLICATIONS (ICHPCA), 2014,
  • [35] A Novel Feature Selection Method for Gene Expression Data Based on Samples Localization
    Sheng, Mingyue
    Du, Wei
    Tian, Yuan
    Liang, Yanchun
    PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON BIOLOGICAL ENGINEERING AND PHARMACY (BEP 2016), 2016, 3 : 63 - 68
  • [36] An Integrated Feature Selection Algorithm for Cancer Classification using Gene Expression Data
    Ahmed, Saeed
    Kabir, Muhammad
    Ali, Zakir
    Arif, Muhammad
    Ali, Farman
    Yu, Dong-Jun
    COMBINATORIAL CHEMISTRY & HIGH THROUGHPUT SCREENING, 2018, 21 (09) : 631 - 645
  • [37] Benchmark of filter methods for feature selection in high-dimensional gene expression survival data
    Bommert, Andrea
    Welchowski, Thomas
    Schmid, Matthias
    Rahnenfuehrer, Joerg
    BRIEFINGS IN BIOINFORMATICS, 2022, 23 (01)
  • [38] A novel feature selection method for classifying cancer subtype with centroid of gene expression
    Cho, J
    Lee, D
    Park, J
    Jung, J
    Lee, I
    7TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL VIII, PROCEEDINGS, 2003, : 7 - 11
  • [40] A Filter Feature Selection Method Based on MFA Score and Redundancy Excluding and It's Application to Tumor Gene Expression Data Analysis
    Li, Jiangeng
    Su, Lei
    Pang, Zenan
    INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES, 2015, 7 (04) : 391 - 396