Feature selection using neighborhood uncertainty measures and Fisher score for gene expression data classification

Cited by: 2
Authors
Xu, Jiucheng [1, 2]
Qu, Kanglin [1, 2]
Qu, Kangjian [3]
Hou, Qincheng [1, 2]
Meng, Xiangru [1, 2]
Affiliations
[1] Henan Normal Univ, Coll Comp & Informat Engn, Xinxiang 453007, Peoples R China
[2] Engn Lab Intelligence Business & Internet Things, Xinxiang, Henan, Peoples R China
[3] Nanjing Inst Technol, Coll Comp Engn, Nanjing 210000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Gene selection; Neighborhood rough set; Uncertainty measures; Fisher score; Attribute reduction; Rough sets; Algorithm; Information
DOI
10.1007/s13042-023-01878-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The classification of gene expression data provides a basis for the study of pathogenesis and treatment. However, this type of data is characterized by high dimensionality and small sample sizes, which seriously degrade classification results. A gene selection algorithm is therefore needed to select key genes from gene expression data and improve classification, but existing gene selection algorithms suffer from low classification precision and high time complexity. This paper proposes a gene selection algorithm using neighborhood uncertainty measures and the Fisher score. First, to make full use of the information provided by the neighborhood decision system, the neighborhood fusion coverage and neighborhood fusion credibility are defined based on the neighborhood coverage and neighborhood credibility, and they are used to characterize neighborhood uncertainty measures. Second, the neighborhood uncertainty measures are extended by combining the algebraic and information theory views, and a heuristic nonmonotonic gene selection algorithm is designed based on them. The algorithm evaluates the importance of genes from both the algebraic and information theory views, thereby selecting an optimal gene subset and improving classification precision. Third, the Fisher score method is introduced into the proposed algorithm to preliminarily eliminate redundant genes, reducing the time cost of calculation and improving the performance of the algorithm. Finally, a comparison of the experimental results of the proposed algorithm with those of existing gene selection algorithms on ten gene datasets shows that the proposed algorithm can effectively improve the classification results for gene data.
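The Fisher score prefiltering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common textbook form of the Fisher score (class-size-weighted between-class scatter divided by class-size-weighted within-class variance, computed per gene); the abstract does not state the exact formulation or the number of genes retained, so the function names `fisher_scores` and `prefilter_genes`, the parameter `k`, and the toy data are all hypothetical.

```python
import numpy as np


def fisher_scores(X, y):
    """Per-feature Fisher score (assumed textbook form, not taken from the paper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        n_c = Xc.shape[0]
        # Weighted squared distance of the class mean from the overall mean.
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        # Weighted (population) variance inside the class.
        within += n_c * Xc.var(axis=0)
    # Guard against division by zero for features constant within every class.
    return between / np.maximum(within, 1e-12)


def prefilter_genes(X, y, k):
    """Indices of the k highest-scoring genes (hypothetical helper)."""
    scores = fisher_scores(X, y)
    return np.argsort(scores)[::-1][:k]


# Toy expression matrix: 4 samples x 3 genes, two classes.
# Gene 0 separates the classes well; genes 1 and 2 do not.
X_toy = np.array([
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.9],
    [3.0, 5.5, 0.1],
    [3.1, 4.5, 0.8],
])
y_toy = np.array([0, 0, 1, 1])

scores_toy = fisher_scores(X_toy, y_toy)
top_genes = prefilter_genes(X_toy, y_toy, k=1)
```

In a pipeline of the kind the abstract describes, only the genes kept by such a filter would be passed on to the more expensive neighborhood-based selection stage, which is where the reduction in computation time would come from.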
Pages: 4011-4028
Number of pages: 18
Related papers
50 in total
  • [1] Feature selection using neighborhood uncertainty measures and Fisher score for gene expression data classification
    Jiucheng Xu
    Kanglin Qu
    Kangjian Qu
    Qincheng Hou
    Xiangru Meng
    International Journal of Machine Learning and Cybernetics, 2023, 14 : 4011 - 4028
  • [2] Feature selection using neighborhood entropy-based uncertainty measures for gene expression data classification
    Sun, Lin
    Zhang, Xiaoyu
    Qian, Yuhua
    Xu, Jiucheng
    Zhang, Shiguang
    INFORMATION SCIENCES, 2019, 502 : 18 - 41
  • [3] Feature selection using Fisher score and multilabel neighborhood rough sets for multilabel classification
    Sun, Lin
    Wang, Tianxiang
    Ding, Weiping
    Xu, Jiucheng
    Lin, Yaojin
    INFORMATION SCIENCES, 2021, 578 : 887 - 912
  • [4] Feature Selection in Microarray Gene Expression Data Using Fisher Discriminant Ratio
    Sarbazi-Azad, Saeed
    Abadeh, Mohammad Saniee
    Abadi, Mehdi Irannejad Najaf
    2018 8TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2018, : 225 - 230
  • [5] Feature Selection and Classification for Gene Expression Data using Evolutionary Computation
    Banka, Haider
    Dara, Suresh
    2012 23RD INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS (DEXA), 2012, : 185 - 189
  • [6] Joint neighborhood entropy-based gene selection method with fisher score for tumor classification
    Sun, Lin
    Zhang, Xiao-Yu
    Qian, Yu-Hua
    Xu, Jiu-Cheng
    Zhang, Shi-Guang
    Tian, Yun
    APPLIED INTELLIGENCE, 2019, 49 (04) : 1245 - 1259
  • [7] Joint neighborhood entropy-based gene selection method with fisher score for tumor classification
    Lin Sun
    Xiao-Yu Zhang
    Yu-Hua Qian
    Jiu-Cheng Xu
    Shi-Guang Zhang
    Yun Tian
    Applied Intelligence, 2019, 49 : 1245 - 1259
  • [8] Improving feature selection performance for classification of gene expression data using Harris Hawks optimizer with variable neighborhood learning
    Qu, Chiwen
    Zhang, Lupeng
    Li, Jinlong
    Deng, Fang
    Tang, Yifan
    Zeng, Xiaomin
    Peng, Xiaoning
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (05)
  • [9] Feature Selection and Classification in gene expression cancer data
    Pavithra, D.
    Lakshmanan, B.
    2017 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN DATA SCIENCE (ICCIDS), 2017,
  • [10] Mixed measure-based feature selection using the Fisher score and neighborhood rough sets
    Sun, Lin
    Zhang, Jiuxiao
    Ding, Weiping
    Xu, Jiucheng
    APPLIED INTELLIGENCE, 2022, 52 (15) : 17264 - 17288