Feature selection using neighborhood uncertainty measures and Fisher score for gene expression data classification

Cited by: 2

Authors:

Xu, Jiucheng [1,2]

Qu, Kanglin [1,2]

Qu, Kangjian [3]

Hou, Qincheng [1,2]

Meng, Xiangru [1,2]

Affiliations:

[1] Henan Normal Univ, Coll Comp & Informat Engn, Xinxiang 453007, Peoples R China

[2] Engn Lab Intelligence Business & Internet Things, Xinxiang, Henan, Peoples R China

[3] Nanjing Inst Technol, Coll Comp Engn, Nanjing 210000, Peoples R China

Source:

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS | 2023, Vol. 14, Issue 12

Funding:

National Natural Science Foundation of China;

Keywords:

Gene selection; Neighborhood rough set; Uncertainty measures; Fisher score; Attribute reduction; Rough sets; Algorithm; Information

DOI:

10.1007/s13042-023-01878-7

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The classification of gene expression data provides a basis for the study of pathogenesis and treatment. However, such data are high-dimensional and have few samples, which seriously degrades classification results. A gene selection algorithm is therefore needed to select key genes from gene expression data, but existing gene selection algorithms suffer from low classification precision and high time complexity. This paper proposes a gene selection algorithm using neighborhood uncertainty measures and Fisher score. First, to make full use of the information provided by the neighborhood decision system, the neighborhood fusion coverage and neighborhood fusion credibility are defined based on the neighborhood coverage and neighborhood credibility, and they are used to characterize neighborhood uncertainty measures. Second, the neighborhood uncertainty measures are extended by combining the algebraic and information theory views, and a heuristic nonmonotonic gene selection algorithm is designed based on these measures. The algorithm evaluates the importance of genes from both the algebraic and the information theory views, thereby selecting an optimal gene subset and improving classification precision. Third, the Fisher score method is introduced into the proposed algorithm to preliminarily eliminate redundant genes, which reduces the computational time cost and improves the performance of the algorithm. Finally, experimental comparisons with existing gene selection algorithms on ten gene datasets show that the proposed algorithm effectively improves the classification results for gene data.
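
The Fisher score pre-filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula or the cutoff the authors use, so the code below assumes the common textbook form of the Fisher score (between-class scatter divided by within-class scatter, computed per gene) and a hypothetical parameter `k` for the number of genes to keep. The function names `fisher_scores` and `prefilter_genes` are illustrative only and are not taken from the paper.

```python
from collections import defaultdict


def fisher_scores(X, y):
    """Return one Fisher score per column (gene) of X.

    Assumed form: sum_c n_c * (mean_c - mean)^2 / sum_c n_c * var_c,
    where c ranges over the class labels in y.
    """
    n_genes = len(X[0])
    # Group sample rows by class label.
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(n_genes):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for rows in by_class.values():
            values = [row[j] for row in rows]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        # Guard against zero within-class variance (constant genes).
        scores.append(between / max(within, 1e-12))
    return scores


def prefilter_genes(X, y, k):
    """Return the indices of the k highest-scoring genes and all scores.

    k is a hypothetical cutoff chosen for illustration.
    """
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data: gene 0 separates the two classes, gene 1 does not.
X = [[0.0, 1.0], [0.1, 2.0], [1.0, 1.0], [1.1, 2.0]]
y = [0, 0, 1, 1]
kept, scores = prefilter_genes(X, y, k=1)
print(kept)
```

In a pipeline of the kind the abstract describes, only the genes retained by such a filter would be passed on to the more expensive neighborhood-based selection stage, which is where the reduction in computation time would come from.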

Pages: 4011-4028

Page count: 18
