A filter feature selection for high-dimensional data

Cited by: 6

Authors:

Janane, Fatima Zahra [1]

Ouaderhman, Tayeb [1]

Chamlal, Hasna [1]

Affiliation:

[1] Hassan II Univ, Fac Sci Ain Chock, Dept Math & Informat, Fundamental & Appl Math Lab, Km 8 Route El Jadida,BP 5366 Maarif, Casablanca 20100, Morocco

Source:

JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY | 2023 / Vol. 17

Keywords:

Relief; Technique for Order Preference by Similarity to Ideal Solution; feature selection; high-dimensional data; feature ranking; CLASSIFICATION; ALGORITHMS; RELIEFF;

DOI:

10.1177/17483026231184171

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

In a classification problem, it is important to identify the informative features before building a prediction model, rather than using tens or thousands of features, which may penalize some learning methods and increase the risk of over-fitting. Feature selection addresses these problems. In this article, we propose a new filter method for feature selection that combines the Relief filter algorithm with the multi-criteria decision-making method TOPSIS (Technique for Order Preference by Similarity to Ideal Solution), modeling the feature selection task as a multi-criteria decision problem. Using the Relief methodology, a decision matrix is computed and passed to TOPSIS, which ranks the features from most to least informative. To evaluate the performance of the proposed approach, a simulation study comprising a set of experiments and case studies was conducted on three synthetic dataset scenarios. The results confirm the effectiveness of the proposed filter in detecting the most informative features.
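The abstract does not say how the decision matrix is built from Relief, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a basic binary-class Relief with Manhattan distance, a decision matrix whose columns are Relief scores from independent runs, and equal criterion weights in TOPSIS. The names `relief_scores` and `topsis_rank` are hypothetical.

```python
import numpy as np


def relief_scores(X, y, n_iter, rng):
    """Basic binary-class Relief: reward features that differ on the nearest
    miss and penalize features that differ on the nearest hit."""
    n, p = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                  # guard against constant features
    Xs = (X - X.min(axis=0)) / span        # min-max scale to [0, 1]
    w = np.zeros(p)
    for i in rng.integers(0, n, size=n_iter):
        diff = np.abs(Xs - Xs[i])          # per-feature differences to sample i
        dist = diff.sum(axis=1)            # Manhattan distance to sample i
        dist[i] = np.inf                   # exclude the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += diff[miss] - diff[hit]
    return w / n_iter


def topsis_rank(D, weights=None):
    """Rank the rows of D by relative closeness to the ideal solution,
    treating every column as a benefit criterion (assumption)."""
    m, k = D.shape
    weights = np.full(k, 1.0 / k) if weights is None else np.asarray(weights)
    norm = np.linalg.norm(D, axis=0)
    norm[norm == 0] = 1.0
    V = weights * D / norm                 # weighted, vector-normalized matrix
    ideal, anti = V.max(axis=0), V.min(axis=0)
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)
    denom = d_pos + d_neg
    closeness = np.divide(d_neg, denom, out=np.zeros(m), where=denom > 0)
    return np.argsort(-closeness), closeness


# Synthetic data (illustrative only): features 0 and 1 carry the class signal.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 20))
X[:, 0] += 3 * y
X[:, 1] -= 3 * y

# Assumed decision matrix: one column of Relief scores per independent run.
D = np.column_stack([relief_scores(X, y, 100, rng) for _ in range(5)])
ranking, closeness = topsis_rank(D)
print("features, best first:", ranking[:5])
```

Features are the alternatives (rows) and each Relief run is a criterion (column), so TOPSIS combines several noisy score vectors into a single ordering. Other ways of building the criteria, such as varying the number of neighbors or the distance measure, would fit the same structure.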

Pages: 14
