A combination of fuzzy similarity measures and fuzzy entropy measures for supervised feature selection

Cited by: 33
Authors
Lohrmann, Christoph [1 ,2 ]
Luukka, Pasi [2 ]
Jablonska-Sabuka, Matylda [1 ]
Kauranne, Tuomo [1 ]
Affiliations
[1] Lappeenranta Univ Technol, Sch Engn Sci, Skinnarilankatu 34, Lappeenranta 53850, Finland
[2] Lappeenranta Univ Technol, Sch Business & Management, Skinnarilankatu 34, Lappeenranta 53850, Finland
Keywords
Feature ranking; Filter method; Wrapper method; Machine learning; ReliefF; CLASSIFIER;
DOI
10.1016/j.eswa.2018.06.002
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In many machine learning applications, large amounts of information and various features are available or easily obtainable. However, their quality is potentially low, and greater volumes of information are not always beneficial for machine learning, for instance when not all available features in a data set are relevant for the classification task and for understanding the studied phenomenon. Feature selection aims at determining a subset of features that represents the data well, gives accurate classification results and reduces the impact of noise on the classification performance. In this paper, we propose a filter feature ranking method for feature selection based on fuzzy similarity and entropy measures (FSAE), which adapts the idea used for the wrapper function by Luukka (2011) and adds a scaling factor. This scaling factor, applied to the feature- and class-specific entropy values, accounts for the distance between the ideal vectors of the classes. A wrapper version of the FSAE with a similarity classifier is presented as well. The feature selection method is tested on five medical data sets: dermatology, chronic kidney disease, breast cancer, diabetic retinopathy and horse colic. The wrapper version of FSAE is compared to the wrapper introduced by Luukka (2011) and gives results that are at least as accurate, often with considerably fewer features. In the comparison with ReliefF, Laplacian score, Fisher score and the filter version of Luukka (2011), the FSAE filter in general achieves competitive mean accuracies and, for one medical data set, the breast cancer Wisconsin data set, achieves together with the Laplacian score the best results over all possible feature removals. © 2018 Elsevier Ltd. All rights reserved.
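As a rough illustration of the similarity-and-entropy ranking idea mentioned in the abstract, the sketch below scores each feature by the fuzzy entropy of the similarities between the samples and one ideal vector per class. It is not the authors' FSAE implementation. The following are assumptions made only for this example: features are scaled to [0, 1], the ideal vectors are generalized means with exponent `p`, the similarity has the form (1 - |x^p - v^p|)^(1/p), the entropy is the De Luca-Termini measure, and the function name `similarity_entropy_ranking` is hypothetical. The scaling factor based on the distance between ideal vectors is deliberately left out, because the abstract does not state its form.

```python
import numpy as np


def similarity_entropy_ranking(X, y, p=1.0):
    """Rank features by summed fuzzy entropy (lowest entropy first).

    Illustrative sketch only: assumes X is scaled to [0, 1] and omits
    the FSAE scaling factor, whose form is not given in the abstract.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    eps = 1e-12  # keeps log() finite when a similarity is exactly 0 or 1

    # One ideal vector per class: generalized mean with exponent p (assumption).
    ideals = [np.mean(X[y == c] ** p, axis=0) ** (1.0 / p) for c in np.unique(y)]

    entropy = np.zeros(X.shape[1])
    for v in ideals:
        # Feature-wise similarity of every sample to this ideal vector.
        s = (1.0 - np.abs(X ** p - v ** p)) ** (1.0 / p)
        s = np.clip(s, eps, 1.0 - eps)
        # De Luca-Termini fuzzy entropy, summed over samples for each feature.
        entropy += -np.sum(s * np.log(s) + (1.0 - s) * np.log(1.0 - s), axis=0)

    # Features with high entropy are treated as less informative (assumption),
    # so they appear last and would be removed first.
    return np.argsort(entropy)


# Toy data: feature 0 separates the classes, feature 1 does not.
X_toy = np.array([[0.0, 0.2], [0.0, 0.8], [1.0, 0.3], [1.0, 0.7]])
y_toy = np.array([0, 0, 1, 1])
ranking = similarity_entropy_ranking(X_toy, y_toy)
```

In this toy data, the similarities of feature 0 to each ideal vector are all close to 0 or 1, so its entropy is near zero and it ranks ahead of the uninformative feature 1.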
Pages: 216 - 236
Number of pages: 21
Related papers
50 in total
  • [41] On Generalized Measures of Entropy for Fuzzy Sets
    Arora, Priya
    Tomar, V. P.
    PROCEEDINGS OF ICETIT 2019: EMERGING TRENDS IN INFORMATION TECHNOLOGY, 2020, 605 : 385 - 395
  • [42] Uncertainty measures and feature selection based on composite entropy for generalized multigranulation fuzzy neighborhood rough set
    Zhang, Xiaoyan
    Zhao, Weicheng
    FUZZY SETS AND SYSTEMS, 2024, 486
  • [43] On the Creation of a Fuzzy Dataset for the Evaluation of Fuzzy Semantic Similarity Measures
    Chandran, David
    Crockett, Keeley
    Mclean, David
    2014 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2014, : 752 - 759
  • [44] Intuitionistic fuzzy value similarity measures for intuitionistic fuzzy sets
    Zichun Chen
    Penghui Liu
    Computational and Applied Mathematics, 2022, 41
  • [45] Intuitionistic fuzzy value similarity measures for intuitionistic fuzzy sets
    Chen, Zichun
    Liu, Penghui
    COMPUTATIONAL & APPLIED MATHEMATICS, 2022, 41 (01):
  • [46] A Parametric Family of Fuzzy Similarity Measures for Intuitionistic Fuzzy Sets
    Qayyum, Madiha
    Kerre, Etienne E.
    Ashraf, Samina
    MATHEMATICS, 2023, 11 (14)
  • [47] Similarity Measures of Sequence of Fuzzy Numbers and Fuzzy Risk Analysis
    Zararsiz, Zarife
    ADVANCES IN MATHEMATICAL PHYSICS, 2015, 2015
  • [48] Monotonic similarity measures between fuzzy sets and their relationship with entropy and inclusion measure
    Deng, Guannan
    Jiang, Yanli
    Fu, Jingchao
    FUZZY SETS AND SYSTEMS, 2016, 287 : 97 - 118
  • [49] Generalised interval-valued intuitionistic fuzzy entropy with some similarity measures
    Tiwari, Pratiksha
    Gupta, Priti
    INTERNATIONAL JOURNAL OF COMPUTING SCIENCE AND MATHEMATICS, 2019, 10 (05) : 488 - 512
  • [50] Using Fuzzy Set Similarity in Sentence Similarity Measures
    Cross, Valerie
    Mokrenko, Valeria
    Crockett, Keeley
    Adel, Naeemeh
    2020 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2020,