A Cross-Entropy Based Feature Selection Method for Binary Valued Data Classification

Cited by: 1
Authors:
Wang, Zhipeng [1 ]
Zhu, Qiuming [1 ]
Affiliations:
[1] Univ Nebraska, Coll Informat Sci & Technol, Dept Comp Sci, Omaha, NE 68182 USA
Keywords:
Binary features; Feature selection; Cross entropy; Classification; Model verification;
DOI
10.1007/978-3-030-96308-8_130
Chinese Library Classification (CLC) number:
TP18 [Theory of artificial intelligence];
Subject classification codes:
081104; 0812; 0835; 1405
Abstract
Feature selection is the process of finding a meaningful subset of attributes from a given set of measurements, with the purpose of revealing a coherent relation or causality in an event to facilitate effective pattern classification. It can be treated as a preprocessing step before constructing machine learning models in big data analytics to improve the accuracy of prediction results. Selecting the most significant features reduces training time and model complexity, avoids overfitting, and helps users better understand the source data and the modeling results. Although features are commonly treated as continuous values, in many real-world machine learning applications features are binary valued, i.e., either 1 or 0. Inspired by existing feature selection methods, this paper presents a new framework called FMC_SELECTOR, which specifically addresses the selection of significant binary-valued features from highly imbalanced large datasets. FMC_SELECTOR combines Fisher linear discriminant analysis with the cross-entropy concept to create an integrated mapping function that evaluates each individual feature of a given dataset. A new formula called Mapping-Based Cross-Entropy Evaluation (MCE) is derived, and a Positive Case Prediction Score (PPS) is explored to verify the significance of the features selected in a classification process. The performance of FMC_SELECTOR is compared with two popular feature selection methods, Univariate Importance (UI) and Recursive Feature Elimination (RFE), and shows better performance on the datasets tested.
Pages: 1406-1416
Page count: 11
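The MCE formula itself is not reproduced in this record, but the general idea of scoring binary features with a cross-entropy criterion can be sketched as follows. This is a minimal illustrative sketch, not the authors' FMC_SELECTOR: it ranks each binary feature by the cross-entropy between the overall positive-class rate and the positive-class rate conditioned on the feature's value, so features whose conditional label distribution deviates most from the baseline score highest. The function names and the occurrence weighting are assumptions made for illustration.

```python
import numpy as np

def cross_entropy_score(x, y, eps=1e-12):
    """Illustrative score for one binary feature x against binary labels y.

    Computes the cross-entropy H(p, q_v) between the overall positive rate p
    and the positive rate q_v conditioned on x == v, weighted by how often
    each feature value v occurs. Larger scores indicate a feature whose value
    shifts the label distribution further from the baseline.
    """
    p = np.mean(y)  # overall positive-class rate
    score = 0.0
    for v in (0, 1):
        mask = (x == v)
        if mask.sum() == 0:
            continue  # feature never takes this value; skip
        q = np.mean(y[mask])  # positive rate given x == v
        w = mask.mean()       # fraction of samples with x == v
        score += w * -(p * np.log(q + eps) + (1 - p) * np.log(1 - q + eps))
    return score

def select_features(X, y, k):
    """Rank the binary-valued columns of X by the score and keep the top k."""
    scores = np.array([cross_entropy_score(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k]
```

For example, a column that exactly copies the labels receives a far higher score than a column of independent random bits, so it is selected first.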