A novel discretization algorithm based on multi-scale and information entropy

Authors
Yaling Xun
Qingxia Yin
Jifu Zhang
Haifeng Yang
Xiaohui Cui
Affiliations
[1] Taiyuan University of Science and Technology (TYUST)
Source
Applied Intelligence | 2021 / Vol. 51
Keywords
Data mining; Discretization; Information entropy; Multi-scale; MDLPC criterion;
DOI
Not available
Abstract
Discretization is a data preprocessing topic in data mining and is critical to improving the efficiency and quality of mining. Multi-scale analysis can reveal the structure and hierarchical characteristics of data objects: a reasonable hierarchical division of a research object yields representations of the data at different granularities. The multi-scale theory is introduced into the process of data discretization, and a discretization method based on multi-scale and information entropy, called MSE, is proposed. MSE first conducts scale partition on the domain attribute to obtain candidate cut point sets of different granularity. Information entropy is then applied to the candidate cut point set, and the candidate cut points with the minimum information entropy are selected and tested in turn against the MDLPC criterion to determine the final cut point set. In this way, MSE avoids a limitation of traditional discretization methods, which consider only the statistical attribute values and therefore restrict the candidate cut points to a limited set of attribute values, and it reduces the number of candidates by keeping the data division hierarchy within an optimal range. Extensive experiments show that MSE achieves high performance in terms of discretization efficiency and classification accuracy, especially when applied to support vector machines, random forests, and decision trees.
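The pipeline described in the abstract (generate candidate cut points at several scales, rank them by information entropy, accept them with the MDLPC criterion) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. The abstract does not say how the scale partition is built, so the dyadic equal-width split in `candidate_cuts` is an assumption, as are all function names and the `levels` parameter. The acceptance test follows the standard Fayyad-Irani MDLPC formulation, and "selected and detected in turn" is interpreted here as recursive splitting.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def candidate_cuts(values, levels=3):
    """ASSUMED scale partition: at level k, split the value range into
    2**k equal-width intervals and collect the interior boundaries."""
    lo, hi = min(values), max(values)
    cuts = set()
    for k in range(1, levels + 1):
        parts = 2 ** k
        for i in range(1, parts):
            cuts.add(lo + (hi - lo) * i / parts)
    return sorted(cuts)


def mdlpc_accepts(labels, left, right):
    """Fayyad-Irani MDLPC test: accept a cut only if its information
    gain exceeds the description-length threshold."""
    n = len(labels)
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    ent, ent1, ent2 = entropy(labels), entropy(left), entropy(right)
    gain = ent - (len(left) / n) * ent1 - (len(right) / n) * ent2
    delta = math.log2(3 ** k - 2) - (k * ent - k1 * ent1 - k2 * ent2)
    return gain > (math.log2(n - 1) + delta) / n


def discretize(values, labels, cuts):
    """Pick the candidate with minimum weighted class entropy, keep it
    if MDLPC accepts it, then recurse on both sides of the cut."""
    pairs = list(zip(values, labels))
    best = None
    for c in cuts:
        left = [y for x, y in pairs if x <= c]
        right = [y for x, y in pairs if x > c]
        if not left or not right:
            continue  # a cut must separate the samples into two parts
        e = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if best is None or e < best[0]:
            best = (e, c, left, right)
    if best is None or not mdlpc_accepts(labels, best[2], best[3]):
        return []
    c = best[1]
    lo_part = [(x, y) for x, y in pairs if x <= c]
    hi_part = [(x, y) for x, y in pairs if x > c]
    return (
        discretize([x for x, _ in lo_part], [y for _, y in lo_part],
                   [q for q in cuts if q < c])
        + [c]
        + discretize([x for x, _ in hi_part], [y for _, y in hi_part],
                     [q for q in cuts if q > c])
    )


values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
final_cuts = discretize(values, labels, candidate_cuts(values))
```

Note that under this assumed partition the candidates are range boundaries rather than observed attribute values, which is the property the abstract contrasts with traditional methods; the `levels` parameter stands in for the control over the division hierarchy.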
Pages: 991-1009
Number of pages: 18
Related papers
50 in total
  • [1] A novel discretization algorithm based on multi-scale and information entropy
    Xun, Yaling
    Yin, Qingxia
    Zhang, Jifu
    Yang, Haifeng
    Cui, Xiaohui
    APPLIED INTELLIGENCE, 2021, 51 (02) : 991 - 1009
  • [2] Fault Diagnosis for RPMs Based on Novel Weighted Multi-Scale Fractional Permutation Entropy Improved by Multi-Scale Algorithm and PSO
    Sun, Yongkui
    Cao, Yuan
    Li, Peng
    Su, Shuai
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (08) : 11072 - 11081
  • [3] The Systematic Bias of Entropy Calculation in the Multi-Scale Entropy Algorithm
    Lu, Jue
    Wang, Ze
    ENTROPY, 2021, 23 (06)
  • [4] Radar Emitter Signal Identification Based on Multi-scale Information Entropy
    Huang Yingkun
    Jin Weidong
    Ge Peng
    Li Bing
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2019, 41 (05) : 1084 - 1091
  • [5] Entropy based optimal scale combination selection for generalized multi-scale information tables
    Bao, Han
    Wu, Wei-Zhi
    Zheng, Jia-Wen
    Li, Tong-Jun
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2021, 12 (05) : 1427 - 1437
  • [6] Entropy based optimal scale combination selection for generalized multi-scale information tables
    Han Bao
    Wei-Zhi Wu
    Jia-Wen Zheng
    Tong-Jun Li
    International Journal of Machine Learning and Cybernetics, 2021, 12 : 1427 - 1437
  • [7] A two-stage discretization algorithm based on information entropy
    Wen, Liu-Ying
    Min, Fan
    Wang, Shi-Yuan
    APPLIED INTELLIGENCE, 2017, 47 (04) : 1169 - 1185
  • [8] A two-stage discretization algorithm based on information entropy
    Liu-Ying Wen
    Fan Min
    Shi-Yuan Wang
    Applied Intelligence, 2017, 47 : 1169 - 1185
  • [9] An attribute discretization algorithm based on Rough Set and information entropy
    Liu, He
    Liu, Da-You
    Shi, Xiao-Hu
    Gao, Ying
    PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 206 - 211
  • [10] Multi-scale discretization of shape contours
    Prasad, L
    Rao, R
    VISION GEOMETRY IX, 2000, 4117 : 202 - 209