A novel discretization algorithm based on multi-scale and information entropy

Authors
Yaling Xun
Qingxia Yin
Jifu Zhang
Haifeng Yang
Xiaohui Cui
Affiliations
[1] Taiyuan University of Science and Technology (TYUST)
Source
Applied Intelligence | 2021 / Vol. 51
Keywords
Data mining; Discretization; Information entropy; Multi-scale; MDLPC criterion
DOI
Not available
Abstract
Discretization is a data preprocessing step in data mining and is critical to the efficiency and quality of the mining results. Multi-scale analysis can reveal the structure and hierarchical characteristics of data objects: a reasonable hierarchical division of a research object yields representations of the data at different granularities. The multi-scale theory is introduced into the process of data discretization, and a discretization method based on multi-scale and information entropy, called MSE, is proposed. MSE first performs a scale partition on the attribute domain to obtain a set of candidate cut points at different granularities. Information entropy is then computed for the candidate cut points, and the candidates are selected in order of minimum information entropy and tested in turn against the MDLPC criterion to determine the final cut point set. In this way, MSE avoids a limitation of traditional discretization methods, which consider only the statistical attribute values and therefore restrict the candidate cut points to a limited set of those values, and it reduces the number of candidates by controlling the data division hierarchy to an optimal range. Extensive experiments show that MSE achieves high performance in terms of discretization efficiency and classification accuracy, especially when it is applied to support vector machines, random forests, and decision trees.
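The pipeline described in the abstract (scale partition, minimum-entropy selection, MDLPC test) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the scale partition is constructed, so the sketch assumes that scale level k splits the attribute range into 2**k equal-width bins. The function names (multiscale_candidates, mdlpc_accepts, discretize) are hypothetical, and the acceptance test uses the standard Fayyad-Irani form of the MDLPC criterion.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def multiscale_candidates(lo, hi, max_level):
    """Assumed scale partition: level k splits [lo, hi] into 2**k equal bins."""
    cuts = set()
    for level in range(1, max_level + 1):
        bins = 2 ** level
        cuts.update(lo + (hi - lo) * i / bins for i in range(1, bins))
    return sorted(cuts)


def mdlpc_accepts(parent, left, right):
    """Fayyad-Irani MDLPC test: accept the cut only if the gain pays for it."""
    n = len(parent)
    ent, ent_l, ent_r = entropy(parent), entropy(left), entropy(right)
    gain = ent - (len(left) / n) * ent_l - (len(right) / n) * ent_r
    k, k_l, k_r = len(set(parent)), len(set(left)), len(set(right))
    delta = math.log2(3 ** k - 2) - (k * ent - k_l * ent_l - k_r * ent_r)
    return gain > (math.log2(n - 1) + delta) / n


def discretize(values, labels, candidates):
    """Pick the minimum-entropy candidate, keep it if MDLPC accepts, recurse."""
    pairs = list(zip(values, labels))
    best = None
    for cut in candidates:
        left = [y for x, y in pairs if x <= cut]
        right = [y for x, y in pairs if x > cut]
        if not left or not right:
            continue  # a cut outside the data range separates nothing
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if best is None or weighted < best[0]:
            best = (weighted, cut, left, right)
    if best is None or not mdlpc_accepts(list(labels), best[2], best[3]):
        return []
    cut = best[1]
    lo_part = [(x, y) for x, y in pairs if x <= cut]
    hi_part = [(x, y) for x, y in pairs if x > cut]
    return (
        discretize([x for x, _ in lo_part], [y for _, y in lo_part],
                   [c for c in candidates if c < cut])
        + [cut]
        + discretize([x for x, _ in hi_part], [y for _, y in hi_part],
                     [c for c in candidates if c > cut])
    )
```

Because the candidates come from the scale partition rather than from the observed values, a cut point may fall between data points, and the max_level parameter bounds how many candidates are evaluated.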
Pages: 991-1009
Page count: 18