ImbTreeEntropy: An R package for building entropy-based classification trees on imbalanced datasets

Cited by: 2
Authors
Gajowniczek, Krzysztof [1]
Zabkowski, Tomasz [1]
Affiliations
[1] Warsaw Univ Life Sci SGGW, Inst Informat Technol, Dept Artificial Intelligence, PL-02776 Warsaw, Poland
Keywords
Decision trees; Generalized entropy; Cost-sensitive learning; Imbalanced data
DOI
10.1016/j.softx.2021.100841
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In this paper, we propose a novel R package, named ImbTreeEntropy, for building binary and multiclass decision trees using generalized entropy functions, such as Rényi, Tsallis, Sharma-Mittal, Sharma-Taneja and Kapur, to measure the impurity of a node. These are important extensions of the existing algorithms, which usually employ Shannon entropy and the concept of information gain. Additionally, ImbTreeEntropy is able to handle imbalanced data, which is a challenging issue in many practical applications. The package supports cost-sensitive learning by defining a misclassification cost matrix, as well as weight-sensitive learning. It accepts all types of attributes, including continuous, ordered and nominal attributes. The package and its code are made freely available. © 2021 The Authors. Published by Elsevier B.V.
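As a rough illustration of what "generalized entropy as node impurity" means, the sketch below computes Shannon, Rényi and Tsallis entropies from the class labels in a node, using their standard textbook definitions with natural logarithms. It is written in Python rather than R and is not the ImbTreeEntropy API; the function names (`class_probs`, `shannon`, `renyi`, `tsallis`) and the parameter name `q` are assumptions made for this example only.

```python
import math
from collections import Counter


def class_probs(labels):
    """Empirical class proportions of the labels falling in one node."""
    counts = Counter(labels)
    n = len(labels)
    return [c / n for c in counts.values()]


def shannon(p):
    """Shannon entropy in nats: -sum(p_i * ln p_i)."""
    return -sum(x * math.log(x) for x in p if x > 0)


def renyi(p, q):
    """Renyi entropy of order q: ln(sum(p_i^q)) / (1 - q).

    Falls back to Shannon entropy at q = 1, which is its limit.
    """
    if abs(q - 1.0) < 1e-12:
        return shannon(p)
    return math.log(sum(x ** q for x in p if x > 0)) / (1.0 - q)


def tsallis(p, q):
    """Tsallis entropy of order q: (1 - sum(p_i^q)) / (q - 1).

    Falls back to Shannon entropy at q = 1, which is its limit.
    """
    if abs(q - 1.0) < 1e-12:
        return shannon(p)
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)


# Hypothetical imbalanced node: 9 majority-class and 1 minority-class samples.
node = ["neg"] * 9 + ["pos"]
p = class_probs(node)
impurities = {
    "shannon": shannon(p),
    "renyi_q2": renyi(p, 2.0),
    "tsallis_q2": tsallis(p, 2.0),
}
```

All three measures are zero for a pure node and grow as the class mix becomes more even; the order `q` changes how strongly rare classes contribute, which is one reason such measures are of interest for imbalanced data.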
Pages: 8