Performance Improvement Validation of Decision Tree Algorithms with Non-normalized Information Distance in Experiments

Authors
Araki, Takeru [1]
Luo, Yuan [1]
Guo, Minyi [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
Keywords
Decision tree; ID3 algorithm; Information distance; Information gain; Gain ratio
DOI
10.1007/978-3-031-20862-1_33
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The performance of the ID3 decision tree algorithm depends on the information gain, which has a drawback: it tends to select attributes with many values as branching attributes. The gain ratio (notably in C4.5) was proposed to improve on the information gain, but it does not always improve performance, nor is it always defined. Some researchers use a normalized information distance to improve on the gain ratio; however, it is ineffective. In this paper, we investigate two non-normalized information distance selection criteria as replacements for the information gain and the gain ratio, and conduct detailed experiments, supported by theoretical analysis, on 13 datasets classified into four types. Surprisingly, on the datasets where the number of values per attribute differs greatly, i.e. Type1 and Type2, algorithms based on non-normalized information distance can increase accuracy by about 15-25% over the ID3 algorithm. The first reason is that having more values for an attribute does not reduce the distances, as suggested by Mantaras. The second reason is that the conditional entropy taken in the opposite direction to the one used in the information gain can counterbalance the bias toward multi-valued attributes. Furthermore, our methods maintain results comparable to those of existing algorithms in the other cases. Compared to the gain ratio, the algorithms with non-normalized information distances overcome this drawback much better on Type1 datasets, which is strongly confirmed by experiments and the corresponding analysis. It can be presumed that "normalization" improvement methods such as the normalized information distance and the gain ratio are not always effective.
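To make the selection criteria named in the abstract concrete, here is a minimal Python sketch of the information gain, the gain ratio, and one possible non-normalized information distance. The abstract does not define the two criteria the paper studies, so the form used here, d(A, C) = H(A|C) + H(C|A), is an assumption (it is the unnormalized sum of the two conditional entropies); all function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy H(X), in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def conditional_entropy(target, given):
    """H(target | given), computed as H(target, given) - H(given)."""
    return entropy(list(zip(target, given))) - entropy(given)


def information_gain(attr, labels):
    """ID3 criterion: H(C) - H(C|A). Larger is better."""
    return entropy(labels) - conditional_entropy(labels, attr)


def gain_ratio(attr, labels):
    """C4.5 criterion: gain / H(A). Returns None when H(A) = 0 (undefined)."""
    split_info = entropy(attr)
    if split_info == 0:
        return None
    return information_gain(attr, labels) / split_info


def information_distance(attr, labels):
    """Assumed non-normalized distance: H(A|C) + H(C|A). Smaller is better."""
    return conditional_entropy(attr, labels) + conditional_entropy(labels, attr)


# Toy data: an ID-like attribute with one value per row, and a two-valued
# attribute that separates the classes just as well.
labels = ["y", "y", "n", "n"]
id_like = [1, 2, 3, 4]
useful = ["a", "a", "b", "b"]
constant = ["c", "c", "c", "c"]

for name, attr in [("id_like", id_like), ("useful", useful), ("constant", constant)]:
    print(name, information_gain(attr, labels), gain_ratio(attr, labels),
          information_distance(attr, labels))
```

On this toy data the information gain cannot tell the ID-like attribute from the two-valued one, while the H(A|C) term of the distance penalizes the extra values. The constant attribute shows the case where the gain ratio is undefined.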
Pages: 450-464
Number of pages: 15