Splitting Choice and Computational Complexity Analysis of Decision Trees

Cited by: 1
Authors
Zhao, Xin [1]
Nie, Xiaokai [2]
Affiliations
[1] Southeast Univ, Sch Math, Nanjing 211189, Peoples R China
[2] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
decision tree; splitting bias; splitting criteria; computational complexity; noise variable; CLASSIFICATION; SELECTION;
DOI
10.3390/e23101241
Chinese Library Classification number
O4 [Physics];
Discipline classification code
0702 ;
Abstract
This research explores several theoretical properties of decision trees that support applications built on them. First, many splitting criteria are available during tree growing, and the choice among them is subject to splitting bias caused by missing values and by variables with many possible values. Results show that the Gini index is superior to entropy because it is less affected by these sources of bias. Second, noise variables with more missing values have a better chance of being chosen as splitting variables, whereas informative variables do not. Third, including many noise variables in the tree-building process affects the computational complexity. Results show that the computational complexity increases linearly with the number of noise variables. Methods that extract more information from the original data at the cost of a higher variable dimension can therefore also be considered in real applications.
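For readers unfamiliar with the two splitting criteria compared in the abstract, the following is a minimal sketch using the standard textbook definitions of Gini impurity and Shannon entropy. It is not the authors' code; the function names (gini, entropy, impurity_decrease) and the toy labels are illustrative assumptions, and the sketch does not model missing values or the bias experiments described above.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (base 2) of the class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def impurity_decrease(parent, children, impurity):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


# Hypothetical toy node: two classes, separated perfectly by the split.
parent = ["a", "a", "b", "b"]
perfect_split = [["a", "a"], ["b", "b"]]

gini_gain = impurity_decrease(parent, perfect_split, gini)
entropy_gain = impurity_decrease(parent, perfect_split, entropy)
```

A tree-growing procedure would typically evaluate such a decrease for every candidate split and keep the largest one; which impurity function is passed in is exactly the "splitting criterion" choice the abstract refers to.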
Pages: 12