Split selection methods for classification trees

Cited by: 22
Authors
Loh, WY
Shih, YS
Affiliations
[1] UNIV WISCONSIN,DEPT STAT,MADISON,WI 53706
[2] NATL CHUNG CHENG UNIV,DEPT MATH,CHIAYI 621,TAIWAN
Keywords
decision trees; discriminant analysis; machine learning
DOI
Not available
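Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes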
020208 ; 070103 ; 0714 ;
Abstract
Classification trees based on exhaustive search algorithms tend to be biased towards selecting variables that afford more splits. As a result, such trees should be interpreted with caution. This article presents an algorithm called QUEST that has negligible bias. Its split selection strategy shares similarities with the FACT method, but it yields binary splits, and the final tree can be selected by a direct stopping rule or by pruning. Real and simulated data are used to compare QUEST with the exhaustive search approach. QUEST is shown to be substantially faster, and the size and classification accuracy of its trees are typically comparable to those of exhaustive search.
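The selection bias described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the QUEST algorithm and not the authors' experimental setup: it assumes a two-class response that is independent of both predictors, a Gini-based exhaustive search over binary splits, and hypothetical names (`best_gain`, `simulate`, `x_few`, `x_many`) chosen only for illustration. Because neither predictor carries any signal, an unbiased method would pick each about equally often.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_gain(x, y):
    """Largest Gini impurity decrease over all binary splits x <= c."""
    n = len(y)
    parent = gini(y)
    pairs = sorted(zip(x, y))
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point exists between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best

def simulate(trials=200, n=50, seed=0):
    """Count how often each pure-noise predictor wins the exhaustive search."""
    rng = random.Random(seed)
    wins = {"few": 0, "many": 0}
    for _ in range(trials):
        y = [rng.randint(0, 1) for _ in range(n)]        # class labels, pure noise
        x_few = [rng.randint(0, 1) for _ in range(n)]    # at most one candidate split
        x_many = [rng.random() for _ in range(n)]        # up to n - 1 candidate splits
        g_few = best_gain(x_few, y)
        g_many = best_gain(x_many, y)
        wins["many" if g_many > g_few else "few"] += 1
    return wins

wins = simulate()
print(wins)
```

Under these assumptions the predictor with many distinct values should win far more often than the binary one, simply because it offers more candidate splits over which the impurity decrease is maximized.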
Pages: 815-840
Page count: 26
Related papers
50 in total
  • [1] A note on split selection bias in classification trees
    Shih, YS
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2004, 45 (03) : 457 - 466
  • [2] Unbiased split selection for classification trees based on the Gini Index
    Strobl, Carolin
    Boulesteix, Anne-Laure
    Augustin, Thomas
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2007, 52 (01) : 483 - 501
  • [3] Cross Split Decision Trees For Pattern Classification
    Mirzamomen, Zahra
    Fekri, Mohammad Navid
    Kangavari, Mohammadreza
    [J]. 2015 5TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2015, : 240 - 245
  • [4] Methods to combine classification trees
    Miglio, R
    Soffritti, G
    [J]. BETWEEN DATA SCIENCE AND APPLIED DATA ANALYSIS, 2003, : 65 - 73
  • [5] Split criterions for variable selection using decision trees
    Abellan, Joaquin
    Masegosa, Andres R.
    [J]. SYMBOLIC AND QUANTITATIVE APPROACHES TO REASONING WITH UNCERTAINTY, PROCEEDINGS, 2007, 4724 : 489 - +
  • [6] A Bayesian Random Split to Build Ensembles of Classification Trees
    Cano, Andres
    Masegosa, Andres R.
    Moral, Serafin
    [J]. SYMBOLIC AND QUANTITATIVE APPROACHES TO REASONING WITH UNCERTAINTY, PROCEEDINGS, 2009, 5590 : 469 - 480
  • [7] Partially Bayesian variable selection in classification trees
    Noe, Douglas A.
    He, Xuming
    [J]. STATISTICS AND ITS INTERFACE, 2008, 1 (01) : 155 - 167
  • [8] THE GENERAL CONCEPT OF THE METHODS OF ALGORITHMIC CLASSIFICATION TREES
    Povkhan, I. F.
    [J]. RADIO ELECTRONICS COMPUTER SCIENCE CONTROL, 2020, (03) : 108 - 120
  • [9] Simplifying classification trees through consensus methods
    Miglio, R
    Soffritti, G
    [J]. NEW DEVELOPMENTS IN CLASSIFICATION AND DATA ANALYSIS, 2005, : 31 - 37
  • [10] Association between split selection instability and predictive error in survival trees
    Radespiel-Troeger, M.
    Gefeller, O.
    Rabenstein, T.
    Hothorn, T.
    [J]. METHODS OF INFORMATION IN MEDICINE, 2006, 45 (05) : 548 - 556