Sampling methods in decision trees

Cited by: 0
Authors
Mehrotra, KG [1 ]
Jeragh, M [1 ]
Affiliation
[1] Syracuse Univ, Dept EECS, Syracuse, NY 13210 USA
Keywords
CART; Gini-index; decision tree; random sampling
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
CART is widely used by researchers in data mining applications. However, for a very large data set, building a CART tree is nearly impossible. We argue that a tree built entirely from a large simple random sample of the data set is reasonable. In addition, we propose two new algorithms: the parabola method, which relies on the property that the Gini-index is a smooth function of any continuous attribute and thus allows accurate approximation of the minimum Gini-index, and the double sampling method, which is useful when a data set is very large. Experimental results show that these two methods perform extremely well.
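The abstract does not spell out how the parabola method works, so the following is only a minimal sketch of one plausible reading: evaluate the Gini-index of a binary split at three candidate thresholds of a continuous attribute, fit a parabola through those three points, and take its vertex as the approximate minimizer. The function names `gini_of_split` and `parabola_vertex` are hypothetical, and the three-point quadratic fit is an assumption, not necessarily the authors' procedure.

```python
def gini_of_split(values, labels, threshold):
    """Weighted Gini-index of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]

    def gini(group):
        # Gini impurity of one side: 1 - sum of squared class proportions.
        if not group:
            return 0.0
        n = len(group)
        return 1.0 - sum((group.count(c) / n) ** 2 for c in set(group))

    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def parabola_vertex(t0, t1, t2, g0, g1, g2):
    """Abscissa of the vertex of the parabola through three (t, g) points.

    Assumption: the three Gini values bracket a minimum, so the vertex
    is a minimum rather than a maximum. Falls back to t1 if the points
    are collinear.
    """
    num = (t1 - t0) ** 2 * (g1 - g2) - (t1 - t2) ** 2 * (g1 - g0)
    den = (t1 - t0) * (g1 - g2) - (t1 - t2) * (g1 - g0)
    if den == 0:
        return t1
    return t1 - 0.5 * num / den
```

In use, `values` and `labels` could themselves be a simple random sample of the full data set (for example drawn with `random.sample` over row indices), in line with the abstract's argument that a large random sample is a reasonable basis for building the tree.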
Pages: 1069-1075
Page count: 7