A framework for bottom-up induction of oblique decision trees

Cited by: 9
Authors
Barros, Rodrigo C. [1 ]
Jaskowiak, Pablo A. [2 ]
Cerri, Ricardo [2 ]
de Carvalho, Andre C. P. L. F. [2 ]
Affiliations
[1] Pontificia Univ Catolica Rio Grande do Sul, Fac Informat, BR-90679900 Porto Alegre, RS, Brazil
[2] Univ Sao Paulo, Inst Ciencias Matemat & Comp ICMC, BR-13560970 Sao Carlos, SP, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil;
Keywords
Oblique decision trees; Bottom-up induction; Clustering; CLASSIFICATION; DESIGN;
DOI
10.1016/j.neucom.2013.01.067
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Decision-tree induction algorithms are widely used in knowledge discovery and data mining, especially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes, whereas univariate trees can only perform axis-parallel splits. The vast majority of oblique and univariate decision-tree induction algorithms employ a top-down strategy for growing the tree, relying on an impurity-based measure for splitting nodes. In this paper, we propose BUTIF, a novel Bottom-Up Oblique Decision-Tree Induction Framework. BUTIF does not rely on an impurity measure for dividing nodes, since the data resulting from each split are known a priori. For generating the initial leaves of the tree and the splitting hyperplanes in its internal nodes, BUTIF allows the adoption of distinct clustering algorithms and binary classifiers, respectively. It is also capable of performing embedded feature selection, which may reduce the number of features in each hyperplane, thus improving model comprehension. Unlike virtually every top-down decision-tree induction algorithm, BUTIF does not require a subsequent pruning procedure to avoid overfitting, because its bottom-up nature does not overgrow the tree. We compare distinct instances of BUTIF to traditional univariate and oblique decision-tree induction algorithms. Empirical results show the effectiveness of the proposed framework. (C) 2014 Elsevier B.V. All rights reserved.
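The bottom-up idea in the abstract (initial leaves from clusters, then hyperplanes for the internal nodes) can be illustrated with a toy sketch. This is not the authors' BUTIF implementation. It assumes one cluster per class in place of a clustering algorithm, and the perpendicular bisector between two cluster centroids in place of a trained binary classifier. The names `centroid`, `build_tree` and `predict` are hypothetical.

```python
import math


def centroid(points):
    """Mean vector of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def build_tree(X, y):
    """Toy bottom-up induction: start from leaves, merge upward to a root."""
    # Assumption: one initial leaf per class (stand-in for a clustering step).
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    nodes = [
        {"label": c, "points": pts, "center": centroid(pts)}
        for c, pts in sorted(groups.items())
    ]
    while len(nodes) > 1:
        # Pick the two nodes whose centroids are closest.
        i, j = min(
            ((a, b) for a in range(len(nodes)) for b in range(a + 1, len(nodes))),
            key=lambda ab: math.dist(nodes[ab[0]]["center"], nodes[ab[1]]["center"]),
        )
        left, right = nodes[i], nodes[j]
        # Assumption: the oblique split w.x + b = 0 is the perpendicular
        # bisector of the two centroids (stand-in for a binary classifier).
        w = [r - l for l, r in zip(left["center"], right["center"])]
        mid = [(l + r) / 2 for l, r in zip(left["center"], right["center"])]
        b = -sum(wi * mi for wi, mi in zip(w, mid))
        pts = left["points"] + right["points"]
        parent = {"w": w, "b": b, "left": left, "right": right,
                  "points": pts, "center": centroid(pts)}
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [parent]
    return nodes[0]


def predict(node, x):
    """Route x down the tree; a positive side of the hyperplane goes right."""
    while "label" not in node:
        side = sum(wi * xi for wi, xi in zip(node["w"], x)) + node["b"]
        node = node["right"] if side > 0 else node["left"]
    return node["label"]
```

Because the tree is assembled by merging existing nodes, the number of leaves is fixed by the initial clusters, and the data reaching each child of a split are known before the hyperplane is computed. No impurity measure or pruning step appears in this sketch.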
Pages: 3-12
Number of pages: 10
Related papers
  • [1] Bottom-up fuzzy partitioning in fuzzy decision trees
    Fajfer, Maciej
    Janikow, Cezary Z.
    [J]. Annual Conference of the North American Fuzzy Information Processing Society - NAFIPS, 2000, : 326 - 330
  • [2] Bottom-up fuzzy partitioning in fuzzy decision trees
    Fajfer, M
    Janikow, CZ
    [J]. PEACHFUZZ 2000 : 19TH INTERNATIONAL CONFERENCE OF THE NORTH AMERICAN FUZZY INFORMATION PROCESSING SOCIETY - NAFIPS, 2000, : 326 - 330
  • [3] BOTTOM-UP RECURSION IN TREES
    CASAS, R
    STEYAERT, JM
    [J]. LECTURE NOTES IN COMPUTER SCIENCE, 1986, 214 : 172 - 182
  • [4] Protein Classification Using Decision Trees With Bottom-up Classification Approach
    Pepik, Bojan
    Kalajdziski, Slobodan
    Davcev, Danco
    [J]. 13TH INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING, VOLS 1-3, 2009, 23 (1-3): : 174 - 178
  • [5] Generalised bottom-up pruning: A model level combination of decision trees
    Eastwood, Mark
    Gabrys, Bogdan
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (10) : 9150 - 9158
  • [6] A System for Induction of Oblique Decision Trees
    Murthy, Sreerama K.
    Kasif, Simon
    Salzberg, Steven
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 1994, 2 : 1 - 32
  • [7] Bottom-Up Induction of Feature Terms
    Eva Armengol
    Enric Plaza
    [J]. Machine Learning, 2000, 41 : 259 - 294
  • [8] Bottom-up induction of feature terms
    Armengol, E
    Plaza, E
    [J]. MACHINE LEARNING, 2000, 41 (03) : 259 - 294
  • [9] An efficient bottom-up distance between trees
    Valiente, G
    [J]. EIGHTH SYMPOSIUM ON STRING PROCESSING AND INFORMATION RETRIEVAL, PROCEEDINGS, 2001, : 212 - 219
  • [10] BOAI: Fast alternating decision tree induction based on bottom-up evaluation
    Yang, Bishan
    Wang, Tengjiao
    Yang, Dongqing
    Chang, Lei
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2008, 5012 : 405 - +