Handling numeric attributes in Hoeffding trees

Cited by: 0
Authors
Pfahringer, Bernhard [1]
Holmes, Geoffrey [1]
Kirkby, Richard [1]
Affiliations
[1] Univ Waikato, Hamilton, New Zealand
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
For conventional machine learning classification algorithms, handling numeric attributes is relatively straightforward. Unsupervised and supervised solutions exist that either segment the data into pre-defined bins or sort the data and search for the best split points. Unfortunately, none of these solutions carries over particularly well to a data stream environment. Solutions for data streams have been proposed by several authors, but as yet none have been compared empirically. In this paper we investigate a range of methods for multi-class tree-based classification where the handling of numeric attributes takes place as the tree is constructed. To this end, we extend an existing approach based on simple Gaussian approximation. We then compare this method with four approaches from the literature, arriving at eight final algorithm configurations for testing. The solutions cover a range of options from perfectly accurate and memory intensive to highly approximate. All methods are tested using the Hoeffding tree classification algorithm. Surprisingly, the experimental comparison shows that the most approximate methods produce the most accurate trees by allowing for faster tree growth.
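As an illustration of the Gaussian approximation idea mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes one incrementally updated Gaussian per class for a single numeric attribute, and it estimates how many examples of each class fall on either side of a candidate threshold from the normal CDF. The names `GaussianEstimator`, `count_below`, `entropy`, and `info_gain` are hypothetical and chosen only for this example.

```python
import math


class GaussianEstimator:
    """Tracks count, mean and variance of a stream (Welford's method).

    Hypothetical helper for illustration; it stores three numbers per
    class instead of the raw attribute values.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def count_below(self, threshold):
        """Estimated number of seen values <= threshold under the Gaussian fit."""
        if self.n == 0:
            return 0.0
        s = self.std()
        if s == 0.0:
            return float(self.n) if self.mean <= threshold else 0.0
        z = (threshold - self.mean) / (s * math.sqrt(2.0))
        return self.n * 0.5 * (1.0 + math.erf(z))


def entropy(counts):
    """Shannon entropy (bits) of a list of non-negative class counts."""
    total = sum(counts)
    if total <= 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def info_gain(estimators, threshold):
    """Approximate information gain of splitting at `threshold`.

    `estimators` maps each class label to its GaussianEstimator.
    """
    totals = [e.n for e in estimators.values()]
    left = [e.count_below(threshold) for e in estimators.values()]
    right = [n - l for n, l in zip(totals, left)]
    n_all = sum(totals)
    if n_all == 0:
        return 0.0
    after = (sum(left) / n_all) * entropy(left) + (sum(right) / n_all) * entropy(right)
    return entropy(totals) - after


# Usage: two well-separated classes observed one value at a time.
estimators = {"a": GaussianEstimator(), "b": GaussianEstimator()}
for value in [0.9, 1.0, 1.1, 1.2, 0.8]:
    estimators["a"].add(value)
for value in [9.8, 10.0, 10.2, 10.1, 9.9]:
    estimators["b"].add(value)
print(info_gain(estimators, 5.0), info_gain(estimators, 1.0))
```

A threshold between the two class means should score a higher gain than one that cuts through a single class. The per-class summary uses constant memory, which is the trade-off against exact methods that store or sort the observed values.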
Pages: 296-307
Page count: 12
Related papers
50 in total
  • [21] SOAP: Efficient feature selection of numeric attributes
    Ruiz, R
    Aguilar-Ruiz, JS
    Riquelme, JC
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2002, PROCEEDINGS, 2002, 2527 : 233 - 242
  • [22] Knowledge Graph Embedding with Numeric Attributes of Entities
    Wu, Yanrong
    Wang, Zhichun
    REPRESENTATION LEARNING FOR NLP, 2018, : 132 - 136
  • [23] Mining optimized association rules for numeric attributes
    Fukuda, T
    Morimoto, Y
    Morishita, S
    Tokuyama, T
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1999, 58 (01) : 1 - 12
  • [24] Mining optimized gain rules for numeric attributes
    Brin, S
    Rastogi, R
    Shim, K
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2003, 15 (02) : 324 - 338
  • [25] Mining optimized support rules for numeric attributes
    Rastogi, R
    Shim, K
    INFORMATION SYSTEMS, 2001, 26 (06) : 425 - 444
  • [26] HANDLING UNCERTAIN INFORMATION: A REVIEW OF NUMERIC AND NON-NUMERIC METHODS.
    Bhatnagar, Raj K.
    Kanal, Laveen N.
    1986, 4 : 3 - 26
  • [27] Mining optimized support rules for numeric attributes
    Rastogi, R
    Shim, K
    15TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 1999, : 206 - 215
  • [28] Handling numeric criteria in relaxed planning graphs
    Sapena, O
    Onaindía, E
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2004, 2004, 3315 : 114 - 123
  • [29] Probabilistic Hoeffding Trees: Sped-Up Convergence and Adaption of Online Trees on Changing Data Streams
    Boidol, Jonathan
    Hapfelmeier, Andreas
    Tresp, Volker
    ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS, ICDM 2015, 2015, 9165 : 94 - 108
  • [30] Hoeffding adaptive trees for multi-label classification on data streams
    Esteban, Aurora
    Cano, Alberto
    Zafra, Amelia
    Ventura, Sebastian
    KNOWLEDGE-BASED SYSTEMS, 2024, 304