Ambiguous decision trees for mining concept-drifting data streams

Cited by: 30
Authors
Liu, Jing [2]
Li, Xue [1]
Zhong, Weicai [2]
Affiliations
[1] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld 4072, Australia
[2] Xidian Univ, Inst Intelligent Informat Proc, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China; Australian Research Council
Keywords
Data streams; Data mining; Concept drift; Ambiguous decision trees; Incremental learning;
DOI
10.1016/j.patrec.2009.07.017
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In real-world situations, explanations for the same observations may differ depending on perceptions or contexts. They may change with time, especially when concept drift occurs. This phenomenon incurs ambiguities. It is useful if an algorithm can learn to reflect ambiguities and select the best decision according to context or situation. Based on this viewpoint, we study the problem of deriving ambiguous decision trees from data streams to cope with concept drift. CVFDT (Concept-adapting Very Fast Decision Tree) is one of the most well-known streaming data mining methods that can learn decision trees incrementally. In this paper, we establish a method called ambiguous CVFDT (aCVFDT), which integrates ambiguities into CVFDT by exploring multiple options at each node whenever a node is to be split. When aCVFDT is used to make class predictions, it is guaranteed that the best and newest knowledge is used. When old concepts recur, aCVFDT can immediately relearn them by using the corresponding options recorded at each node. Furthermore, CVFDT does not automatically detect occurrences of concept drift and only scans trees periodically, whereas an automatic concept drift detecting mechanism is used in aCVFDT. In our experiments, the hyperplane problem and two benchmark problems from the UCI KDD Archive, namely Network Intrusion and Forest CoverType, are used to validate the performance of aCVFDT. The experimental results show that aCVFDT obtains significantly improved results over traditional CVFDT. © 2009 Elsevier B.V. All rights reserved.
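The idea of recording several options at a node and reusing one when an old concept recurs can be illustrated with a toy sketch. This is not the aCVFDT algorithm from the paper: the `OptionNode` class, the sliding-window accuracy score, and the two hand-written threshold rules are all assumptions made for illustration.

```python
from collections import deque


class OptionNode:
    """Toy node that keeps several alternative predictors ("options").

    Hypothetical illustration only; not the aCVFDT data structure.
    """

    def __init__(self, options, window=20):
        self.options = options  # name -> callable(x) -> label
        # Sliding window of recent hit/miss flags for every option.
        self.hits = {name: deque(maxlen=window) for name in options}

    def best_option(self):
        # Pick the option with the highest recent accuracy; ties and
        # the empty-history case fall back to insertion order.
        def recent_accuracy(name):
            h = self.hits[name]
            return sum(h) / len(h) if h else 0.0
        return max(self.options, key=recent_accuracy)

    def predict(self, x):
        return self.options[self.best_option()](x)

    def update(self, x, y):
        # Score every stored option on the new labelled example.
        for name, rule in self.options.items():
            self.hits[name].append(rule(x) == y)


def run_demo():
    node = OptionNode({
        "concept_a": lambda x: int(x > 0.5),
        "concept_b": lambda x: int(x <= 0.5),
    }, window=10)
    xs = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 0.05, 0.95]
    chosen = []
    # Three phases: concept A, drift to concept B, then A recurs.
    for concept in ("concept_a", "concept_b", "concept_a"):
        for x in xs:
            node.update(x, node.options[concept](x))
        chosen.append(node.best_option())
    return chosen
```

Calling `run_demo()` records which option the node prefers after each phase. Because both options stay stored, the node can switch back to the earlier rule once the old concept returns instead of learning it again from scratch.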
Pages: 1347-1355
Page count: 9
Related papers
50 in total
  • [41] Tensor decision trees for continual learning from drifting data streams
    Bartosz Krawczyk
    Machine Learning, 2021, 110 : 3015 - 3035
  • [42] Tensor decision trees for continual learning from drifting data streams
    Krawczyk, Bartosz
    MACHINE LEARNING, 2021, 110 (11-12) : 3015 - 3035
  • [43] Catching the Trend: A Framework for Clustering Concept-Drifting Categorical Data
    Chen, Hung-Leng
    Chen, Ming-Syan
    Lin, Su-Chen
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (05) : 652 - 665
  • [44] Cost-Sensitive Perceptron Decision Trees for Imbalanced Drifting Data Streams
    Krawczyk, Bartosz
    Skryjomski, Przemyslaw
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2017, PT II, 2017, 10535 : 512 - 527
  • [45] Decision Trees for Mining Data Streams Based on the Gaussian Approximation
    Rutkowski, Leszek
    Jaworski, Maciej
    Pietruczuk, Lena
    Duda, Piotr
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (01) : 108 - 119
  • [46] Mining decision trees from data streams in a mobile environment
    Kargupta, H
    Park, BH
    2001 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2001, : 281 - 288
  • [47] Decision tree-based Feature Ranking in Concept Drifting Data Streams
    Pereira Karax, Jean Antonio
    Malucelli, Andreia
    Barddal, Jean Paul
    SAC '19: PROCEEDINGS OF THE 34TH ACM/SIGAPP SYMPOSIUM ON APPLIED COMPUTING, 2019, : 590 - 592
  • [48] Learning Decision Trees from Data Streams with Concept Drift
    Jankowski, Dariusz
    Jackowski, Konrad
    Cyganek, Boguslaw
    INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE 2016 (ICCS 2016), 2016, 80 : 1682 - 1691
  • [49] Mining decision rules on data streams in the presence of concept drifts
    Tsai, Cheng-Jung
    Lee, Chien-I.
    Yang, Wei-Pang
    EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (02) : 1164 - 1178
  • [50] Concept Drifting Detection on Noisy Streaming Data in Random Ensemble Decision Trees
    Li, Peipei
    Hu, Xuegang
    Liang, Qianghui
    Gao, Yunjun
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, 2009, 5632 : 236 - +