Ambiguous decision trees for mining concept-drifting data streams

Cited by: 30
Authors
Liu, Jing [2 ]
Li, Xue [1 ]
Zhong, Weicai [2 ]
Affiliations
[1] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld 4072, Australia
[2] Xidian Univ, Inst Intelligent Informat Proc, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China; Australian Research Council;
Keywords
Data streams; Data mining; Concept drift; Ambiguous decision trees; Incremental learning;
DOI
10.1016/j.patrec.2009.07.017
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In real-world situations, explanations for the same observations may differ depending on perceptions or contexts. They may change with time, especially when concept drift occurs. This phenomenon incurs ambiguities. It is useful if an algorithm can learn to reflect ambiguities and select the best decision according to context or situation. Based on this viewpoint, we study the problem of deriving ambiguous decision trees from data streams to cope with concept drift. CVFDT (Concept-adapting Very Fast Decision Tree) is one of the most well-known streaming data mining methods that can learn decision trees incrementally. In this paper, we establish a method called ambiguous CVFDT (aCVFDT), which integrates ambiguities into CVFDT by exploring multiple options at each node whenever a node is to be split. When aCVFDT is used to make class predictions, it is guaranteed that the best and newest knowledge is used. When old concepts recur, aCVFDT can immediately relearn them by using the corresponding options recorded at each node. Furthermore, CVFDT does not automatically detect occurrences of concept drift and only scans trees periodically, whereas an automatic concept drift detecting mechanism is used in aCVFDT. In our experiments, the hyperplane problem and two benchmark problems from the UCI KDD Archive, namely Network Intrusion and Forest CoverType, are used to validate the performance of aCVFDT. The experimental results show that aCVFDT obtains significantly improved results over traditional CVFDT. (C) 2009 Elsevier B.V. All rights reserved.
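The idea of recording several options at a node and using whichever currently fits best can be illustrated with a minimal sketch. This is not the aCVFDT algorithm from the paper: the class `OptionNode`, the sliding-window accuracy rule, the window size, and the two toy concepts "A" and "B" are all assumptions made purely for illustration.

```python
import random
from collections import deque


class OptionNode:
    """Hypothetical node that keeps several candidate options (illustration only)."""

    def __init__(self, options, window=100):
        # options: dict mapping an option name to a callable x -> predicted label
        self.options = options
        # For each option, remember whether it was right on the most recent examples.
        self.recent = {name: deque(maxlen=window) for name in options}

    def update(self, x, y):
        """Score every recorded option on the newly arrived labeled example."""
        for name, predict in self.options.items():
            self.recent[name].append(predict(x) == y)

    def best_option(self):
        """Return the option with the highest accuracy over the recent window."""
        def accuracy(name):
            hits = self.recent[name]
            return sum(hits) / len(hits) if hits else 0.0
        return max(self.options, key=accuracy)

    def predict(self, x):
        """Predict with whichever option currently fits the stream best."""
        return self.options[self.best_option()](x)


def run_demo(seed=0, phase_length=200):
    """Stream concept A, then B, then A again; report the chosen option per phase."""
    rng = random.Random(seed)
    concepts = {
        "A": lambda x: x[0] > 0.5,  # assumed toy concept: threshold on feature 0
        "B": lambda x: x[1] > 0.5,  # assumed toy concept: threshold on feature 1
    }
    node = OptionNode(dict(concepts), window=100)
    chosen = []
    for active in ["A", "B", "A"]:
        for _ in range(phase_length):
            x = (rng.random(), rng.random())
            node.update(x, concepts[active](x))
        chosen.append(node.best_option())
    return chosen


if __name__ == "__main__":
    print(run_demo())
```

Because both options stay recorded at the node, the sketch switches back to option "A" as soon as the recent window again favors it, without relearning anything from scratch. A real tree learner would also need split selection and a drift detection rule, which are omitted here.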
Pages: 1347-1355
Page count: 9
Related papers
50 in total
  • [1] Mining Concept-Drifting Data Streams with Multiple Semi-Random Decision Trees
    Li, Peipei
    Hu, Xuegang
    Wu, Xindong
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2008, 5139 : 733 - 740
  • [2] Random Ensemble Decision Trees for Learning Concept-Drifting Data Streams
    Li, Peipei
    Wu, Xindong
    Liang, Qianhui
    Hu, Xuegang
    Zhang, Yuhong
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT I: 15TH PACIFIC-ASIA CONFERENCE, PAKDD 2011, 2011, 6634 : 313 - 325
  • [3] Learning concept-drifting data streams with random ensemble decision trees
    Li, Peipei
    Wu, Xindong
    Hu, Xuegang
    Wang, Hao
    NEUROCOMPUTING, 2015, 166 : 68 - 83
  • [4] An efficient and sensitive decision tree approach to mining concept-drifting data streams
    Tsai, Cheng-Jung
    Lee, Chien-I
    Yang, Wei-Pang
    INFORMATICA, 2008, 19 (01) : 135 - 156
  • [5] On reducing classifier granularity in mining concept-drifting data streams
    Wang, P
    Wang, HX
    Wu, XC
    Wang, W
    Shi, BL
    FIFTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2005, : 474 - 481
  • [6] Mining Concept-Drifting Data Streams Containing Labeled and Unlabeled Instances
    Borchani, Hanen
    Larranaga, Pedro
    Bielza, Concha
    TRENDS IN APPLIED INTELLIGENT SYSTEMS, PT I, PROCEEDINGS, 2010, 6096 : 531 - 540
  • [7] Mining Concept-Drifting and Noisy Data Streams using Ensemble Classifiers
    Ouyang, Zhenzheng
    Zhou, Min
    Wang, Tao
    Wu, Quanyuan
    2009 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, VOL IV, PROCEEDINGS, 2009, : 360 - +
  • [8] A general framework for mining concept-drifting data streams with evolvable features
    Peng, Jiaqi
    Guo, Jinxia
    Yang, Qinli
    Lu, Jianyun
    Shao, Junming
    2021 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2021), 2021, : 1276 - 1281
  • [9] A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions
    Gao, Jing
    Fan, Wei
    Han, Jiawei
    Yu, Philip S.
    PROCEEDINGS OF THE SEVENTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2007, : 3 - +
  • [10] An Algorithm for Anticipating Future Decision Trees from Concept-Drifting Data
    Boettcher, Mirko
    Spott, Martin
    Kruse, Rudolf
    RESEARCH AND DEVELOPMENT IN INTELLIGENT SYSTEMS XXV, 2009, : 293 - +