Mining Uncertain Data Streams Using Clustering Feature Decision Trees

Authors
Xu, Wenhua [1]
Qin, Zheng [2]
Hu, Hao [2]
Zhao, Nan [2]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Sch Software, Beijing 100084, Peoples R China
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Over the last decade, classification from data streams has relied on deterministic learning algorithms that learn from precise and complete data. However, many practical applications supply only approximate measurements, usually with estimates of the measurement errors available. Developing highly efficient algorithms that handle uncertain examples has therefore emerged as a new direction. In this paper, we build a CFDTu model from data streams with uncertain attribute values. CFDTu applies an uncertain clustering algorithm that scans the data stream only once to obtain sufficient statistical summaries. The statistics are stored in Clustering Feature vectors and are used for incremental decision tree induction. The vectors also serve as classifiers at the leaves to further refine the classification and reinforce the any-time property. Experiments show that CFDTu outperforms a purely deterministic method in terms of accuracy and is highly scalable on uncertain data streams.
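The abstract does not define the Clustering Feature vector, so the following is only a minimal sketch of the general idea, assuming a BIRCH-style summary (count, linear sum, squared sum) extended with a summed squared-error term for uncertain attribute values. The class name `ClusteringFeature`, its fields, and the `total_variance` formula are illustrative assumptions, not the CFDTu definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClusteringFeature:
    """Hypothetical one-pass summary of uncertain examples (not the paper's exact CF)."""
    n: int = 0
    linear_sum: List[float] = field(default_factory=list)  # sum of observed values per attribute
    square_sum: List[float] = field(default_factory=list)  # sum of squared values per attribute
    error_sum: List[float] = field(default_factory=list)   # sum of squared error estimates per attribute

    def add(self, values: List[float], errors: List[float]) -> None:
        """Absorb one example; each example is seen once and then discarded."""
        if self.n == 0:
            d = len(values)
            self.linear_sum = [0.0] * d
            self.square_sum = [0.0] * d
            self.error_sum = [0.0] * d
        for i, (x, e) in enumerate(zip(values, errors)):
            self.linear_sum[i] += x
            self.square_sum[i] += x * x
            self.error_sum[i] += e * e
        self.n += 1

    def centroid(self) -> List[float]:
        """Mean of the observed values per attribute."""
        return [s / self.n for s in self.linear_sum]

    def total_variance(self) -> List[float]:
        """Assumed spread measure: variance of observations plus mean squared error."""
        return [
            sq / self.n - (ls / self.n) ** 2 + er / self.n
            for ls, sq, er in zip(self.linear_sum, self.square_sum, self.error_sum)
        ]


cf = ClusteringFeature()
cf.add([1.0, 2.0], [0.1, 0.2])
cf.add([3.0, 4.0], [0.1, 0.2])
print(cf.n, cf.centroid())
```

Because every field is a running sum, two such summaries could be merged by adding their fields, which is what makes this kind of structure suitable for single-scan, incremental use.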
Pages: 195 / +
Page count: 3