An Improved Algorithm of Decision Trees for Streaming Data Based on VFDT

Cited by: 3
Authors
Li, Feixiong [1 ]
Liu, Quan [1 ]
Affiliation
[1] Soochow Univ, Prov Key Lab Comp Informat Proc Technol, Suzhou 215006, Peoples R China
Keywords
Streaming Data Mining; Decision Trees; Unequal Interval Numerical Pruning (UINP); Naive Bayes Classifiers
DOI
10.1109/ISISE.2008.256
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Decision trees are a good model for classification. Recently, there has been much interest in mining streaming data. Because streaming data is large and unbounded, it is impractical to pass over the entire data more than once; a one-pass online algorithm is necessary. One of the most successful algorithms for mining data streams is VFDT (Very Fast Decision Tree). We extend the VFDT system to EVFDT (Efficient-VFDT) in two directions: (1) We present the Uneven Interval Numerical Pruning (UINP) approach for efficiently processing numerical attributes. (2) We use naive Bayes classifiers associated with the nodes to process the samples, detecting outlying samples and reducing the scale of the trees. Experimental comparison shows that the two techniques significantly improve the efficiency and the accuracy of decision tree construction on streaming data.
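For context, the abstract does not spell out how VFDT decides when to split a leaf after seeing each sample only once. In the original VFDT formulation this is done with the Hoeffding bound; the sketch below illustrates that general idea only and is not the EVFDT method of this paper (UINP and the naive Bayes step are not described in enough detail here to reproduce). The function names, the default `tie_threshold`, and the example gain values are all hypothetical.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: with probability 1 - delta, the true mean of a variable
    with the given range lies within this distance of the mean of n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n_classes: int,
                 delta: float, n: int, tie_threshold: float = 0.05) -> bool:
    """Decide whether a leaf that has seen n samples can be split.

    Assumes information gain as the split measure, whose range is
    log2(n_classes). tie_threshold is an illustrative default, not a
    value taken from the paper.
    """
    value_range = math.log2(n_classes)
    eps = hoeffding_bound(value_range, delta, n)
    # Split when the best attribute is confidently better than the runner-up,
    # or when the two are so close that waiting longer is not worthwhile.
    return (best_gain - second_gain > eps) or (eps < tie_threshold)


if __name__ == "__main__":
    # Hypothetical gains observed at a leaf after 1000 samples, two classes.
    print(should_split(0.30, 0.10, n_classes=2, delta=1e-7, n=1000))
```

Because the bound shrinks as more samples arrive, a leaf that cannot be split yet simply keeps accumulating statistics, which is what makes a single pass over the stream sufficient.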
Pages: 597-600
Number of pages: 4
Related papers
50 in total
  • [21] An improved clustering algorithm for minimum spanning trees in multidimensional data
    Xie, Zhi-Qiang
    Yu, Liang
    Yang, Jing
    Harbin Gongcheng Daxue Xuebao/Journal of Harbin Engineering University, 2008, 29 (08) : 851 - 857
  • [22] Multiple Early-Termination Scheme for TZSearch Algorithm based on Data Mining and Decision Trees
    Goncalves, Paulo
    Correa, Guilherme
    Porto, Marcelo
    Zatt, Bruno
    Agostini, Luciano
    2017 IEEE 19TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2017,
  • [23] A coverage based ensemble algorithm (CBEA) for streaming data
    Rushing, J
    Graves, S
    Criswell, E
    Lin, A
    ICTAI 2004: 16TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, : 106 - 112
  • [24] A streaming parallel decision tree algorithm
    Ben-Haim, Yael
    Tom-Tov, Elad
    Journal of Machine Learning Research, 2010, 11 : 849 - 872
  • [25] A Streaming Parallel Decision Tree Algorithm
    Ben-Haim, Yael
    Tom-Tov, Elad
    JOURNAL OF MACHINE LEARNING RESEARCH, 2010, 11 : 849 - 872
  • [26] A new online learning algorithm for streaming data and decision support with a Bayesian approach
    Huang, Kai
    Weng, Jiaying
    Wang, Chao
    Li, Mingfei
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2024, 94 (11) : 2483 - 2499
  • [27] Modeling the Functioning of Decision Trees Based on Decision Rule Systems by Greedy Algorithm
    Durdymyradov, Kerven
    Moshkov, Mikhail
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, PT II, ICCCI 2024, 2024, 14811 : 153 - 162
  • [28] A Population Based Algorithm and Fuzzy Decision Trees for Nonlinear Modeling
    Dziwinski, Piotr
    Bartczuk, Lukasz
    Przybyszewski, Krzysztof
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING (ICAISC 2018), PT II, 2018, 10842 : 516 - 531
  • [29] Face Detection Algorithm Based on a Cascade of Ensembles of Decision Trees
    Lebedev, Anton
    Pavlov, Vladimir
    Khryashchev, Vladimir
    Stepanova, Olga
    2016 18TH CONFERENCE OF OPEN INNOVATIONS ASSOCIATION AND SEMINAR ON INFORMATION SECURITY AND PROTECTION OF INFORMATION TECHNOLOGY (FRUCT-ISPIT), 2016, : 161 - 166
  • [30] AN IMPROVED ALGORITHM FOR STEINER TREES
    TRIETSCH, D
    HWANG, F
    SIAM JOURNAL ON APPLIED MATHEMATICS, 1990, 50 (01) : 244 - 263