An Improved Algorithm of Decision Trees for Streaming Data Based on VFDT

Cited by: 3
Authors
Li, Feixiong [1]
Liu, Quan [1]
Affiliation
[1] Soochow Univ, Prov Key Lab Comp Informat Proc Technol, Suzhou 215006, Peoples R China
Keywords
Streaming Data Mining; Decision Trees; Unequal Interval Numerical Pruning (UINP); Naive Bayes Classifiers
DOI
10.1109/ISISE.2008.256
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Decision trees are an effective model for classification. Recently, there has been much interest in mining streaming data. Because streaming data is large and unbounded, passing over the entire data set more than once is impractical, so a one-pass online algorithm is necessary. One of the most successful algorithms for mining data streams is VFDT (Very Fast Decision Tree). We extend the VFDT system to EVFDT (Efficient-VFDT) in two directions: (1) We present the Uneven Interval Numerical Pruning (UINP) approach for efficiently processing numerical attributes. (2) We use naive Bayes classifiers associated with the nodes to process the samples, detect outlying samples, and reduce the size of the trees. In the experimental comparison, the two techniques significantly improve the efficiency and accuracy of decision tree construction on streaming data.
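For readers unfamiliar with VFDT, the following is a minimal sketch of the Hoeffding-bound split test that the original VFDT algorithm uses to decide, in one pass, when a leaf has seen enough samples to split. It is not taken from this paper and does not cover EVFDT, UINP, or the naive Bayes node classifiers; the function names `hoeffding_bound` and `should_split` and the example values are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean
    of a variable with the given range lies within epsilon of the mean
    observed over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 n_classes: int, delta: float, n: int) -> bool:
    """Hypothetical helper: split a leaf when the gap between the two best
    attributes' information gains exceeds the Hoeffding bound."""
    # Information gain is bounded by log2(number of classes).
    value_range = math.log2(n_classes)
    epsilon = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > epsilon


if __name__ == "__main__":
    # Illustrative values only (not results from the paper).
    for n in (100, 1000, 10000):
        eps = hoeffding_bound(1.0, 1e-7, n)
        print(n, round(eps, 4), should_split(0.30, 0.25, 2, 1e-7, n))
```

The bound shrinks as more samples arrive at a leaf, so a small but consistent gap between the two best attributes eventually justifies a split without revisiting earlier data.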
Pages: 597-600
Page count: 4
Related papers
50 in total
  • [1] An Improved Error-Based Pruning Algorithm of Decision Trees on Large Data Sets
    Peng, Yi
    Lu, Yu-Tong
    Chen, Zhi-Guang
    2021 IEEE 6TH INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS (ICBDA 2021), 2021, : 33 - 37
  • [2] VERY FAST DECISION TREE (VFDT) ALGORITHM ON HADOOP
    Desai, Sharmishta
    Roy, Sourav
    Patel, Brina
    Purandare, Samruddhi
    Kucheria, Minal
    2016 INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION CONTROL AND AUTOMATION (ICCUBEA), 2016,
  • [3] Incremental Learning of Fuzzy Decision Trees for Streaming Data Classification
    Pecori, Riccardo
    Ducange, Pietro
    Marcelloni, Francesco
    PROCEEDINGS OF THE 11TH CONFERENCE OF THE EUROPEAN SOCIETY FOR FUZZY LOGIC AND TECHNOLOGY (EUSFLAT 2019), 2019, 1 : 748 - 755
  • [4] Generating Decision Trees Method Based on Improved ID3 Algorithm
    Yang Ming
    Guo Shuxu
    Wang Jun
    CHINA COMMUNICATIONS, 2011, 8 (05) : 151 - 156
  • [5] Data mining algorithm selection: decision trees
    Cormier-Chisholm, J
    Sebastian, C
    OIL & GAS JOURNAL, 2003, 101 (04) : 34 - 38
  • [6] Confidence Decision Trees via Online and Active Learning for Streaming Data
    De Rosa, Rocco
    Cesa-Bianchi, Nicolo
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2017, 60 : 1031 - 1055
  • [7] Streaming Decision Trees for Lifelong Learning
    Korycki, Lukasz
    Krawczyk, Bartosz
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, 2021, 12975 : 502 - 518
  • [8] Interval forecasts based on regression trees for streaming data
    Xin Zhao
    Stuart Barber
    Charles C. Taylor
    Zoka Milan
    Advances in Data Analysis and Classification, 2021, 15 : 5 - 36
  • [9] Interval forecasts based on regression trees for streaming data
    Zhao, Xin
    Barber, Stuart
    Taylor, Charles C.
    Milan, Zoka
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2021, 15 (01) : 5 - 36
  • [10] A labeling algorithm based on a forest of decision trees
    T. Chabardès
    P. Dokládal
    M. Bilodeau
    Journal of Real-Time Image Processing, 2020, 17 : 1527 - 1545