Performance analysis of Hoeffding trees in data streams by using massive online analysis framework

Cited by: 8
Authors
Srimani, P. K. [1]
Patil, Malini M. [2]
Affiliations
[1] Bangalore Univ, R&D Div, Bangalore 560056, Karnataka, India
[2] JSS Acad Tech Educ, Dept Informat Sci & Engn, Uttaralli Kengeri Main Rd, Bangalore 560060, Karnataka, India
Keywords
data mining; data streams; static streams; evolving streams; Hoeffding trees; classification; supervised learning; massive online analysis; MOA; framework; massive data mining; MDM; dataset generators
DOI
10.1504/IJDMMM.2015.073865
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The present work examines the problem of classification from a data stream perspective, on evolving streams, using the massive online analysis (MOA) framework with different Hoeffding trees. Advances in both hardware and software have led to the rapid storage of data in huge volumes. Such data is referred to as a data stream. Traditional data mining methods cannot handle data streams because of their ubiquitous nature. The challenge is how to store, analyse and visualise such large volumes of data, and massive data mining is a solution to these challenges. In the present analysis, five different Hoeffding trees are applied to the eight dataset generators available in the MOA framework, and the results indicate that the STAGGER generator is the best performer across the different classifiers.
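As background to the Hoeffding trees named in the abstract, the split test they rely on can be sketched in a few lines. This is a minimal illustration of the standard Hoeffding bound, not the authors' experimental setup and not MOA's implementation; the function names `hoeffding_bound` and `should_split`, and the values chosen for `delta` and `n`, are assumptions made for the example.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon = sqrt(R^2 * ln(1/delta) / (2 * n)).

    With probability 1 - delta, the true mean of a variable with range R
    lies within epsilon of the mean observed over n independent samples.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split a leaf once the best attribute beats the runner-up by more than epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Illustrative values only: gains in [0, 1] (R = 1), delta = 1e-7.
    for n in (200, 1000):
        eps = hoeffding_bound(1.0, 1e-7, n)
        print(n, round(eps, 4), should_split(0.30, 0.15, 1.0, 1e-7, n))
```

The bound shrinks as more instances reach a leaf, so a gap between two attributes that is too small to act on after 200 instances can justify a split after 1000. This is what lets such a tree learn from a stream without storing it.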
Pages: 293-313
Page count: 21