Data stream treatment using sliding windows with MapReduce

Authors
Jose Basgall, Maria [1, 2]
Hasperue, Waldo [2 ]
Naiouf, Marcelo [2 ]
Affiliations
[1] UNLP, CONICET, III LIDI, La Plata, Buenos Aires, Argentina
[2] Univ Nacl La Plata, Fac Informat, Inst Invest Informat III LIDI, La Plata, Buenos Aires, Argentina
Source
Keywords
Big Data; MapReduce; Stream Processing
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Knowledge Discovery in Databases (KDD) techniques are limited when the volume of data to process is very large. Any KDD algorithm needs several iterations over the complete data set to carry out its work. For continuous data stream processing, part of the stream must therefore be stored in a temporal window. In this paper, we present a technique that adjusts the size of the temporal window dynamically, based on the frequency of data arrival and the response time of the KDD task. The results show that this technique reaches a large window size in which each example of the stream is used in more than one iteration of the KDD task.
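
The abstract does not state how the window size is computed, so the following Python sketch is only one possible reading of it, not the authors' method. It assumes a hypothetical sizing rule: the window should hold the examples that arrive during `min_uses` consecutive runs of the KDD task, so that each example takes part in more than one run. The names `target_window_size`, `DynamicWindow`, `min_uses`, and `floor` are illustrative and do not come from the paper, and the MapReduce execution of the task itself is not modelled.

```python
import math
from collections import deque


def target_window_size(arrival_rate, response_time, min_uses=2, floor=1):
    """Hypothetical sizing rule (an assumption, not taken from the paper).

    arrival_rate: examples arriving per second.
    response_time: seconds that one run of the KDD task takes.
    min_uses: number of task runs each example should take part in.
    floor: smallest window size that is ever returned.
    """
    # While one task run executes, about arrival_rate * response_time new
    # examples arrive. A window min_uses times that large keeps an example
    # for roughly min_uses consecutive runs before it is evicted.
    return max(floor, math.ceil(arrival_rate * response_time * min_uses))


class DynamicWindow:
    """Sliding window whose capacity can be changed between task runs."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)

    def push(self, example):
        # A deque with maxlen drops its oldest element when it is full.
        self.items.append(example)

    def resize(self, capacity):
        # Rebuild the deque; when shrinking, only the newest examples remain.
        self.items = deque(self.items, maxlen=capacity)
```

In use, a driver loop would measure the arrival rate and the duration of the last task run, call `target_window_size` with those measurements, and pass the result to `DynamicWindow.resize` before the next run. Under this rule, a faster stream or a slower task both enlarge the window.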
Pages: 76-83
Number of pages: 8