Efficiently Summarizing Data Streams over Sliding Windows

Cited by: 9
Authors
Rivetti, Nicolo [1]
Busnel, Yann [2]
Mostefaoui, Achour [1]
Affiliations
[1] Univ Nantes, LINA, Nantes, France
[2] Inria, Crest Ensai, Rennes, France
DOI
10.1109/NCA.2015.46
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Estimating the frequency of any piece of information in large-scale distributed data streams has become of utmost importance in the last decade (e.g., in network monitoring and big data). Although some elegant solutions have been proposed recently, their approximation is computed from the inception of the stream. In a runtime distributed context, one would prefer to gather information only about the recent past, either to save resources or because recent information is more relevant. In this paper, we consider the sliding window model and propose two different (on-line) algorithms that approximate item frequencies in the active window. More precisely, we determine an (epsilon, delta)-additive-approximation, meaning that the error is greater than epsilon only with probability at most delta. These solutions use a very small amount of memory with respect to the size N of the window and the number n of distinct items of the stream, namely O((1/epsilon) log(1/delta) (log N + log n)) and O((1/(tau epsilon)) log(1/delta) (log N + log n)) bits of space, where tau is a parameter limiting memory usage. We also provide their distributed variants, i.e., for the sliding window functional monitoring model. We compare the proposed algorithms to each other and to the state of the art through extensive experiments on synthetic traces and real data sets, which validate the robustness and accuracy of our algorithms.
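To make the problem statement concrete, the following is a minimal sketch of windowed frequency estimation built on a standard Count-Min sketch. It is not one of the two algorithms proposed in the paper: this naive baseline keeps the whole window in a buffer so that expired items can be subtracted, which costs memory linear in N and is exactly the overhead the paper's algorithms are designed to avoid. The class name `NaiveWindowedCountMin`, its parameters, and the sizing rule (width e/epsilon, depth ln(1/delta), the usual Count-Min choice) are assumptions made for illustration.

```python
import math
import random
from collections import deque


class NaiveWindowedCountMin:
    """Count-Min sketch restricted to the last `window` items.

    Hypothetical baseline, not the paper's method: the window is stored
    verbatim in a deque so expired items can be subtracted (O(N) memory).
    """

    PRIME = (1 << 61) - 1  # modulus for the 2-universal hash family

    def __init__(self, epsilon, delta, window, seed=0):
        # Usual Count-Min sizing (assumed here): columns bound the additive
        # error, rows bound the failure probability.
        self.width = math.ceil(math.e / epsilon)
        self.depth = max(1, math.ceil(math.log(1 / delta)))
        self.window = window
        self.recent = deque()  # explicit copy of the active window
        self.table = [[0] * self.width for _ in range(self.depth)]
        rng = random.Random(seed)
        self.coeffs = [(rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
                       for _ in range(self.depth)]

    def _columns(self, item):
        # One column per row, using h(x) = ((a*x + b) mod p) mod width.
        x = hash(item) % self.PRIME
        return [((a * x + b) % self.PRIME) % self.width for a, b in self.coeffs]

    def _add(self, item, amount):
        for row, col in enumerate(self._columns(item)):
            self.table[row][col] += amount

    def update(self, item):
        # Count the new item, then subtract the one that left the window.
        self.recent.append(item)
        self._add(item, 1)
        if len(self.recent) > self.window:
            self._add(self.recent.popleft(), -1)

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum over the rows
        # never underestimates the frequency inside the window.
        return min(self.table[row][col]
                   for row, col in enumerate(self._columns(item)))


sketch = NaiveWindowedCountMin(epsilon=0.01, delta=0.01, window=100)
for i in range(1000):
    sketch.update(i % 7)
print(sketch.estimate(3))  # estimated occurrences of 3 among the last 100 items
```

Because every expired item is subtracted exactly, the table always equals the Count-Min sketch of the current window contents, so an estimate can exceed the true windowed frequency but cannot fall below it.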
Pages: 151-158
Number of pages: 8