Efficiently Summarizing Data Streams over Sliding Windows

Cited by: 9
Authors
Rivetti, Nicolo [1]
Busnel, Yann [2]
Mostefaoui, Achour [1]
Affiliations
[1] Univ Nantes, LINA, Nantes, France
[2] Inria, Crest Ensai, Rennes, France
Keywords
DOI
10.1109/NCA.2015.46
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Estimating the frequency of any piece of information in large-scale distributed data streams has become of utmost importance in the last decade (e.g., in the context of network monitoring and big data). Although some elegant solutions have been proposed recently, their approximations are computed from the inception of the stream. In a runtime distributed context, one would prefer to gather information only about the recent past, either to save resources or because recent information is more relevant. In this paper, we consider the sliding window model and propose two different (on-line) algorithms that approximate item frequencies in the active window. More precisely, we determine an (epsilon, delta)-additive approximation, meaning that the error is greater than epsilon with probability at most delta. These solutions use a very small amount of memory with respect to the size N of the window and the number n of distinct items in the stream, namely O((1/epsilon) log(1/delta) (log N + log n)) and O((1/(tau epsilon)) log(1/delta) (log N + log n)) bits of space, where tau is a parameter limiting memory usage. We also provide their distributed variants, i.e., in the sliding window functional monitoring model. We compared the proposed algorithms to each other and to the state of the art through extensive experiments on synthetic traces and real data sets, which validate the robustness and accuracy of our algorithms.
Pages: 151-158
Page count: 8
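To make the (epsilon, delta) guarantee in the abstract concrete, here is a minimal Python sketch of a standard Count-Min sketch restricted to the last N items. It is not either of the two algorithms proposed in the paper: it keeps an explicit buffer of the active window, so it needs O(N) memory rather than the logarithmic space stated above. The class name `WindowedCountMin`, its parameters, and the use of salted BLAKE2b as a stand-in for pairwise-independent hash functions are all assumptions made for illustration. With width ceil(e/epsilon) and depth ceil(ln(1/delta)), the standard Count-Min analysis says an estimate exceeds the true window frequency by more than epsilon * N with probability at most delta, and it never underestimates.

```python
import hashlib
import math
from collections import deque


class WindowedCountMin:
    """Count-Min sketch over the last `window` items (illustrative only)."""

    def __init__(self, epsilon, delta, window):
        # Columns control the additive error, rows the failure probability.
        self.width = math.ceil(math.e / epsilon)
        self.depth = max(1, math.ceil(math.log(1 / delta)))
        self.window = window
        self.table = [[0] * self.width for _ in range(self.depth)]
        # Explicit copy of the active window: O(N) memory, unlike the paper.
        self.recent = deque()

    def _cells(self, item):
        # One column per row, chosen by a hash salted with the row index.
        data = repr(item).encode()
        for row in range(self.depth):
            salt = row.to_bytes(8, "little")
            digest = hashlib.blake2b(data, salt=salt).digest()
            yield row, int.from_bytes(digest[:8], "little") % self.width

    def update(self, item):
        # Count the new item, then uncount the one that left the window.
        for row, col in self._cells(item):
            self.table[row][col] += 1
        self.recent.append(item)
        if len(self.recent) > self.window:
            expired = self.recent.popleft()
            for row, col in self._cells(expired):
                self.table[row][col] -= 1

    def estimate(self, item):
        # The minimum over rows is the least-collided counter.
        return min(self.table[row][col] for row, col in self._cells(item))
```

A caller would feed every stream item to `update` and query `estimate(item)` at any time; the answer reflects only the most recent `window` items.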