Maintaining stream statistics over sliding windows

Cited by: 255
Authors
Datar, M [1]
Gionis, A
Indyk, P
Motwani, R
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[2] MIT, Comp Sci Lab, Cambridge, MA 02139 USA
Keywords
statistics; data streams; sliding windows; approximation algorithms
DOI
10.1137/S0097539701398363
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last $N$ data elements seen so far. We refer to this model as the sliding window model. We consider the following basic problem: Given a stream of bits, maintain a count of the number of 1's in the last $N$ elements seen from the stream. We show that, using $O(\frac{1}{\epsilon} \log^2 N)$ bits of memory, we can estimate the number of 1's to within a factor of $1 + \epsilon$. We also give a matching lower bound of $\Omega(\frac{1}{\epsilon} \log^2 N)$ memory bits for any deterministic or randomized algorithms. We extend our scheme to maintain the sum of the last $N$ positive integers and provide matching upper and lower bounds for this more general problem as well. We also show how to efficiently compute the $L_p$ norms ($p \in [1, 2]$) of vectors in the sliding window model using our techniques. Using our algorithm, one can adapt many other techniques to work for the sliding window model with a multiplicative overhead of $O(\frac{1}{\epsilon} \log N)$ in memory and a $1 + \epsilon$ factor loss in accuracy. These include maintaining approximate histograms, hash tables, and statistics or aggregates such as sum and averages.
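The abstract states the bit-counting problem but not the data structure. As a rough illustration only, the following Python sketch uses a simplified exponential-histogram scheme: 1's are grouped into buckets whose sizes are powers of two, each bucket remembers the arrival time of its newest 1, and the two oldest buckets of a size are merged whenever too many of that size accumulate. The class name `ExponentialHistogram`, the parameter `max_per_size = ceil(1/epsilon) + 1`, and the merge rule are assumptions made for this sketch and may differ from the exact construction and constants in the paper.

```python
import math


class ExponentialHistogram:
    """Approximate count of 1s among the last `window` bits of a stream.

    Simplified sketch; parameter choices are assumptions, not the paper's.
    """

    def __init__(self, window: int, epsilon: float):
        self.window = window
        # Assumed rule: keep at most ceil(1/epsilon) + 1 buckets of each size.
        self.max_per_size = math.ceil(1 / epsilon) + 1
        self.time = 0
        # Each bucket is [arrival_time_of_its_newest_1, size]; oldest first.
        self.buckets = []

    def add(self, bit: int) -> None:
        self.time += 1
        # Drop buckets whose newest 1 has left the window.
        while self.buckets and self.buckets[0][0] <= self.time - self.window:
            self.buckets.pop(0)
        if not bit:
            return
        self.buckets.append([self.time, 1])
        # Cascade merges from the smallest size upward.
        size, end = 1, len(self.buckets)
        while True:
            start = end
            while start > 0 and self.buckets[start - 1][1] == size:
                start -= 1
            if end - start <= self.max_per_size:
                break
            # Merge the two oldest buckets of this size into one of double
            # size, keeping the timestamp of the newer of the two.
            merged = [self.buckets[start + 1][0], 2 * size]
            self.buckets[start:start + 2] = [merged]
            size, end = 2 * size, start + 1

    def estimate(self) -> int:
        if not self.buckets:
            return 0
        total = sum(size for _, size in self.buckets)
        # Only part of the oldest bucket may still be inside the window,
        # so count half of it.
        return total - self.buckets[0][1] // 2


if __name__ == "__main__":
    eh = ExponentialHistogram(window=1000, epsilon=0.1)
    for i in range(5000):
        eh.add(1 if i % 3 == 0 else 0)
    print(eh.estimate())
```

Under these assumptions the structure keeps on the order of $\frac{1}{\epsilon} \log N$ buckets, each carrying a timestamp of about $\log N$ bits, which is in line with the $O(\frac{1}{\epsilon} \log^2 N)$ memory bound quoted in the abstract. Only the oldest bucket can straddle the window boundary, which is why the estimate counts half of it.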
Pages: 1794-1813
Page count: 20