Maintaining stream statistics over sliding windows

Cited by: 255
Authors
Datar, M [1]
Gionis, A
Indyk, P
Motwani, R
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[2] MIT, Comp Sci Lab, Cambridge, MA 02139 USA
Keywords
statistics; data streams; sliding windows; approximation algorithms
DOI
10.1137/S0097539701398363
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last $N$ data elements seen so far. We refer to this model as the sliding window model. We consider the following basic problem: given a stream of bits, maintain a count of the number of 1s in the last $N$ elements seen from the stream. We show that, using $O(\frac{1}{\epsilon} \log^2 N)$ bits of memory, we can estimate the number of 1s to within a factor of $1 + \epsilon$. We also give a matching lower bound of $\Omega(\frac{1}{\epsilon} \log^2 N)$ memory bits for any deterministic or randomized algorithm. We extend our scheme to maintain the sum of the last $N$ positive integers and provide matching upper and lower bounds for this more general problem as well. We also show how to efficiently compute the $L_p$ norms ($p \in [1, 2]$) of vectors in the sliding window model using our techniques. Using our algorithm, one can adapt many other techniques to work for the sliding window model with a multiplicative overhead of $O(\frac{1}{\epsilon} \log N)$ in memory and a $1 + \epsilon$ factor loss in accuracy. These include maintaining approximate histograms, hash tables, and statistics or aggregates such as sum and averages.
Pages: 1794-1813
Page count: 20
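The basic problem in the abstract (approximately counting the 1s among the last N bits) can be illustrated with a small bucket-based sketch. The abstract does not describe the data structure, so everything below is an assumption made for illustration: the class name `SlidingBitCounter`, the rule of keeping at most `ceil(1/(2*eps)) + 1` buckets per size, and the midpoint estimate are a simplified variant of the usual power-of-two bucket scheme, not necessarily the authors' exact construction. It also favors clarity over the bit-level memory bound quoted above.

```python
import math
from collections import deque


class SlidingBitCounter:
    """Approximate count of 1s in the last `window` bits (illustrative sketch)."""

    def __init__(self, window, eps):
        self.window = window
        # Assumed parameter choice: allow at most this many buckets per size.
        self.max_per_size = math.ceil(1 / (2 * eps)) + 1
        # Each bucket is (timestamp of its most recent 1, number of 1s it holds).
        # Newest bucket is on the left; sizes are powers of two.
        self.buckets = deque()
        self.now = 0

    def add(self, bit):
        self.now += 1
        # Drop buckets whose most recent 1 has left the window.
        while self.buckets and self.buckets[-1][0] <= self.now - self.window:
            self.buckets.pop()
        if bit:
            self.buckets.appendleft((self.now, 1))
            self._merge()

    def _merge(self):
        b = list(self.buckets)  # O(#buckets) copy; acceptable for a sketch
        i = 0
        while i < len(b):
            size = b[i][1]
            j = i
            while j < len(b) and b[j][1] == size:
                j += 1
            if j - i <= self.max_per_size:
                break  # no overflow at this size, so nothing cascades further
            # Merge the two oldest buckets of this size; keep the newer timestamp.
            b[j - 2:j] = [(b[j - 2][0], 2 * size)]
            i = j - 2  # the merged bucket may overflow the next size
        self.buckets = deque(b)

    def estimate(self):
        if not self.buckets:
            return 0.0
        total = sum(size for _, size in self.buckets)
        oldest = self.buckets[-1][1]
        # Between 1 and `oldest` of the oldest bucket's 1s are still in the
        # window, so take the midpoint of that range.
        return total - (oldest - 1) / 2


if __name__ == "__main__":
    counter = SlidingBitCounter(window=1000, eps=0.1)
    for i in range(5000):
        counter.add(1 if i % 3 else 0)
    print(counter.estimate())
```

Only the oldest bucket can straddle the window boundary, so it is the only source of error. Keeping several buckets of every smaller size bounds that error relative to the true count, which is the intuition behind the 1 + epsilon guarantee stated in the abstract.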