Distributed Streams Algorithms for Sliding Windows

Authors
Phillip B. Gibbons
Srikanta Tirthapura
Affiliations
[1] Intel Research Pittsburgh, 417 South Craig Street
[2] Department of Electrical and Computer Engineering, Iowa State University
Source
Theory of Computing Systems, 2004, 37 (03)
Keywords
Data Stream; Approximation Scheme; Query Time; Deterministic Algorithm; Single Stream;
DOI
Not available
Abstract
Massive data sets often arise as physically distributed, parallel data streams, and it is important to estimate various aggregates and statistics on the union of these streams. This paper presents algorithms for estimating aggregate functions over a “sliding window” of the N most recent data items in one or more streams. Our results include:
1. For a single stream, we present the first ε-approximation scheme for the number of 1’s in a sliding window that is optimal in both worst case time and space. We also present the first ε-approximation scheme for the sum of integers in [0..R] in a sliding window that is optimal in both worst case time and space (assuming R is at most polynomial in N). Both algorithms are deterministic and use only logarithmic memory words.
2. In contrast, we show that any deterministic algorithm that estimates, to within a small constant relative error, the number of 1’s (or the sum of integers) in a sliding window on the union of distributed streams requires Ω(N) space.
3. We present the first (randomized) (ε, δ)-approximation scheme for the number of 1’s in a sliding window on the union of distributed streams that uses only logarithmic memory words. We also present the first (ε, δ)-approximation scheme for the number of distinct values in a sliding window on distributed streams that uses only logarithmic memory words.
Our results are obtained using a novel family of synopsis data structures called waves.
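The single-stream counting result can be made concrete with a small sketch. The abstract does not describe how a wave is built, so the layout below is an assumption: level i remembers the most recent few 1-bits whose running count is a multiple of 2^i, and a query brackets the window boundary between two remembered 1-bits. The names `WaveCounter`, `add`, and `estimate` are hypothetical, and this simplified version does not attempt the worst-case optimal update time claimed in the paper.

```python
from collections import deque
import math


class WaveCounter:
    """Simplified wave-like synopsis (an illustrative assumption, not the
    paper's exact structure) estimating the number of 1s among the last
    `window` bits to within relative error `eps`."""

    def __init__(self, window, eps):
        self.window = window
        self.k = math.ceil(1 / eps) + 1      # entries kept per level
        # Use enough levels that the top level spans a window full of 1s.
        self.levels = 1
        while ((self.k - 1) << (self.levels - 1)) < window:
            self.levels += 1
        self.pos = 0    # number of bits seen so far
        self.rank = 0   # number of 1s seen so far
        # Each level starts with a sentinel "1" at position 0 with rank 0.
        self.wave = [deque([(0, 0)], maxlen=self.k) for _ in range(self.levels)]

    def add(self, bit):
        self.pos += 1
        if not bit:
            return
        self.rank += 1
        # Record this 1 at every level i whose stride 2^i divides its rank.
        for i, level in enumerate(self.wave):
            if self.rank % (1 << i):
                break
            level.append((self.pos, self.rank))

    def estimate(self):
        start = max(1, self.pos - self.window + 1)   # first position in window
        lo, hi = 0, self.rank + 1
        for level in self.wave:
            for p, r in level:
                if p < start:
                    lo = max(lo, r)   # latest remembered 1 before the window
                else:
                    hi = min(hi, r)   # earliest remembered 1 inside the window
        # The true count lies in [rank - hi + 1, rank - lo]; return the midpoint.
        return self.rank - (lo + hi - 1) / 2
```

Under these assumptions the synopsis keeps at most `k` entries in each of roughly log2(eps * window) levels, so memory grows logarithmically with the window size rather than linearly. Feeding bits through `add` and calling `estimate` at any time returns a value within `eps` times the true count, and the value is exact whenever the two bracketing 1-bits are consecutive.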
Pages: 457 - 478
Page count: 22
Related papers
  • [1] Distributed streams algorithms for sliding windows
    Gibbons, PB
    Tirthapura, S
    [J]. THEORY OF COMPUTING SYSTEMS, 2004, 37 (03) : 457 - 478
  • [2] Random sampling algorithms for sliding windows over data streams
    Zhang, LB
    Li, ZH
    Yu, M
    Wang, Y
    Jiang, Y
    [J]. PROCEEDINGS OF THE 11TH JOINT INTERNATIONAL COMPUTER CONFERENCE, 2005, : 572 - 575
  • [3] Heavy Hitters in Streams and Sliding Windows
    Ben-Basat, Ran
    Einziger, Gil
    Friedman, Roy
    Kassner, Yaron
    [J]. IEEE INFOCOM 2016 - THE 35TH ANNUAL IEEE INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS, 2016,
  • [4] Processing sliding windows over disordered streams
    Kim, Hyeon Gyu
    Kim, Myoung Ho
    [J]. 2008 THE INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING, 2008, : 25 - +
  • [5] Sliding windows over uncertain data streams
    Dallachiesa, Michele
    Jacques-Silva, Gabriela
    Gedik, Bugra
    Wu, Kun-Lung
    Palpanas, Themis
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 45 (01) : 159 - 190
  • [6] Sliding windows over uncertain data streams
    Michele Dallachiesa
    Gabriela Jacques-Silva
    Buğra Gedik
    Kun-Lung Wu
    Themis Palpanas
    [J]. Knowledge and Information Systems, 2015, 45 : 159 - 190
  • [7] Sketching asynchronous data streams over sliding windows
    Bojian Xu
    Srikanta Tirthapura
    Costas Busch
    [J]. Distributed Computing, 2008, 20 : 359 - 374
  • [8] Sketching asynchronous data streams over sliding windows
    Xu, Bojian
    Tirthapura, Srikanta
    Busch, Costas
    [J]. DISTRIBUTED COMPUTING, 2008, 20 (05) : 359 - 374
  • [9] Truly Perfect Samplers for Data Streams and Sliding Windows
    Jayaram, Rajesh
    Woodruff, David P.
    Zhou, Samson
    [J]. Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 2022, : 29 - 40
  • [10] Dynamic adjustment of sliding windows over data streams
    Zhang, DD
    Li, JZ
    Zhang, ZG
    Wang, WP
    Guo, LJ
    [J]. ADVANCES IN WEB-AGE INFORMATION MANAGEMENT: PROCEEDINGS, 2004, 3129 : 24 - 33