Continuous Monitoring of Distributed Data Streams over a Time-Based Sliding Window

Cited by: 2
Authors
Ho-Leung Chan
Tak-Wah Lam
Lap-Kei Lee
Hing-Fung Ting
Affiliations
[1] University of Hong Kong, Department of Computer Science
[2] Max-Planck-Institut für Informatik
Source
Algorithmica | 2012 / Vol. 62
Keywords
Algorithms; Distributed data streams; Communication; Frequent items; Quantiles
DOI
Not available
Abstract
In this paper we extend the study of algorithms for monitoring distributed data streams from whole data streams to a time-based sliding window. The concern is how to minimize the communication between individual streams and the root, while allowing the root, at any time, to report the global statistics of all streams within a given error bound. This paper presents communication-efficient algorithms for three classical statistics, namely, basic counting, frequent items and quantiles. The worst-case communication cost over a window is $O(\frac{k}{\varepsilon} \log\frac{\varepsilon N}{k})$ bits for basic counting, $O(\frac{k}{\varepsilon} \log\frac{N}{k})$ words for frequent items and $O(\frac{k}{\varepsilon^{2}} \log\frac{N}{k})$ words for quantiles, where $k$ is the number of distributed data streams, $N$ is the total number of items in the streams that arrive or expire in the window, and $\varepsilon < 1$ is the given error bound. The performance of our algorithms matches and nearly matches the corresponding lower bounds. We also show how to generalize these results to streams with out-of-order data.
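To get a feel for how the three bounds in the abstract scale, the following is a minimal sketch that evaluates the expressions inside the O(·) for given k, ε and N. It is not the paper's algorithm: the function name `bound_terms`, the choice of hidden constant 1, and the base-2 logarithm are all assumptions made for illustration, and the basic-counting figure is in bits while the other two are in words, so the values are not directly comparable across units.

```python
import math


def bound_terms(k: int, eps: float, n: int) -> dict:
    """Evaluate the expressions inside the O(.) bounds quoted in the abstract.

    Assumptions (not from the paper): the hidden constant is taken as 1
    and the logarithm is base 2. `bound_terms` is a hypothetical helper.
    """
    if k <= 0 or n <= 0:
        raise ValueError("k and n must be positive")
    if not 0 < eps < 1:
        raise ValueError("eps must satisfy 0 < eps < 1")
    if eps * n <= k:
        raise ValueError("need eps * n > k so that every logarithm is positive")
    return {
        # (k / eps) * log(eps * N / k), measured in bits
        "basic_counting_bits": (k / eps) * math.log2(eps * n / k),
        # (k / eps) * log(N / k), measured in words
        "frequent_items_words": (k / eps) * math.log2(n / k),
        # (k / eps^2) * log(N / k), measured in words
        "quantiles_words": (k / eps**2) * math.log2(n / k),
    }


if __name__ == "__main__":
    # Example values chosen only for illustration.
    for name, value in bound_terms(k=10, eps=0.1, n=10**6).items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions the quantiles term is exactly 1/ε times the frequent-items term, which reflects the extra factor of 1/ε in the stated bound.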
Pages: 1088 - 1111
Page count: 23
Related papers
50 in total
  • [1] CONTINUOUS MONITORING OF DISTRIBUTED DATA STREAMS OVER A TIME-BASED SLIDING WINDOW
    Chan, Ho-Leung
    Lam, Tak-Wah
    Lee, Lap-Kei
    Ting, Hing-Fung
    27TH INTERNATIONAL SYMPOSIUM ON THEORETICAL ASPECTS OF COMPUTER SCIENCE (STACS 2010), 2010, 5 : 179 - 190
  • [2] Continuous Monitoring of Distributed Data Streams over a Time-Based Sliding Window
    Chan, Ho-Leung
    Lam, Tak-Wah
    Lee, Lap-Kei
    Ting, Hing-Fung
    ALGORITHMICA, 2012, 62 (3-4) : 1088 - 1111
  • [3] Sliding Window Top-K Monitoring over Distributed Data Streams
    Lv, Zhijin
    Chen, Ben
    Yu, Xiaohui
    WEB AND BIG DATA, APWEB-WAIM 2017, PT I, 2017, 10366 : 527 - 540
  • [4] Sliding Window Top-K Monitoring over Distributed Data Streams
    Chen B.
    Lv Z.
    Yu X.
    Liu Y.
    Data Science and Engineering, 2017, 2 (4) : 289 - 300
  • [5] GDSW: A General Framework for Distributed Sliding Window over Data Streams
    Chen, Huan
    Wang, Yijie
    Wang, Yuan
    Ma, Xingkong
    2016 IEEE 22ND INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2016, : 729 - 736
  • [6] Semantics and Implementation of Continuous Sliding Window Queries over Data Streams
    Kraemer, Juergen
    Seeger, Bernhard
    ACM TRANSACTIONS ON DATABASE SYSTEMS, 2009, 34 (01):
  • [7] Continuous Skyline Monitoring over Distributed Data Streams
    Lu, Hua
    Zhou, Yongluan
    Haustad, Jonas
    SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, 2010, 6187 : 565 - +
  • [8] Tracking Distributed Aggregates over Time-Based Sliding Windows
    Cormode, Graham
    Yi, Ke
    SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, SSDBM 2012, 2012, 7338 : 416 - 430
  • [9] Processing sliding window join aggregate in continuous queries over data streams
    Wang, WP
    Li, JZ
    Zhang, DD
    Guo, LJ
    ADVANCES IN DATABASES AND INFORMATION SYSTEMS, PROCEEDINGS, 2004, 3255 : 348 - 363
  • [10] A Dynamic Weighted Random Sampling Algorithm on Time-based Sliding Window over Data Stream
    Tang, Da
    Liu, Xiang
    Yue, Qianjin
    2011 INTERNATIONAL CONFERENCE ON COMPUTER, ELECTRICAL, AND SYSTEMS SCIENCES, AND ENGINEERING (CESSE 2011), 2011, : 23 - +