Efficiently Summarizing Data Streams over Sliding Windows

Cited by: 9
Authors
Rivetti, Nicolo [1 ]
Busnel, Yann [2 ]
Mostefaoui, Achour [1 ]
Affiliations
[1] Univ Nantes, LINA, Nantes, France
[2] Inria, Crest Ensai, Rennes, France
DOI
10.1109/NCA.2015.46
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Estimating the frequency of any piece of information in large-scale distributed data streams has become of utmost importance in the last decade (e.g., in network monitoring and big data). Although some elegant solutions have been proposed recently, their approximation is computed from the inception of the stream. In a runtime distributed context, one would prefer to gather information only about the recent past, either to save resources or because recent information is more relevant. In this paper, we consider the sliding window model and propose two different (on-line) algorithms that approximate the item frequencies in the active window. More precisely, we determine an (ε, δ)-additive approximation, meaning that the error is greater than ε only with probability at most δ. These solutions use a very small amount of memory with respect to the size N of the window and the number n of distinct items of the stream, namely O((1/ε) log(1/δ) (log N + log n)) and O((1/(τε)) log(1/δ) (log N + log n)) bits of space, where τ is a parameter limiting memory usage. We also provide their distributed variants, i.e., for the sliding window functional monitoring model. We compared the proposed algorithms to each other and to the state of the art through extensive experiments on synthetic traces and real data sets, which validate the robustness and accuracy of our algorithms.
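To make the sliding window setting concrete, the following is a minimal sketch of windowed frequency estimation. It is not either of the two algorithms proposed in the paper, whose details the abstract does not give. It assumes a standard Count-Min sketch (width ⌈e/ε⌉, depth ⌈ln(1/δ)⌉) and a naive block-based window that splits the last N items into a few blocks and drops the oldest block whole. The class name `WindowedCountMin` and the `blocks` and `seed` parameters are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import deque


class WindowedCountMin:
    """Count-Min sketch over an approximate, block-based sliding window.

    Illustrative only: NOT the algorithms of the paper summarized above.
    """

    def __init__(self, epsilon, delta, window, blocks=4, seed=0):
        self.width = math.ceil(math.e / epsilon)               # columns per row
        self.depth = max(1, math.ceil(math.log(1.0 / delta)))  # hash rows
        self.block_size = max(1, window // blocks)             # items per block
        self.blocks = blocks
        rng = random.Random(seed)
        self.salts = [rng.getrandbits(64) for _ in range(self.depth)]
        self.total = self._zeros()          # counters summed over live blocks
        self.live = deque([self._zeros()])  # per-block counters, oldest first
        self.in_current = 0                 # items stored in the newest block

    def _zeros(self):
        return [[0] * self.width for _ in range(self.depth)]

    def _cols(self, item):
        # One column per row, derived from a salted built-in hash.
        return [hash((salt, item)) % self.width for salt in self.salts]

    def update(self, item):
        if self.in_current == self.block_size:
            # Newest block is full: open a fresh one, expire the oldest.
            self.live.append(self._zeros())
            self.in_current = 0
            if len(self.live) > self.blocks:
                expired = self.live.popleft()
                for r in range(self.depth):
                    for c in range(self.width):
                        self.total[r][c] -= expired[r][c]
        for r, c in enumerate(self._cols(item)):
            self.live[-1][r][c] += 1
            self.total[r][c] += 1
        self.in_current += 1

    def estimate(self, item):
        # Minimum over rows: never below the true count in the covered items.
        return min(self.total[r][c] for r, c in enumerate(self._cols(item)))
```

Because whole blocks expire at once, this sketch covers between N - N/blocks + 1 and N of the most recent items rather than exactly N, and it stores `blocks` + 1 counter matrices, so it does not achieve the space bounds stated in the abstract.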
Pages: 151-158
Page count: 8
Related papers
50 entries in total
  • [41] An EM-Based Algorithm for Clustering Data Streams in Sliding Windows
    Dang, Xuan Hong
    Lee, Vincent
    Ng, Wee Keong
    Ciptadi, Arridhang
    Ong, Kok Leong
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS, 2009, 5463 : 230 - +
  • [42] RLC: ranking lag correlations with flexible sliding windows in data streams
    Wu, Shanshan
    Lin, Huaizhong
    Wang, Wenxiang
    Lu, Dongming
    U, Leong Hou
    Gao, Yunjun
    [J]. PATTERN ANALYSIS AND APPLICATIONS, 2017, 20 (02) : 601 - 611
  • [43] STAGGER: Periodicity mining of data streams using expanding sliding windows
    Elfeky, Mohamed G.
    Aref, Walid G.
    Elmagarmid, Ahmed K.
    [J]. ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006, : 188 - +
  • [44] RLC: ranking lag correlations with flexible sliding windows in data streams
    Shanshan Wu
    Huaizhong Lin
    Wenxiang Wang
    Dongming Lu
    Leong Hou U
    Yunjun Gao
    [J]. Pattern Analysis and Applications, 2017, 20 : 601 - 611
  • [45] Distributed Streams Algorithms for Sliding Windows
    Phillip B. Gibbons
    Srikanta Tirthapura
    [J]. Theory of Computing Systems, 2004, 37 : 457 - 478
  • [46] Distributed streams algorithms for sliding windows
    Gibbons, PB
    Tirthapura, S
    [J]. THEORY OF COMPUTING SYSTEMS, 2004, 37 (03) : 457 - 478
  • [47] Heavy Hitters in Streams and Sliding Windows
    Ben-Basat, Ran
    Einziger, Gil
    Friedman, Roy
    Kassner, Yaron
    [J]. IEEE INFOCOM 2016 - THE 35TH ANNUAL IEEE INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS, 2016,
  • [48] Mining Frequent Item Sets in Asynchronous Transactional Data Streams over Time Sensitive Sliding Windows Model
    Javaid, Qaisar
    Memon, Farida
    Talpur, Shahnawaz
    Arif, Muhammad
    Awan, Muhammad Daud
    [J]. MEHRAN UNIVERSITY RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY, 2016, 35 (04) : 625 - 644
  • [49] RETRACTED: Improved Decaying Bloom Filter for Duplicate Detection in Data Streams Over Sliding Windows (Retracted Article)
    Wang, Xiujun
    Shen, Hong
    [J]. ICCSIT 2010 - 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, VOL 4, 2010, : 348 - 353
  • [50] Regular Expression Pattern Matching with Sliding Windows over Probabilistic Event Streams
    Sugiura, Kento
    Ishikawa, Yoshiharu
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2019, : 103 - 110