Stream Aggregation with Compressed Sliding Windows

Cited by: 2
Authors
Geethakumari, Prajith Ramakrishnan [1]
Sourdis, Ioannis [1]
Affiliations
[1] Chalmers Univ Technol, Comp Sci & Engn Dept, Rannvagen 6, S-41296 Gothenburg, Sweden
Funding
Swedish Research Council
Keywords
Compression; dataflow; aggregation; sliding windows; stream processing; SYSTEM;
DOI
10.1145/3590774
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
High performance stream aggregation is critical for many emerging applications that analyze massive volumes of data. Incoming data needs to be stored in a sliding window during processing, in case the aggregation functions cannot be computed incrementally. Updating the window with new incoming values and reading it to feed the aggregation functions are the two primary steps in stream aggregation. Although window updates can be supported efficiently using multi-level queues, frequent window aggregations remain a performance bottleneck as they put tremendous pressure on the memory bandwidth and capacity. This article addresses this problem by enhancing StreamZip, a dataflow stream aggregation engine that is able to compress the sliding windows. StreamZip deals with a number of data and control dependency challenges to integrate a compressor in the stream aggregation pipeline and alleviate the memory pressure posed by frequent aggregations. In addition, StreamZip incorporates a caching mechanism for dealing with skewed-key distributions in the incoming data stream. In doing so, StreamZip offers higher throughput as well as larger effective window capacity to support larger problems. StreamZip supports diverse compression algorithms offering both lossless and lossy compression of integers as well as floating-point numbers. Compared to designs without compression, StreamZip lossless and lossy designs achieve up to 7.5x and 22x higher throughput, while improving the effective memory capacity by up to 5x and 23x, respectively.
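The two steps named in the abstract (updating the window, then reading it to feed a non-incremental aggregation) can be illustrated with a small software sketch. This is not StreamZip's design, which the abstract describes as a dataflow engine; it is a toy Python model under stated assumptions: integer values, a hypothetical block size `BLOCK`, and delta plus variable-length-integer encoding standing in for whichever lossless compressor is used. All names (`CompressedWindow`, `compress`, `decompress`) are invented for illustration.

```python
from collections import deque
from statistics import median

BLOCK = 4  # hypothetical number of values per compressed block


def zigzag(n):
    """Map a signed integer to an unsigned one so small magnitudes stay small."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(z):
    """Inverse of zigzag."""
    return z // 2 if z % 2 == 0 else -(z + 1) // 2


def compress(values):
    """Delta-encode a block, then pack each delta as a variable-length integer."""
    out, prev = bytearray(), 0
    for v in values:
        z = zigzag(v - prev)
        prev = v
        while z >= 0x80:              # emit 7 bits at a time, low bits first
            out.append((z & 0x7F) | 0x80)
            z >>= 7
        out.append(z)
    return bytes(out)


def decompress(blob):
    """Recover the original block from its compressed bytes."""
    values, prev, z, shift = [], 0, 0, 0
    for byte in blob:
        z |= (byte & 0x7F) << shift
        if byte & 0x80:               # continuation bit: more bytes follow
            shift += 7
        else:
            prev += unzigzag(z)
            values.append(prev)
            z, shift = 0, 0
    return values


class CompressedWindow:
    """Sliding window over the last `size` values, stored as compressed blocks."""

    def __init__(self, size):
        self.size = size
        self.blocks = deque()         # compressed full blocks, oldest first
        self.tail = []                # newest values, not yet compressed

    def update(self, value):
        """Step 1: insert a new value and evict fully expired blocks."""
        self.tail.append(value)
        if len(self.tail) == BLOCK:
            self.blocks.append(compress(self.tail))
            self.tail = []
        # Drop the oldest block once the remaining data still covers the window.
        while (len(self.blocks) - 1) * BLOCK + len(self.tail) >= self.size:
            self.blocks.popleft()

    def aggregate(self, fn=median):
        """Step 2: read (decompress) the window and apply a non-incremental function."""
        values = [v for b in self.blocks for v in decompress(b)] + self.tail
        return fn(values[-self.size:])


window = CompressedWindow(size=6)
for v in [10, 12, 11, 15, 14, 13, 100, 16]:
    window.update(v)
print(window.aggregate())             # median of the last 6 values
```

In this sketch every aggregation re-reads the whole window, which is the memory traffic the abstract identifies as the bottleneck; storing the window compressed means fewer bytes are read per aggregation, at the cost of decompression work on the read path.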
Pages: 28