Augmented Sketch: Faster and More Accurate Stream Processing

Cited by: 110
Authors
Roy, Pratanu [1, 3]
Khan, Arijit [2 ]
Alonso, Gustavo [1 ]
Affiliations
[1] Swiss Fed Inst Technol, Syst Grp Comp Sci, Zurich, Switzerland
[2] NTU Singapore, Sch Comp Engn, Singapore, Singapore
[3] Oracle Labs, Zurich, Switzerland
Keywords
data streams; sketch; approximated algorithms; data structures; stream summary;
DOI
10.1145/2882903.2882948
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Approximated algorithms are often used to estimate the frequency of items on high-volume, fast data streams. The most common ones are variations of Count-Min sketch, which use sub-linear space for the count, but can produce errors in the counts of the most frequent items and can misclassify low-frequency items. In this paper, we improve the accuracy of sketch-based algorithms by increasing the frequency estimation accuracy of the most frequent items and reducing the possible misclassification of low-frequency items, while also improving the overall throughput. Our solution, called Augmented Sketch (ASketch), is based on a pre-filtering stage that dynamically identifies and aggregates the most frequent items. Items overflowing the pre-filtering stage are processed using a conventional sketch algorithm, thereby making the solution general and applicable in a wide range of contexts. The pre-filtering stage can be efficiently implemented with SIMD instructions on multi-core machines and can be further parallelized through pipeline parallelism where the filtering stage runs in one core and the sketch algorithm runs in another core.
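The two-stage design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not spell out how items move between the filter and the sketch, so the exchange policy here (promote an overflowing item when its sketch estimate exceeds the smallest count in the filter, and write back only the evicted item's count accumulated while in the filter) is an assumption. The class names `CountMinSketch` and `AugmentedSketch`, the parameters `filter_size`, `width`, and `depth`, and the use of BLAKE2b as the hash function are likewise hypothetical choices made for illustration; the SIMD and pipeline-parallel aspects are not modeled.

```python
import hashlib


class CountMinSketch:
    """Conventional Count-Min sketch: `depth` rows of `width` counters."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash per row, derived by salting the item with the row number.
        digest = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(item, row)] += count

    def estimate(self, item):
        # The minimum over rows is an upper bound on the true count.
        return min(self.rows[row][self._index(item, row)]
                   for row in range(self.depth))


class AugmentedSketch:
    """Small exact filter for frequent items in front of a Count-Min sketch."""

    def __init__(self, filter_size=32, width=1024, depth=4):
        self.filter_size = filter_size
        # item -> [new_count, old_count]; old_count is the part of the
        # count that is already reflected in the sketch (assumed policy).
        self.filter = {}
        self.sketch = CountMinSketch(width, depth)

    def add(self, item):
        entry = self.filter.get(item)
        if entry is not None:
            entry[0] += 1  # frequent item: aggregated in the filter only
            return
        if len(self.filter) < self.filter_size:
            self.filter[item] = [1, 0]
            return
        # Filter is full: the item overflows into the conventional sketch.
        self.sketch.add(item)
        estimate = self.sketch.estimate(item)
        victim = min(self.filter, key=lambda k: self.filter[k][0])
        new_count, old_count = self.filter[victim]
        if estimate > new_count:
            # Assumed exchange policy: swap the overflowing item with the
            # least frequent filter entry, writing back only the delta.
            del self.filter[victim]
            if new_count > old_count:
                self.sketch.add(victim, new_count - old_count)
            self.filter[item] = [estimate, estimate]

    def estimate(self, item):
        entry = self.filter.get(item)
        return entry[0] if entry is not None else self.sketch.estimate(item)
```

Under these assumptions, an item that stays in the filter never touches the sketch, so it neither suffers from hash collisions nor inflates the counters that low-frequency items map to. Calling `add` for each stream element and `estimate` for a point query is the intended usage.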
Pages: 1449-1463
Page count: 15
Related papers
50 in total
  • [1] HeavySeparation: A Generic framework for stream processing faster and more accurate
    Lu, Jie
    Chen, Hongchang
    Zhang, Zhen
    [J]. COMPUTER COMMUNICATIONS, 2024, 223 : 36 - 43
  • [2] Cold Filter: A Meta-Framework for Faster and More Accurate Stream Processing
    Zhou, Yang
    Yang, Tong
    Jiang, Jie
    Cui, Bin
    Yu, Minlan
    Li, Xiaoming
    Uhlig, Steve
    [J]. SIGMOD'18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2018, : 741 - 756
  • [3] One Memory Access Sketch: a More Accurate and Faster Sketch for Per-flow Measurement
    Zhou, Yang
    Liu, Peng
    Jin, Hao
    Yang, Tong
    Dang, Shoujiang
    Li, Xiaoming
    [J]. GLOBECOM 2017 - 2017 IEEE GLOBAL COMMUNICATIONS CONFERENCE, 2017,
  • [4] Larger, faster, more accurate
    Mitchell, Jonathan
    [J]. 1600, DMG World Media (UK) Ltd. (177):
  • [5] FASTER AND MORE ACCURATE TESTING
    Unknown
    [J]. ADVANCED MATERIALS & PROCESSES, 1987, 131 (01): : 72 - &
  • [6] Rhombus sketch: adaptive and more accurate sketch for streaming data
    Wei X.-H.
    Miao Y.-W.
    Wang X.-W.
    [J]. Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition), 2022, 52 (04): : 874 - 884
  • [7] DHS: Adaptive Memory Layout Organization of Sketch Slots for Fast and Accurate Data Stream Processing
    Zhao, Bohan
    Li, Xiang
    Tian, Boyu
    Mei, Zhiyu
    Wu, Wenfei
    [J]. KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 2285 - 2293
  • [9] FASTER AND MORE ACCURATE INDUSTRIAL ROBOTS
    MADESATER, A
    [J]. INDUSTRIAL ROBOT, 1995, 22 (02): : 14 - 15
  • [10] MORE ACCURATE SIMULATIONS AT FASTER RATES
    GREENBERG, DP
    [J]. IEEE COMPUTER GRAPHICS AND APPLICATIONS, 1991, 11 (01) : 23 - 29