Panakos: Chasing the Tails for Multidimensional Data Streams

Cited by: 2
Authors
Zhao, Fuheng [1 ]
Khan, Punnal Ismail [1 ]
Agrawal, Divyakant [1 ]
El Abbadi, Amr [1 ]
Gupta, Arpit [1 ]
Liu, Zaoxing [2 ]
Affiliations
[1] UC Santa Barbara, Santa Barbara, CA 93106 USA
[2] Boston Univ, Boston, MA 02215 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2023 / Vol. 16 / No. 06
DOI
10.14778/3583140.3583147
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
System operators are often interested in extracting different feature streams from multi-dimensional data streams and reporting their distributions at regular intervals, including the heavy hitters that contribute to the tail portion of the feature distribution. Satisfying these requirements at increasing data rates with limited resources is challenging. This paper presents the design and implementation of Panakos, which makes the best use of available resources to accurately report a given feature's distribution, its tail contributors, and other stream statistics (e.g., cardinality and entropy). Our key idea is to leverage the skewness inherent to most real-world feature streams by disentangling the feature stream into hot, warm, and cold items based on their feature values and using a different data structure to track the items in each category. Panakos provides solid theoretical guarantees and achieves high performance for various tasks. We have implemented Panakos in both software and hardware and compared it to other state-of-the-art sketches using synthetic and real-world datasets. The experimental results demonstrate that Panakos often achieves one order of magnitude better accuracy than the state-of-the-art solutions for a given memory budget.
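The hot/warm/cold separation described in the abstract can be illustrated with a toy sketch. This is not the Panakos data structure: the class name `TieredCounter`, its parameters, the choice of a bit array, small saturating counters, and an exact dictionary for the three tiers, and the promotion rule are all assumptions made for illustration only. The hot tier here is unbounded, whereas a practical design would presumably cap its memory.

```python
class TieredCounter:
    """Toy three-tier frequency tracker (hypothetical; not the Panakos design)."""

    def __init__(self, cold_bits=64, warm_slots=64, warm_max=3):
        self.cold = [False] * cold_bits   # cold tier: one bit per slot ("seen once")
        self.warm = [0] * warm_slots      # warm tier: small saturating counters
        self.warm_max = warm_max          # saturation value of a warm counter
        self.hot = {}                     # hot tier: exact per-item counts

    def _cold_index(self, item):
        return hash(item) % len(self.cold)

    def _warm_index(self, item):
        # A second, different mapping so that the two tiers collide independently.
        return (hash(item) * 31 + 7) % len(self.warm)

    def update(self, item):
        """Record one occurrence of item."""
        if item in self.hot:
            self.hot[item] += 1
            return
        i = self._cold_index(item)
        if not self.cold[i]:
            self.cold[i] = True           # first occurrence costs a single bit
            return
        j = self._warm_index(item)
        if self.warm[j] < self.warm_max:
            self.warm[j] += 1
        else:
            # Warm counter is saturated: promote the item to the hot tier.
            # 1 (cold bit) + warm_max (warm counter) + 1 (this occurrence).
            self.hot[item] = 1 + self.warm_max + 1

    def estimate(self, item):
        """Estimated count; may overestimate when slots are shared by several items."""
        if item in self.hot:
            return self.hot[item]
        if not self.cold[self._cold_index(item)]:
            return 0
        return 1 + self.warm[self._warm_index(item)]

    def heavy_hitters(self, threshold):
        """Items in the hot tier whose count is at least threshold."""
        return {k: v for k, v in self.hot.items() if v >= threshold}


# Skewed toy stream: one frequent item, one moderately frequent, one rare.
tc = TieredCounter()
for x in [1] * 10 + [2] * 3 + [3]:
    tc.update(x)
print(tc.estimate(1), tc.estimate(2), tc.estimate(3), tc.heavy_hitters(5))
```

The point of the sketch is only the memory trade-off: the many rare items consume a bit each, moderately frequent items share small counters, and only the few frequent items get full-width exact counters.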
Pages: 1291 - 1304
Page count: 14
Related papers
50 in total
  • [41] Chasing Tails: Cathepsin-L Improves Structural Analysis of Histones by HX-MS
    Papanastasiou, Malvina
    Mullahoo, James
    DeRuff, Katherine C.
    Bajrami, Besnik
    Karageorgos, Ioannis
    Johnston, Stephen E.
    Peckner, Ryan
    Myers, Samuel A.
    Carr, Steven A.
    Jaffe, Jacob D.
    MOLECULAR & CELLULAR PROTEOMICS, 2019, 18 (10) : 2089 - 2098
  • [42] Ablation of Persistent AF Have We Come Full Circle, or Are We Chasing Our Tails?
    Wright, Matthew
    JOURNAL OF THE AMERICAN COLLEGE OF CARDIOLOGY, 2015, 66 (24) : 2753 - 2756
  • [43] Multidimensional data
    Martin Krzywinski
    Erica Savig
    Nature Methods, 2013, 10 : 595 - 595
  • [44] Simulation data as data streams
    Abdulla, G
    Critchlow, T
    Arrighi, W
    SIGMOD RECORD, 2004, 33 (01) : 89 - 94
  • [45] Exploring a multidimensional concept of loss chasing using online sports betting records
    Edson, Timothy C.
    Louderback, Eric R.
    Tom, Matthew A.
    Mccullock, Seth P.
    Laplante, Debi A.
    INTERNATIONAL GAMBLING STUDIES, 2024, 24 (02) : 306 - 324
  • [46] Maximum likelihood fitting of tidal streams with application to the Sagittarius dwarf tidal tails
    Cole, Nathan
    Newberg, Heidi Jo
    Magdon-Ismail, Malik
    Desell, Travis
    Dawsey, Kristopher
    Hayashi, Warren
    Liu, Xinyang Fred
    Purnell, Jonathan
    Szymanski, Boleslaw
    Varela, Carlos
    Willett, Benjamin
    Wisniewski, James
    ASTROPHYSICAL JOURNAL, 2008, 683 (02) : 750 - 766
  • [47] Chasing Tails: Insights From Micromagnetic Modeling for Thermomagnetic Recording in Non-Uniform Magnetic Structures
    Nagy, Lesleis
    Williams, Wyn
    Tauxe, Lisa
    Muxworthy, Adrian
    GEOPHYSICAL RESEARCH LETTERS, 2022, 49 (23)
  • [48] Chasing the urmetazoon: Striking a blow for quality data?
    Osigus, Hans-Juergen
    Eitel, Michael
    Schierwater, Bernd
    MOLECULAR PHYLOGENETICS AND EVOLUTION, 2013, 66 (02) : 551 - 557
  • [50] Pulsed data streams
    Kopetz, Hermann
    From Model-Driven Design to Resource Management for Distributed Embedded Systems, 2006, 225 : 105 - 114