Conditional heavy hitters: detecting interesting correlations in data streams

Authors
Katsiaryna Mirylenka
Graham Cormode
Themis Palpanas
Divesh Srivastava
Affiliations
[1] The University of Trento
[2] The University of Warwick
[3] Paris Descartes University
[4] AT&T Labs
Source
The VLDB Journal | 2015 / Vol. 24
Keywords
Streaming data; Online algorithms; Heavy hitters
DOI
Not available
Abstract
The notion of heavy hitters—items that make up a large fraction of the population—has been successfully used in a variety of applications across sensor and RFID monitoring, network data analysis, event mining, and more. Yet this notion often fails to capture the semantics we desire when we observe data in the form of correlated pairs. Here, we are interested in items that are conditionally frequent: when a particular item is frequent within the context of its parent item. In this work, we introduce and formalize the notion of conditional heavy hitters to identify such items, with applications in network monitoring and Markov chain modeling. We explore the relationship between conditional heavy hitters and other related notions in the literature, and show analytically and experimentally the usefulness of our approach. We introduce several algorithm variations that allow us to efficiently find conditional heavy hitters for input data with very different characteristics, and provide analytical results for their performance. Finally, we perform experimental evaluations with several synthetic and real datasets to demonstrate the efficacy of our methods and to study the behavior of the proposed algorithms for different types of data.
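To make the definition in the abstract concrete, here is a minimal sketch that finds conditionally frequent (parent, child) pairs by exact counting over a small in-memory list. It is not one of the streaming algorithms proposed in the paper, which this record does not describe. The function name `conditional_heavy_hitters`, the threshold `phi`, and the `min_parent_count` filter are assumptions introduced for illustration only.

```python
from collections import Counter


def conditional_heavy_hitters(pairs, phi, min_parent_count=1):
    """Return {(parent, child): conditional frequency} for every pair whose
    share of its parent's occurrences is at least phi.

    Illustrative exact computation; all parameter names are hypothetical.
    """
    pairs = list(pairs)
    parent_counts = Counter(parent for parent, _ in pairs)
    pair_counts = Counter(pairs)

    result = {}
    for (parent, child), n in pair_counts.items():
        total = parent_counts[parent]
        # Skip rarely seen parents, then test the conditional frequency.
        if total >= min_parent_count and n / total >= phi:
            result[(parent, child)] = n / total
    return result


# Toy data: parent "a" is followed by "x" in 6 of its 10 occurrences.
stream = (
    [("a", "x")] * 6
    + [("a", "y")] * 4
    + [("b", "z")] * 2
    + [("b", "w")] * 2
    + [("c", "x")]
)
hits = conditional_heavy_hitters(stream, phi=0.6, min_parent_count=2)
```

With these toy inputs, only the pair ("a", "x") qualifies: the pairs under "b" each reach only half of their parent's occurrences, and "c" falls below the assumed minimum parent count. Exact counting like this needs memory proportional to the number of distinct pairs, which is the cost a streaming method would try to avoid.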
Pages: 395-414
Number of pages: 20
Related papers
50 in total
  • [1] Conditional heavy hitters: detecting interesting correlations in data streams
    Mirylenka, Katsiaryna
    Cormode, Graham
    Palpanas, Themis
    Srivastava, Divesh
    [J]. VLDB JOURNAL, 2015, 24 (03): 395-414
  • [2] Finding Interesting Correlations with Conditional Heavy Hitters
    Mirylenka, Katsiaryna
    Palpanas, Themis
    Cormode, Graham
    Srivastava, Divesh
    [J]. 2013 IEEE 29TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2013: 1069-1080
  • [3] Finding Heavy Distinct Hitters in Data Streams
    Locher, Thomas
    [J]. SPAA 11: PROCEEDINGS OF THE TWENTY-THIRD ANNUAL SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES, 2011: 299-308
  • [4] On Frequency Estimation and Detection of Heavy Hitters in Data Streams
    Ventruto, Federica
    Pulimeno, Marco
    Cafaro, Massimo
    Epicoco, Italo
    [J]. FUTURE INTERNET, 2020, 12 (09)
  • [5] Finding Subcube Heavy Hitters in Analytics Data Streams
    Kveton, Branislav
    Muthukrishnan, S.
    Vu, Hoa T.
    Xian, Yikun
    [J]. WEB CONFERENCE 2018: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW2018), 2018: 1705-1714
  • [6] Finding Correlated Heavy-Hitters over Data Streams
    Lahiri, Bibudh
    Tirthapura, Srikanta
    [J]. 2009 IEEE 28TH INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCC 2009), 2009: 307-314
  • [7] Identification of Heavy Hitters for Network Data Streams with Probabilistic Sketch
    Zhou, Aiping
    Zhu, Huisheng
    Liu, Lijun
    Zhu, Chengang
    [J]. 2018 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA), 2018: 451-456
  • [8] Heavy Hitters in Streams and Sliding Windows
    Ben-Basat, Ran
    Einziger, Gil
    Friedman, Roy
    Kassner, Yaron
    [J]. IEEE INFOCOM 2016 - THE 35TH ANNUAL IEEE INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS, 2016
  • [9] Separator: Sifting hierarchical heavy hitters accurately from data streams
    Lin, Yuan
    Liu, Hongyan
    [J]. ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2007, 4632: 170-+
  • [10] Universal and Accurate Sketch for Estimating Heavy Hitters and Moments in Data Streams
    Xiao, Qingjun
    Cai, Xuyuan
    Qin, Yifei
    Tang, Zhiying
    Chen, Shigang
    Liu, Yu
    [J]. IEEE-ACM TRANSACTIONS ON NETWORKING, 2023, 31 (05): 1919-1934