Conditional heavy hitters: detecting interesting correlations in data streams

Cited by: 0
Authors
Katsiaryna Mirylenka
Graham Cormode
Themis Palpanas
Divesh Srivastava
Affiliations
[1] The University of Trento
[2] The University of Warwick
[3] Paris Descartes University
[4] AT&T Labs
Source
The VLDB Journal | 2015 / Vol. 24
Keywords
Streaming data; Online algorithms; Heavy hitters
DOI
Not available
Abstract
The notion of heavy hitters—items that make up a large fraction of the population—has been successfully used in a variety of applications across sensor and RFID monitoring, network data analysis, event mining, and more. Yet this notion often fails to capture the semantics we desire when we observe data in the form of correlated pairs. Here, we are interested in items that are conditionally frequent: when a particular item is frequent within the context of its parent item. In this work, we introduce and formalize the notion of conditional heavy hitters to identify such items, with applications in network monitoring and Markov chain modeling. We explore the relationship between conditional heavy hitters and other related notions in the literature, and show analytically and experimentally the usefulness of our approach. We introduce several algorithm variations that allow us to efficiently find conditional heavy hitters for input data with very different characteristics, and provide analytical results for their performance. Finally, we perform experimental evaluations with several synthetic and real datasets to demonstrate the efficacy of our methods and to study the behavior of the proposed algorithms for different types of data.
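To make the notion of "conditionally frequent" concrete, here is a minimal sketch that computes it exactly over a small batch of (parent, child) pairs. It is not one of the streaming algorithms the paper proposes, which this record does not describe. The function name, the threshold `phi`, and the `min_parent_count` cutoff are illustrative assumptions.

```python
from collections import Counter


def conditional_heavy_hitters(pairs, phi, min_parent_count=1):
    """Exact, non-streaming illustration (hypothetical helper).

    Returns {(parent, child): conditional frequency} for every pair whose
    count(parent, child) / count(parent) is at least phi, restricted to
    parents seen at least min_parent_count times.
    """
    pair_counts = Counter()
    parent_counts = Counter()
    for parent, child in pairs:
        pair_counts[(parent, child)] += 1
        parent_counts[parent] += 1

    result = {}
    for (parent, child), n in pair_counts.items():
        total = parent_counts[parent]
        if total >= min_parent_count and n / total >= phi:
            result[(parent, child)] = n / total
    return result


# Toy data: parent "a" dominates the stream, parent "b" is rare.
stream = [("a", "x")] * 90 + [("a", "y")] * 10 + [("b", "z")] * 3 + [("b", "w")]
hits = conditional_heavy_hitters(stream, phi=0.7, min_parent_count=2)
```

In this toy input the pair ("b", "z") accounts for only 3 of 104 observations, so an ordinary heavy-hitter threshold would miss it, yet it makes up 75% of the occurrences of its parent "b" and is therefore reported alongside ("a", "x").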
Pages: 395-414
Number of pages: 19