Probabilistic lossy counting: An efficient algorithm for finding heavy hitters

Cited by: 0
Authors
Dimitropoulos, Xenofontas [1]
Hurley, Paul [1]
Kind, Andreas [1]
Affiliations
[1] IBM Zurich Research Laboratory, Zurich, Switzerland
Keywords
algorithms; measurement; performance; heavy hitters; data streams
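DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract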
Knowledge of the largest traffic flows in a network is important for many network management applications. The problem of finding these flows is known as the heavy-hitter problem and has been the subject of many studies in recent years. One of the most efficient and well-known algorithms for finding heavy hitters is lossy counting [29]. In this work we introduce probabilistic lossy counting (PLC), which enhances lossy counting for computing network traffic heavy hitters. PLC uses a tighter error bound on the estimated sizes of traffic flows and provides probabilistic rather than deterministic guarantees on its accuracy. The probabilistic error bound substantially reduces the memory consumption of the algorithm. In addition, PLC reduces the rate of false positives of lossy counting and achieves a low estimation error, although slightly higher than that of lossy counting. We compare PLC with state-of-the-art algorithms for finding heavy hitters. Our experiments using real traffic traces find that PLC has 1) between 34.4% and 74% lower memory consumption, 2) between 37.9% and 40.5% fewer false positives than lossy counting, and 3) a small estimation error.
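For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of classic lossy counting only, not of PLC: the abstract does not give PLC's probabilistic bound, so it is not reproduced here. The class name `LossyCounter`, its methods, and the parameters `epsilon` (error tolerance) and `support` (heavy-hitter threshold as a fraction of the stream) are hypothetical names chosen for this sketch.

```python
import math


class LossyCounter:
    """Illustrative sketch of classic lossy counting (not PLC)."""

    def __init__(self, epsilon):
        if not 0 < epsilon < 1:
            raise ValueError("epsilon must be in (0, 1)")
        self.epsilon = epsilon
        self.width = math.ceil(1 / epsilon)  # items per bucket
        self.n = 0                           # items seen so far
        self.entries = {}                    # item -> [count, delta]

    def add(self, item):
        self.n += 1
        # Id of the bucket the current item falls into (1-based).
        bucket = (self.n + self.width - 1) // self.width
        entry = self.entries.get(item)
        if entry is not None:
            entry[0] += 1
        else:
            # delta is the deterministic bound on occurrences that may
            # have been missed before this entry was created.
            self.entries[item] = [1, bucket - 1]
        # At each bucket boundary, drop entries that cannot be frequent.
        if self.n % self.width == 0:
            self.entries = {
                key: e for key, e in self.entries.items()
                if e[0] + e[1] > bucket
            }

    def heavy_hitters(self, support):
        # Report items whose stored count reaches (support - epsilon) * n.
        threshold = (support - self.epsilon) * self.n
        return {key: e[0] for key, e in self.entries.items()
                if e[0] >= threshold}


# Usage: one dominant flow "a" interleaved with 50 one-off flows.
counter = LossyCounter(epsilon=0.1)
for i in range(50):
    counter.add("a")
    counter.add(f"x{i}")
print(counter.heavy_hitters(support=0.3))
```

In this sketch the per-entry `delta` is the deterministic error bound. According to the abstract, PLC replaces it with a tighter probabilistic bound, which lets entries be removed earlier and so lowers memory use.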
Pages: 7-16
Number of pages: 10