A Parallel Frequent Item Counting Algorithm

Cited by: 2
Authors
Yang, Xun [1, 2]
Liu, Jun [1, 2]
Zhou, Wenli [1, 2, 3]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing Key Lab Network Syst Architecture & Conve, Beijing, Peoples R China
[2] Beijing Univ Posts & Telecommun, Ctr Data Sci, Beijing, Peoples R China
[3] HAOHAN Data Technol Co Ltd, Beijing, Peoples R China
Source
2016 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC), VOL. 2 | 2016
Keywords
frequent items; parallel algorithms; stream processing;
DOI
10.1109/IHMSC.2016.123
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Frequent items in high-speed streaming data are important to many applications, such as network monitoring and anomaly detection. To cope with the high arrival rate of streaming data, such systems should support high processing throughput with tight guarantees on errors. In this paper, we address the problem of finding frequent and top-k items, and present a parallel version of the Space Saving algorithm in the context of an open source distributed computing system. Our theoretical analysis shows that the errors of the algorithm are strictly bounded and that the parallel design can achieve high throughput. Our evaluation shows that, by taking advantage of distributed computing resources, the design delivers linear speedup and scales well.
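The abstract does not describe the parallel design itself, so the following is only a minimal sketch of the general idea, not the authors' implementation. It shows the standard sequential Space Saving summary (a fixed number of counters, where a new item replaces the current minimum counter and inherits its count as an error bound) and one simple way to parallelize it. The hash partitioning, the function names `space_saving` and `parallel_top_k`, and the parameters `capacity` and `num_partitions` are all assumptions made for illustration.

```python
import zlib


def space_saving(stream, capacity):
    """Sequential Space Saving: returns {item: (count, error)}.

    For every tracked item, count - error <= true frequency <= count.
    """
    counts = {}
    errors = {}
    for item in stream:
        if item in counts:
            counts[item] += 1
        elif len(counts) < capacity:
            counts[item] = 1
            errors[item] = 0
        else:
            # Evict the item with the smallest counter; the newcomer
            # inherits that count as its maximum possible overestimate.
            victim = min(counts, key=counts.get)
            floor = counts.pop(victim)
            errors.pop(victim)
            counts[item] = floor + 1
            errors[item] = floor
    return {item: (counts[item], errors[item]) for item in counts}


def parallel_top_k(stream, num_partitions, capacity, k):
    """Hypothetical parallel scheme (an assumption, not the paper's design).

    Items are hash-partitioned so each distinct item reaches exactly one
    partition; the per-partition summaries are then disjoint and can be
    combined by a plain union. Partitions are processed in a loop here,
    but each one could run on its own worker.
    """
    partitions = [[] for _ in range(num_partitions)]
    for item in stream:
        # crc32 gives a stable hash across processes, unlike built-in hash().
        index = zlib.crc32(str(item).encode("utf-8")) % num_partitions
        partitions[index].append(item)

    merged = {}
    for part in partitions:
        merged.update(space_saving(part, capacity))

    # Rank by estimated count, breaking ties by item name for determinism.
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1][0], str(kv[0])))
    return ranked[:k]
```

Because each distinct item is routed to a single partition in this sketch, the per-item error bound of the sequential algorithm carries over unchanged within each partition. A design that splits the stream arbitrarily would instead need to merge overlapping summaries.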
Pages: 225-230
Page count: 6