A Parallel Frequent Item Counting Algorithm

Cited by: 2
Authors
Yang, Xun [1 ,2 ]
Liu, Jun [1 ,2 ]
Zhou, Wenli [1 ,2 ,3 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing Key Lab Network Syst Architecture & Conve, Beijing, Peoples R China
[2] Beijing Univ Posts & Telecommun, Ctr Data Sci, Beijing, Peoples R China
[3] HAOHAN Data Technol Co Ltd, Beijing, Peoples R China
Source
2016 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC), VOL. 2 | 2016
Keywords
frequent items; parallel algorithms; stream processing;
DOI
10.1109/IHMSC.2016.123
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Frequent items in high-speed streaming data are important to many applications, such as network monitoring and anomaly detection. To deal with the high arrival rate of streaming data, it is desirable that such systems support high processing throughput with tight guarantees on errors. In this paper, we address the problem of finding frequent and top-k items, and present a parallel version of the Space Saving algorithm in the context of the open source distributed computing system. Based on the theoretical analysis, the errors are strictly bounded in our algorithm, and our parallel design can achieve high throughput. Taking advantage of distributed computing resources, our evaluation reveals that such a design delivers linear speedup with remarkable scalability.
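The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' design. It shows the standard sequential Space Saving summary, then one simple way to parallelize it. The parallel scheme is an assumption: the stream is hash-partitioned so each item is counted by exactly one worker, which makes the merge a plain union. The names `SpaceSaving`, `partition`, and `parallel_top_k` are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


class SpaceSaving:
    """Sequential Space Saving summary holding at most `capacity` counters."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counts = {}  # item -> estimated count (never an underestimate)
        self.errors = {}  # item -> maximum possible overestimation

    def offer(self, item):
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            self.counts[item] = 1
            self.errors[item] = 0
        else:
            # Evict the item with the smallest count; the newcomer inherits
            # that count as its error bound.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            del self.errors[victim]
            self.counts[item] = floor + 1
            self.errors[item] = floor

    def top(self, k):
        ranked = sorted(self.counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
        return ranked[:k]


def partition(stream, workers):
    """Assumed scheme: route each item to one worker by a stable hash."""
    parts = [[] for _ in range(workers)]
    for item in stream:
        parts[zlib.crc32(str(item).encode()) % workers].append(item)
    return parts


def summarize(items, capacity):
    summary = SpaceSaving(capacity)
    for item in items:
        summary.offer(item)
    return summary


def parallel_top_k(stream, k, workers=4, capacity=100):
    parts = partition(stream, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        summaries = list(pool.map(lambda p: summarize(p, capacity), parts))
    merged = {}
    for summary in summaries:
        merged.update(summary.counts)  # key sets are disjoint by construction
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], str(kv[0])))
    return ranked[:k]
```

Hash partitioning keeps the per-worker error bounds valid after the merge, since no item's count is split across workers. The cost is that a skewed stream can overload one worker. Threads are used here only to keep the sketch short; a distributed system would run each partition on a separate node.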
Pages: 225-230
Page count: 6
Related papers
50 in total
  • [31] Parallel Frequent Patterns Mining Algorithm on GPU
    Zhou, Jiayi
    Yu, Kun-Ming
    Wu, Bin-Chang
    2010 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2010), 2010,
  • [32] Parallel algorithm for mining maximal frequent patterns
    Wang, H
    Xiao, ZT
    Zhang, HJ
    Jiang, SY
    ADVANCED PARALLEL PROCESSING TECHNOLOGIES, PROCEEDINGS, 2003, 2834 : 241 - 248
  • [33] MREclat: an Algorithm for Parallel Mining Frequent Itemsets
    Zhang, Zhigang
    Ji, Genlin
    Tang, Mengmeng
    2013 INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD), 2013, : 177 - 180
  • [34] A Highly Parallel Algorithm for Frequent Itemset Mining
    Mesa, Alejandro
    Feregrino-Uribe, Claudia
    Cumplido, Rene
    Hernandez-Palancar, Jose
    ADVANCES IN PATTERN RECOGNITION, 2010, 6256 : 291 - +
  • [35] Parallel algorithm for mining frequent closed sequences
    Ma, CX
    Li, QH
    AUTONOMOUS INTELLIGENT SYSTEMS: AGENTS AND DATA MINING, PROCEEDINGS, 2005, 3505 : 184 - 192
  • [36] PARALLEL IMPLEMENTATION OF THE BOX COUNTING ALGORITHM IN OPENCL
    Mukundan, Ramakrishnan
    FRACTALS-COMPLEX GEOMETRY PATTERNS AND SCALING IN NATURE AND SOCIETY, 2015, 23 (03)
  • [37] A Parallel Algorithm for Counting Subgraphs in Complex Networks
    Ribeiro, Pedro
    Silva, Fernando
    Lopes, Luis
    BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES, 2011, 127 : 380 - +
  • [38] SaM: A Split and Merge Algorithm for Fuzzy Frequent Item Set Mining
    Borgelt, Christian
    Wang, Xiaomeng
    PROCEEDINGS OF THE JOINT 2009 INTERNATIONAL FUZZY SYSTEMS ASSOCIATION WORLD CONGRESS AND 2009 EUROPEAN SOCIETY OF FUZZY LOGIC AND TECHNOLOGY CONFERENCE, 2009, : 968 - 973
  • [39] Efficient algorithm for mining approximate frequent item over data streams
    Wang, Wei-Ping
    Li, Jian-Zhong
    Zhang, Dong-Dong
    Guo, Long-Jiang
    Ruan Jian Xue Bao/Journal of Software, 2007, 18 (04): : 884 - 892
  • [40] A NEW FREQUENT ITEM SET MINING ALGORITHM BASED ON INTERVAL INTERSECTION
    Yungho-Leu
    Utami, Vania
    PROCEEDINGS OF 2015 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOL. 2, 2015, : 471 - 477