A Parallel Frequent Item Counting Algorithm

Cited by: 2
Authors
Yang, Xun [1 ,2 ]
Liu, Jun [1 ,2 ]
Zhou, Wenli [1 ,2 ,3 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing Key Lab Network Syst Architecture & Conve, Beijing, Peoples R China
[2] Beijing Univ Posts & Telecommun, Ctr Data Sci, Beijing, Peoples R China
[3] HAOHAN Data Technol Co Ltd, Beijing, Peoples R China
Keywords
frequent items; parallel algorithms; stream processing;
DOI
10.1109/IHMSC.2016.123
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Frequent items in high-speed streaming data are important to many applications, such as network monitoring and anomaly detection. To deal with the high arrival rate of streaming data, it is desirable that such systems support high processing throughput with tight guarantees on errors. In this paper, we address the problem of finding frequent and top-k items, and present a parallel version of the Space Saving algorithm in the context of an open source distributed computing system. Based on theoretical analysis, the errors in our algorithm are strictly bounded, and our parallel design can achieve high throughput. Taking advantage of distributed computing resources, our evaluation reveals that this design delivers linear speedup with remarkable scalability.
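For readers unfamiliar with the Space Saving algorithm named in the abstract, the following is a minimal sketch of the standard sequential algorithm, plus one simple way to run it over key-partitioned shards. The abstract does not describe the paper's parallel design, so the partitioning scheme, the `count_partitioned` helper, and the use of `zlib.crc32` for routing are illustrative assumptions, not the authors' method. The shards here are processed in a single loop purely to show the data flow.

```python
import zlib


class SpaceSaving:
    """Standard sequential Space Saving summary with at most `capacity` counters."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counts = {}  # item -> estimated count (never an underestimate)
        self.errors = {}  # item -> maximum possible overestimation

    def offer(self, item):
        if item in self.counts:
            # Already monitored: just increment.
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            # Free counter available: start monitoring with no error.
            self.counts[item] = 1
            self.errors[item] = 0
        else:
            # Evict the item with the smallest count and inherit that count.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            del self.errors[victim]
            self.counts[item] = floor + 1
            self.errors[item] = floor


def count_partitioned(stream, workers, capacity):
    """Hypothetical key-partitioned variant (an assumption, not the paper's design).

    Each item is routed to exactly one shard by a stable hash, so the shard
    summaries monitor disjoint items and can be combined by a plain union.
    """
    shards = [SpaceSaving(capacity) for _ in range(workers)]
    for item in stream:
        shard = zlib.crc32(repr(item).encode("utf-8")) % workers
        shards[shard].offer(item)
    merged = {}
    for summary in shards:
        merged.update(summary.counts)  # keys are disjoint across shards
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
```

For each monitored item, the true frequency lies between `counts[item] - errors[item]` and `counts[item]`, and the counters of one summary always sum to the number of items it has processed. That invariant is the source of the algorithm's error bound: the smallest counter, and hence any overestimation, is at most the stream length divided by `capacity`.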
Pages: 225-230
Page count: 6
Related papers
50 in total
  • [1] An efficient parallel and distributed algorithm for counting frequent sets
    Orlando, S
    Palmerini, P
    Perego, R
    Silvestri, F
    HIGH PERFORMANCE COMPUTING FOR COMPUTATIONAL SCIENCE - VECPAR 2002, 2003, 2565 : 421 - 435
  • [2] OPTIMIZATION AND REALIZATION OF PARALLEL FREQUENT ITEM SET MINING ALGORITHM
    Yuan, Ling
    Li, Dan
    Chen, Yuzhong
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2016, : 546 - 551
  • [3] An improved parallel algorithm for finding frequent item-sets
    She, CD
    Li, L
    Wang, HB
    Gao, B
    Deng, HQ
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON INTELLIGENT MECHATRONICS AND AUTOMATION, 2004, : 383 - 386
  • [4] Parallel algorithm for mining frequent item sets based on Spark
    Mao Y.
    Wu B.
    Xu C.
    Zhang M.
    Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2023, 29 (04): : 1267 - 1283
  • [5] Parallel frequent set counting
    Skillicorn, DB
    PARALLEL COMPUTING, 2002, 28 (05) : 815 - 825
  • [6] Parallel Frequent Item Set Mining with Selective Item Replication
    Ozkural, Eray
    Ucar, Bora
    Aykanat, Cevdet
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2011, 22 (10) : 1632 - 1640
  • [7] An efficient framework for parallel and continuous frequent item monitoring
    Zhang, Yu
    Sun, Yue
    Zhang, Jianzhong
    Xu, Jingdong
    Wu, Ying
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2014, 26 (18): : 2856 - 2879
  • [8] Study on the Discovery Algorithm of the Frequent Item Sets
    Cheng, Huifeng
    Ma, Yanli
    Li, Fangping
    2009 INTERNATIONAL ASIA SYMPOSIUM ON INTELLIGENT INTERACTION AND AFFECTIVE COMPUTING, 2009, : 172 - +
  • [9] Proposed algorithm for frequent item set generation
    Singh, Archana
    Agarwal, Jyoti
    2014 SEVENTH INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING (IC3), 2014, : 160 - 165
  • [10] A PARALLEL REFERENCE COUNTING ALGORITHM
    KAKUTA, K
    NAKAMURA, H
    IIDA, S
    INFORMATION PROCESSING LETTERS, 1986, 23 (01) : 33 - 37