Finding frequent items in parallel

Cited by: 25
Authors
Cafaro, Massimo [1, 2]
Tempesta, Piergiulio [3]
Affiliations
[1] Univ Salento, Fac Ingn, Dept Innovat Engn, I-73100 Lecce, Italy
[2] CMCC Euro Mediterranean Ctr Climate Change, Lecce, Italy
[3] Univ Complutense, Fac Fis, Dept Fis Teor 2, E-28040 Madrid, Spain
Keywords
data stream; frequent elements; STREAMS;
DOI
10.1002/cpe.1761
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
We present a deterministic parallel algorithm for the k-majority problem that can be used to find frequent items in parallel, i.e. those whose multiplicity is greater than a given threshold. It is therefore useful for processing iceberg queries and in many other contexts of applied mathematics and information theory. The algorithm can be used in both the online (stream) context and the offline setting. In the former, we are restricted to a single scan of the input elements (e.g. network traffic streams passing through internet routers), so verifying the frequent items that have been determined is not allowed; in the latter, a parallel scan of the input can be used to determine the actual k-majority elements. To the best of our knowledge, this is the first parallel algorithm solving the proposed problem. Copyright © 2011 John Wiley & Sons, Ltd.
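To make the problem statement concrete, the following is a minimal Python sketch of one well-known way to find k-majority candidates block by block and then combine them. It is not taken from the paper and should not be read as the authors' algorithm: it uses the classic Misra-Gries summary with k - 1 counters plus a standard summary-merging step, and it assumes the common definition that a k-majority element occurs more than n/k times among n items. The names `summarize`, `merge`, `k_majority`, and the `workers` parameter are hypothetical, and the blocks are processed in a plain loop that stands in for independent parallel workers.

```python
from collections import Counter
from functools import reduce


def summarize(block, k):
    """Misra-Gries summary of one block, using at most k - 1 counters."""
    counters = {}
    for item in block:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def merge(a, b, k):
    """Merge two summaries so that at most k - 1 counters survive."""
    total = Counter(a) + Counter(b)
    if len(total) <= k - 1:
        return dict(total)
    # Subtract the k-th largest count everywhere and keep positive counters.
    kth = sorted(total.values(), reverse=True)[k - 1]
    return {x: c - kth for x, c in total.items() if c > kth}


def k_majority(items, k, workers=4):
    """Offline variant: merged candidates followed by an exact second scan."""
    n = len(items)
    size = max(1, -(-n // workers))  # ceiling division, at least 1
    blocks = [items[i:i + size] for i in range(0, n, size)]
    # Each block is summarized independently (assumed to run in parallel).
    summaries = [summarize(block, k) for block in blocks]
    candidates = reduce(lambda a, b: merge(a, b, k), summaries, {})
    # Verification scan: count only the candidates exactly.
    exact = Counter(x for x in items if x in candidates)
    return {x for x, c in exact.items() if c > n / k}


items = [1, 1, 1, 2, 2, 3, 1, 2, 1, 4, 1, 2]
print(k_majority(items, k=3, workers=3))
```

In the online (single-scan) setting described in the abstract, the verification scan in `k_majority` would not be available, so a sketch like this could only report the merged candidates, some of which may be false positives.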
Pages: 1774-1788
Page count: 15