A novel multi-core algorithm for frequent itemsets mining in data streams

Cited by: 2
Authors
Bustio-Martinez, Lazaro [1 ]
Munoz-Briseno, Alfredo [2 ]
Cumplido, Rene [1 ]
Hernandez-Leon, Raudel [2 ]
Feregrino-Uribe, Claudia A. [1 ]
Affiliations
[1] Natl Inst Astrophys Opt & Elect, Luis Enrique Erro 1, Puebla 72840, Mexico
[2] Adv Technol Applicat Ctr, 7a 21406 E 214&216, Havana 12200, Cuba
Keywords
Frequent itemsets mining; Data streams; Lexicographic order; Gearman; Parallel algorithms; SPARK
DOI
10.1016/j.patrec.2019.05.003
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data streams are modern data sources that are gaining attention because of their many practical applications (they can be found in data transmission, eCommerce, and intrusion detection systems, among others). Nevertheless, efforts to obtain insights from data streams are limited by their massive information volume and the time needed to process them. In this paper, a new approach for Frequent Itemsets Mining on data streams is proposed; it is based on prefix trees and takes advantage of multi-core systems. This approach uses the Gearman framework as the interface for multi-core processing, which allows the scalability of such systems to be exploited efficiently. Experimental results show that the proposed method obtains the same patterns as similar approaches reported in the state of the art and outperforms them in the processing time required. Also, it is proved that the proposed method is insensitive to variations in the support threshold value, and that its efficiency depends on the size of the transactions and not on the size of the alphabet, which is a significant issue in other Frequent Itemsets Mining algorithms. © 2019 Elsevier B.V. All rights reserved.
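The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of counting itemsets in a prefix tree over lexicographically ordered transactions. It is not the authors' algorithm: the names `PrefixTreeNode`, `add_transaction`, `frequent_itemsets`, and the `max_len` cap are all assumptions made for illustration, and the multi-core distribution through Gearman is not shown. The sketch does illustrate why the work per transaction can depend on the transaction's size rather than on the size of the alphabet: only the items present in the transaction are ever visited.

```python
from itertools import combinations


class PrefixTreeNode:
    """One node of the prefix tree; the path from the root spells an itemset."""

    def __init__(self):
        self.count = 0
        self.children = {}


def add_transaction(root, transaction, max_len=3):
    """Count every itemset of up to max_len items contained in one transaction.

    max_len is a hypothetical cap that keeps the sketch cheap; it is not
    taken from the paper.
    """
    # Sorting gives each itemset exactly one path in the tree.
    items = sorted(set(transaction))
    for size in range(1, min(max_len, len(items)) + 1):
        for itemset in combinations(items, size):
            node = root
            for item in itemset:
                node = node.children.setdefault(item, PrefixTreeNode())
            node.count += 1


def frequent_itemsets(root, min_count):
    """Return {itemset tuple: count} for itemsets seen at least min_count times."""
    result = {}
    stack = [((), root)]
    while stack:
        prefix, node = stack.pop()
        for item, child in node.children.items():
            if child.count >= min_count:
                itemset = prefix + (item,)
                result[itemset] = child.count
                # Descendants are supersets, so they can only be frequent
                # if this itemset is frequent too.
                stack.append((itemset, child))
    return result


# Toy stream of four transactions (made-up data).
stream = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "b", "c"]]
root = PrefixTreeNode()
for transaction in stream:
    add_transaction(root, transaction)

result = frequent_itemsets(root, min_count=3)
```

In this toy stream, `result` holds the itemsets that occur in at least three of the four transactions. Because counts are updated one transaction at a time, the tree can be queried at any point while the stream is still arriving.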
Pages: 241-248
Number of pages: 8
Related papers
50 in total
  • [21] An efficient approach to mining frequent itemsets on data streams
    Ansari, Sara
    Sadreddini, Mohammad Hadi
    [J]. World Academy of Science, Engineering and Technology, 2009, 37 : 489 - 495
  • [22] Mining maximal frequent itemsets from data streams
    Mao, Guojun
    Wu, Xindong
    Zhu, Xingquan
    Chen, Gong
    Liu, Chunnian
    [J]. JOURNAL OF INFORMATION SCIENCE, 2007, 33 (03) : 251 - 262
  • [23] Fast Mining of Closed Frequent Itemsets in Data Streams
    Mao Yimin
    Chen Zhigang
    Liu Lixin
    [J]. INFORMATION TECHNOLOGY APPLICATIONS IN INDUSTRY, PTS 1-4, 2013, 263-266 : 231 - +
  • [24] Mining of Frequent Itemsets from Streams of Uncertain Data
    Leung, Carson Kai-Sang
    Hao, Boyu
    [J]. ICDE: 2009 IEEE 25TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2009, : 1663 - 1670
  • [25] Efficient mining of frequent itemsets from data streams
    Leung, Carson Kai-Sang
    Brajczuk, Dale A.
    [J]. SHARING DATA, INFORMATION AND KNOWLEDGE, PROCEEDINGS, 2008, 5071 : 2 - 14
  • [26] A novel incremental algorithm for mining frequent itemsets
    Wang, YL
    Li, ZZ
    Xue, J
    Ban, SM
    Zhao, YL
    [J]. DCABES 2002, PROCEEDING, 2002, : 60 - 64
  • [27] FIAST: A Novel Algorithm for Mining Frequent Itemsets
    Duemong, Fudailah
    Preechaveerakul, Ladda
    Vanichayobon, Sirirut
    [J]. INTERNATIONAL CONFERENCE ON FUTURE COMPUTER AND COMMUNICATIONS, PROCEEDINGS, 2009, : 140 - 144
  • [28] Variable slide window based frequent itemsets mining algorithm on large data streams
    Zhu, Xiao-Dong
    Huang, Zhi-Qiu
    Shen, Guo-Hua
    Yuan, Min
    [J]. Kongzhi yu Juece/Control and Decision, 2009, 24 (06): : 832 - 836
  • [29] A frequent itemsets mining algorithm based on matrix in sliding window over data streams
    Fan Guidan
    Yin Shaohong
    [J]. 2013 THIRD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM DESIGN AND ENGINEERING APPLICATIONS (ISDEA), 2013, : 66 - 69
  • [30] An Efficient Subset-Lattice Algorithm for Mining Closed Frequent Itemsets in Data Streams
    Chang, Ye-In
    Li, Chia-En
    Peng, Wei-Hau
    [J]. 2012 CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI), 2012, : 21 - 26