A novel multi-core algorithm for frequent itemsets mining in data streams

Cited by: 2
Authors
Bustio-Martinez, Lazaro [1]
Munoz-Briseno, Alfredo [2]
Cumplido, Rene [1]
Hernandez-Leon, Raudel [2]
Feregrino-Uribe, Claudia A. [1]
Affiliations
[1] Natl Inst Astrophys Opt & Elect, Luis Enrique Erro 1, Puebla 72840, Mexico
[2] Adv Technol Applicat Ctr, 7a 21406 E 214&216, Havana 12200, Cuba
Keywords
Frequent itemsets mining; Data streams; Lexicographic order; Gearman; Parallel algorithms; SPARK
DOI
10.1016/j.patrec.2019.05.003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data streams are modern data sources that are gaining attention because of their many practical applications: they arise in data transmission, e-commerce, and intrusion detection systems, among others. Nevertheless, efforts to obtain insights from data streams are limited by their massive information volume and the time needed to process them. In this paper, a new approach for Frequent Itemsets Mining on data streams is proposed; it is based on prefix trees and takes advantage of multi-core systems. The approach uses the Gearman framework as the interface for multi-core processing, which allows the scalability of such systems to be exploited efficiently. Experimental results show that the proposed method obtains the same patterns as similar state-of-the-art approaches and outperforms them in processing time. It is also shown that the proposed method is insensitive to variations in the support threshold value, and that its efficiency depends on the size of the transactions and not on the size of the alphabet, which is a significant issue in other Frequent Itemsets Mining algorithms. (C) 2019 Elsevier B.V. All rights reserved.
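The abstract names two ingredients, prefix trees and lexicographic order, without giving details. The following is a minimal single-process sketch of how itemset counts could be kept in a prefix tree as transactions arrive; it is not the authors' algorithm and does not involve Gearman or any multi-core dispatch. The names PrefixTreeNode, StreamItemsetCounter, add_transaction, frequent_itemsets and the max_len cap are hypothetical and chosen only for illustration.

```python
from itertools import combinations


class PrefixTreeNode:
    """One node of the prefix tree; the path from the root spells an itemset."""

    def __init__(self):
        self.count = 0       # number of transactions containing this itemset
        self.children = {}   # item -> PrefixTreeNode


class StreamItemsetCounter:
    """Illustrative counter (assumption, not the paper's method)."""

    def __init__(self, max_len=3):
        self.root = PrefixTreeNode()
        self.max_len = max_len        # hypothetical cap on itemset length
        self.n_transactions = 0

    def add_transaction(self, transaction):
        # Sort items so every itemset maps to one path (lexicographic order).
        items = sorted(set(transaction))
        self.n_transactions += 1
        # The work here grows with the transaction size, not the alphabet size.
        for k in range(1, min(self.max_len, len(items)) + 1):
            for itemset in combinations(items, k):
                node = self.root
                for item in itemset:
                    node = node.children.setdefault(item, PrefixTreeNode())
                node.count += 1

    def frequent_itemsets(self, min_support):
        # min_support is a fraction of the transactions seen so far.
        threshold = min_support * self.n_transactions
        result = {}
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for item, child in node.children.items():
                if child.count >= threshold:
                    itemset = prefix + (item,)
                    result[itemset] = child.count
                    # Extensions of an infrequent prefix cannot be frequent,
                    # so only frequent children are explored further.
                    stack.append((itemset, child))
        return result
```

Because every subset of a transaction (up to max_len) is counted on arrival, changing the support threshold affects only the final traversal, not the counting step. This is a property of the sketch and is not offered as an explanation of the paper's reported results.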
Pages: 241-248
Number of pages: 8