IOMRA - A High Efficiency Frequent Itemset Mining Algorithm Based on the MapReduce Computation Model

Cited by: 0
Authors
Liu, Sheng-Hui [1]
Liu, Shi-Jia [1]
Chen, Shi-Xuan [2]
Yu, Kun-Ming [2]
Affiliations
[1] Harbin Univ Sci & Technol, Sch Software, Harbin, Heilongjiang, Peoples R China
[2] Chung Hua Univ, Dept Comp Sci & Informat Engn, Hsinchu, Taiwan
Keywords
Frequent Itemset Mining; Apriori; MapReduce; Hadoop
DOI
10.1109/CSE.2014.247
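Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract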
The goal of Frequent Itemset Mining (FIM) is to find the frequently occurring itemsets (subsets of items) in a large transaction database. Previous studies used multicore computing to sharply decrease the execution time of the Apriori algorithm. When a data set exceeds terabytes in size and a single host can no longer afford the large number of operations, the obvious solution is to connect a number of computers into a supercomputer to speed up execution. Some parallel Apriori algorithms based on the MapReduce framework have been proposed. With these algorithms, however, memory is quickly exhausted and communication cost rises sharply, which greatly reduces execution efficiency. In this paper, we present an improved Apriori algorithm that uses the length of each transaction to determine the maximum size of the merged candidate itemsets. By reducing the production of low-frequency itemsets in the Map function, memory exhaustion is ameliorated, greatly improving execution efficiency.
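The idea of bounding candidate size by transaction length can be illustrated with a minimal, single-process sketch. It is not the authors' IOMRA implementation, which the abstract does not specify. The function names (map_candidates, reduce_counts, mine) and the in-memory "map" and "reduce" steps are assumptions made for illustration only. A real Hadoop job would distribute these steps across nodes.

```python
from collections import Counter
from itertools import combinations


def map_candidates(transaction, k):
    """Map step (hypothetical): emit (itemset, 1) for each k-item subset.

    A transaction with fewer than k distinct items cannot contain any
    k-itemset, so it emits nothing for this pass.
    """
    items = sorted(set(transaction))
    if len(items) < k:
        return []
    return [(combo, 1) for combo in combinations(items, k)]


def reduce_counts(pairs, min_support):
    """Reduce step (hypothetical): sum counts and keep frequent itemsets."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def mine(transactions, min_support):
    """Run one map/reduce pass per itemset size k, starting at k = 1."""
    # No itemset can be larger than the longest transaction.
    max_len = max((len(set(t)) for t in transactions), default=0)
    frequent = {}
    for k in range(1, max_len + 1):
        pairs = [p for t in transactions for p in map_candidates(t, k)]
        level = reduce_counts(pairs, min_support)
        if not level:
            # Apriori property: if no k-itemset is frequent,
            # no larger itemset can be frequent either.
            break
        frequent.update(level)
    return frequent


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
    for itemset, support in sorted(mine(data, min_support=2).items()):
        print(itemset, support)
```

This sketch enumerates all k-subsets of each transaction and omits the usual Apriori candidate pruning against the previous level, so it only shows how transaction length limits what the Map step emits.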
Pages: 1290-1295
Number of pages: 6
Related papers
50 in total
  • [1] MapReduce Based Frequent Itemset Mining Algorithm on Stream Data
    Chaudhary, Hemant
    Yadav, Deepak Kumar
    Bhatnagar, Rajat
    Chandrasekhar, Uddagiri
    [J]. 2015 GLOBAL CONFERENCE ON COMMUNICATION TECHNOLOGIES (GCCT), 2015, : 586 - 591
  • [2] PFIMD: a parallel MapReduce-based algorithm for frequent itemset mining
    Mao, Yimin
    Geng, Junhao
    Mwakapesa, Deborah Simon
    Nanehkaran, Yaser Ahangari
    Chi, Zhang
    Deng, Xiaoheng
    Chen, Zhigang
    [J]. MULTIMEDIA SYSTEMS, 2021, 27 (04) : 709 - 722
  • [3] PFIMD: a parallel MapReduce-based algorithm for frequent itemset mining
    Mao Yimin
    Geng Junhao
    Deborah Simon Mwakapesa
    Yaser Ahangari Nanehkaran
    Zhang Chi
    Deng Xiaoheng
    Chen Zhigang
    [J]. Multimedia Systems, 2021, 27 : 709 - 722
  • [4] A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce
    Fumarola, Fabio
    Malerba, Donato
    [J]. 2014 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2014, : 335 - 342
  • [5] ParallelCharMax: An Effective Maximal Frequent Itemset Mining Algorithm Based on MapReduce Framework
    Gahar, Rania Mkhinini
    Arfaoui, Olfa
    Sassi Hidri, Minyar
    Ben Hadj-Alouane, Nejib
    [J]. 2017 IEEE/ACS 14TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2017, : 571 - 578
  • [6] Frequent Itemset Mining using Improved Apriori Algorithm with MapReduce
    Tribhuvan, Seema A.
    Gavai, Nitin R.
    Vasgi, Bharti P.
    [J]. 2017 INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION, CONTROL AND AUTOMATION (ICCUBEA), 2017,
  • [7] A Novel Nodesets-Based Frequent Itemset Mining Algorithm for Big Data using MapReduce
    Sivaiah, Borra
    Rao, Ramisetty Rajeswara
    [J]. INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2023, 14 (09) : 1051 - 1058
  • [8] MapReduce-based Frequent Itemset Mining for Analysis of Electronic Evidence
    Jiang, Xueqing
    Sun, Guozi
    [J]. 2013 EIGHTH INTERNATIONAL WORKSHOP ON SYSTEMATIC APPROACHES TO DIGITAL FORENSIC ENGINEERING (SADFE), 2013,
  • [9] SmartCache: An Optimized MapReduce Implementation of Frequent Itemset Mining
    Huang, Dachuan
    Song, Yang
    Routray, Ramani
    Qin, Feng
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON CLOUD ENGINEERING (IC2E 2015), 2015, : 16 - 25
  • [10] Fast Mining Algorithm of Frequent Itemset Based on Spark
    Ding, Jia-Man
    Li, Hai-Bin
    Deng, Bin
    Jia, Lian-Yin
    You, Jin-Guo
    [J]. Ruan Jian Xue Bao/Journal of Software, 2023, 34 (05) : 2446 - 2464