An Enhanced Apriori Algorithm Using Hybrid Data Layout Based on Hadoop for Big Data Processing

Cited by: 0
Authors
Rochd, Yassir [1]
Hafidi, Imad [1]
Affiliations
[1] Hassan I Univ, Natl Sch Appl Sci, IPOSI Lab, Khouribga, Morocco
Keywords
Data mining; Frequent itemset mining; Apriori; Big data; Hadoop;
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Frequent itemset mining is a data mining method used to find frequent patterns, which are utilized in prediction, association rule mining, classification, etc. The Apriori algorithm is an iterative method used to discover frequent itemsets in a transactional dataset. It scans the entire dataset in every iteration to find the frequent itemsets of each cardinality, which is acceptable for small data but impractical for big data. To avoid processing the whole dataset in every iteration, we present an algorithm called Hybrid Frequent Itemset Mining on Hadoop (HFIMH), which uses the vertical layout of the dataset. The vertical dataset carries the information needed to determine the support of every itemset, and set intersection is used to compute it. We compare the execution of HFIMH with another Hadoop-based implementation of the Apriori algorithm on different datasets. Experimental results demonstrate that our approach performs better than the compared implementation.
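The vertical-layout idea in the abstract can be illustrated with a minimal single-machine sketch. This is not the authors' HFIMH implementation and involves no Hadoop or MapReduce; the function names `to_vertical` and `frequent_itemsets`, the in-memory data structures, and the absolute `min_support` count are all assumptions made for illustration. It shows only how, after one pass that builds the item-to-transaction-id sets, the support of a larger itemset can be obtained by intersecting those sets instead of rescanning the transactions.

```python
from itertools import combinations


def to_vertical(transactions):
    """Map each item to the set of transaction ids (tidset) containing it."""
    vertical = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def frequent_itemsets(transactions, min_support):
    """Level-wise mining; support is the size of the intersected tidsets.

    min_support is an absolute transaction count (an assumption of this sketch).
    """
    # The only pass over the raw (horizontal) transactions.
    vertical = to_vertical(transactions)
    level = {
        frozenset([item]): tids
        for item, tids in vertical.items()
        if len(tids) >= min_support
    }
    result = {}
    while level:
        result.update({itemset: len(tids) for itemset, tids in level.items()})
        next_level = {}
        # Join pairs of frequent k-itemsets that differ in exactly one item.
        for (a, tids_a), (b, tids_b) in combinations(level.items(), 2):
            candidate = a | b
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            # Support of the candidate comes from a set intersection,
            # not from another scan of the dataset.
            tids = tids_a & tids_b
            if len(tids) >= min_support:
                next_level[candidate] = tids
        level = next_level
    return result


if __name__ == "__main__":
    demo = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(
        frequent_itemsets(demo, 3).items(), key=lambda kv: sorted(kv[0])
    ):
        print(sorted(itemset), support)
```

In a distributed setting the tidsets would have to be partitioned or shared across workers; how HFIMH does this is not described in the abstract, so the sketch leaves it out.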
Pages: 161 - 167
Number of pages: 7
Related papers
50 in total
  • [1] Improving Data Processing Speed on Large Datasets in a Hadoop Multi-node Cluster using Enhanced Apriori Algorithm
    Sundarakumar, M. R.
    Sharma, Ravi
    Fathima, S. K.
    Rajan, V. Gokul
    Dhayanithi, J.
    Marimuthu, M.
    Mohanraj, G.
    Sharma, Aditi
    Renoald, A. Johny
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (04) : 6161 - 6177
  • [2] Big Data Processing Using Hadoop and Spark: The Case of Meteorology Data
    Hussein, Eslam
    Sadiki, Ronewa
    Jafta, Yahlieel
    Sungay, Muhammad Mujahid
    Ajayi, Olasupo
    Bagula, Antoine
    [J]. E-INFRASTRUCTURE AND E-SERVICES FOR DEVELOPING COUNTRIES (AFRICOMM 2019), 2020, 311 : 180 - 185
  • [4] Discussion and Improvement of Apriori Algorithm of Data Mining Based on Hadoop Platform
    Zhao, Mengyang
    Tang, Bo
    Yang, Le
    [J]. PROCEEDINGS OF THE 2017 5TH INTERNATIONAL CONFERENCE ON FRONTIERS OF MANUFACTURING SCIENCE AND MEASURING TECHNOLOGY (FMSMT 2017), 2017, 130 : 183 - 187
  • [5] Enhanced Tele ECG System Using Hadoop Framework To Deal With Big Data Processing
    Ma'sum, M. Anwar
    Jatmiko, Wisnu
    Suhartanto, Heru
    [J]. 2016 INTERNATIONAL WORKSHOP ON BIG DATA AND INFORMATION SECURITY (IWBIS), 2016, : 121 - 126
  • [6] AN ALGORITHM OF APRIORI BASED ON MEDICAL BIG DATA AND CLOUD COMPUTING
    Cui, Xiaoyan
    Yang, Shimeng
    Wang, Daming
    [J]. PROCEEDINGS OF 2016 4TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (IEEE CCIS 2016), 2016, : 361 - 365
  • [7] Processing of Big Educational Data in the Cloud Using Apache Hadoop
    Machova, Renata
    Komarkova, Jitka
    Lnenicka, Martin
    [J]. INTERNATIONAL CONFERENCE ON INFORMATION SOCIETY (I-SOCIETY 2016), 2016, : 46 - 49
  • [8] Mining Algorithm for Association Rules in Big Data Based on Hadoop
    Fu, Chunhua
    Wang, Xiaojing
    Zhang, Lijun
    Qiao, Liying
    [J]. ADVANCES IN MATERIALS, MACHINERY, ELECTRONICS II, 2018, 1955
  • [9] Parallel Implementation of PrePost Algorithm Based on Hadoop for Big Data
    Rochd, Yassir
    Hafidi, Imad
    [J]. 2018 IEEE 5TH INTERNATIONAL CONGRESS ON INFORMATION SCIENCE AND TECHNOLOGY (IEEE CIST'18), 2018, : 24 - 28
  • [10] Efficient Big Data Processing in Hadoop MapReduce
    Dittrich, Jens
    Quiane-Ruiz, Jorge-Arnulfo
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (12) : 2014 - 2015