An Enhanced Apriori Algorithm Using Hybrid Data Layout Based on Hadoop for Big Data Processing

Cited by: 0
Authors
Rochd, Yassir [1]
Hafidi, Imad [1]
Affiliations
[1] Hassan I Univ, Natl Sch Appl Sci, IPOSI Lab, Khouribga, Morocco
Keywords
Data mining; Frequent itemset mining; Apriori; Big data; Hadoop;
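DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract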
Frequent itemset mining is a data mining method used to find frequent patterns, which are applied in prediction, association rule mining, classification, and similar tasks. The Apriori algorithm is an iterative method for discovering frequent itemsets in a transactional dataset. It scans the entire dataset in every iteration to find the frequent itemsets of each cardinality, which is acceptable for small data but impractical for big data. To avoid processing the whole dataset in every iteration, we present an algorithm called Hybrid Frequent Itemset Mining on Hadoop (HFIMH), which uses the vertical layout of the dataset. The vertical layout carries the information needed to determine the support of every itemset, and that support is computed by set intersection. We compare the performance of HFIMH with another Hadoop-based implementation of the Apriori algorithm on different datasets. Experimental results show that our approach performs better.
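The idea of computing support by set intersection on a vertical layout can be illustrated with a minimal single-machine sketch. This is not the HFIMH implementation and involves no Hadoop; the function names `to_vertical` and `frequent_itemsets`, the absolute-count `min_support` parameter, and the toy data are assumptions made purely for illustration. Each item is mapped to the set of transaction ids that contain it (its tidset), and the support of a larger itemset is the size of the intersection of the tidsets of two of its subsets, so the original transactions are read only once.

```python
from itertools import combinations


def to_vertical(transactions):
    """Map each item to its tidset: the ids of the transactions containing it."""
    vertical = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    Illustrative assumption: min_support is an absolute transaction count.
    """
    vertical = to_vertical(transactions)  # the only pass over the raw data

    # Level 1: frequent single items together with their tidsets.
    current = {
        frozenset([item]): tids
        for item, tids in vertical.items()
        if len(tids) >= min_support
    }

    result = {}
    while current:
        for itemset, tids in current.items():
            result[itemset] = len(tids)

        # Join pairs of frequent k-itemsets that differ in exactly one item.
        next_level = {}
        for (a, tids_a), (b, tids_b) in combinations(current.items(), 2):
            candidate = a | b
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            # Support of the candidate is the size of the tidset intersection.
            tids = tids_a & tids_b
            if len(tids) >= min_support:
                next_level[candidate] = tids
        current = next_level

    return result
```

For example, calling `frequent_itemsets([{"a", "b"}, {"a", "b", "c"}, {"a", "c"}], 2)` reports `{"a", "b"}` and `{"a", "c"}` as frequent pairs, each obtained by intersecting two single-item tidsets instead of rescanning the transactions.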
Pages: 161-167
Page count: 7