Implementation of an Improved Algorithm for Frequent Itemset Mining using Hadoop

Cited by: 0
Authors
Agarwal, Ruchi [1]
Singh, Sunny [1]
Vats, Satvik [1]
Affiliation
[1] Sharda Univ, Dept Comp Sci & Engn, Plot 32,34,Knowledge Pk 3, Greater Noida, UP, India
Keywords
Frequent Item-set Mining; Big Data; MapReduce; Distributed Computational Environment;
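DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract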
Finding frequent item-sets in large heterogeneous databases in minimal time is considered one of the most important data mining problems. Various algorithms have been proposed to speed up execution. Most recently proposed algorithms focus on parallelizing the workload across a large number of machines in a distributed computational environment such as the MapReduce framework. A few of them can determine the appropriate number of machines required, taking workload balancing and execution efficiency into account. However, they cannot determine in advance the exact number of iterations required to find the frequent item-sets in a large dataset using iterative sampling. In this paper, we propose an improved and compact algorithm (ICA) for finding frequent item-sets in minimal time using a distributed computational environment. ICA can also determine the exact number of internal iterations required for any large dataset, whether the data is structured or unstructured.
Pages: 13-18
Number of pages: 6