Top-K Miner: top-K identical frequent itemsets discovery without user support threshold

Authors
Saif-Ur-Rehman
Jawad Ashraf
Asad Habib
Abdus Salam
Affiliations
[1] Kohat University of Science and Technology, Institute of Information Technology
[2] Abasyn University, Computer Science Department
Source
Knowledge and Information Systems, 2016, 48 (03)
Keywords
Frequent itemsets; Association rules; Identical frequent itemsets (IFIs); Candidate-itemsets-search tree
Abstract
Frequent itemsets (FIs) mining is a prime research area in association rule mining. The customary techniques find FIs or their variants on the basis of either a support threshold value or two generic parameters, i.e., N (topmost itemsets) and $K_{\mathrm{max}}$ (size of the itemsets). However, users are unable to mine the exact desired number of patterns because they tune these approaches with approximate parameter settings. We propose a novel technique, top-K Miner, that does not require setting the support threshold, N, or $K_{\mathrm{max}}$ values. Top-K Miner requires the user to specify only a single parameter, K, to find the desired number of frequent patterns, called identical frequent itemsets (IFIs). Top-K Miner uses a novel candidate production algorithm called the join-FI algorithm. This algorithm uses frequent 2-itemsets to yield one or more candidate itemsets of arbitrary size. The join-FI algorithm follows a bottom-up recursive technique to construct a candidate-itemsets-search tree. Finally, the generated candidate itemsets are processed by the Maintain-Top-K_List algorithm to produce the Top-K_List of IFIs. The proposed top-K Miner algorithm significantly outperforms the generic benchmark techniques even when they run with ideal parameter settings.
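To make the single-parameter idea concrete, the following is a minimal brute-force sketch, not the join-FI or Maintain-Top-K_List algorithms, whose details the abstract does not give. It assumes that "identical frequent itemsets" means itemsets sharing the same support count, and that the top-K list holds the K highest distinct support values with their itemsets; both points are assumptions. The function name `top_k_support_groups` and the toy transactions are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def top_k_support_groups(transactions, k):
    """Return the k highest support counts, each with the itemsets that share it.

    Illustration only: it enumerates every subset of every transaction,
    which is exponential in transaction length and suits toy data only.
    """
    support = defaultdict(int)
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                support[itemset] += 1

    # Assumption: itemsets with an identical support count form one group.
    groups = defaultdict(list)
    for itemset, count in support.items():
        groups[count].append(itemset)

    # No support threshold is supplied; only k limits the output.
    top_counts = sorted(groups, reverse=True)[:k]
    return [(count, sorted(groups[count])) for count in top_counts]


transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
result = top_k_support_groups(transactions, k=2)
print(result)
```

With k=2 on this toy data, the first group contains the three single items (support 4) and the second contains the three pairs (support 3). The caller never chooses a support threshold; the cut-off falls out of K.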
Pages: 741 - 762
Number of pages: 21
Related papers
  • [1] Top-K Miner: top-K identical frequent itemsets discovery without user support threshold
    Saif-Ur-Rehman
    Ashraf, Jawad
    Habib, Asad
    Salam, Abdus
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2016, 48 (03) : 741 - 762
  • [2] Mining top-k frequent patterns without minimum support threshold
    Salam, Abdus
    Khayal, M. Sikandar Hayat
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 30 (01) : 57 - 86
  • [3] Mining top-K frequent itemsets through progressive sampling
    Pietracaprina, Andrea
    Riondato, Matteo
    Upfal, Eli
    Vandin, Fabio
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2010, 21 (02) : 310 - 326
  • [4] Finding Top-k Fuzzy Frequent Itemsets from Databases
    Li, Haifeng
    Wang, Yue
    Zhang, Ning
    Zhang, Yuejin
    [J]. DATA MINING AND BIG DATA, DMBD 2017, 2017, 10387 : 22 - 30
  • [5] Efficient algorithms of mining top-k frequent closed itemsets
    Lan Yongjie
    Qiu Yong
    [J]. ICEMI 2007: PROCEEDINGS OF 2007 8TH INTERNATIONAL CONFERENCE ON ELECTRONIC MEASUREMENT & INSTRUMENTS, VOL II, 2007, : 551 - 554
  • [6] Efficient incremental mining of top-K frequent closed itemsets
    Pietracaprina, Andrea
    Vandin, Fabio
    [J]. DISCOVERY SCIENCE, PROCEEDINGS, 2007, 4755 : 275 - +
  • [7] Mining top-K frequent itemsets from data streams
    Wong, Raymond Chi-Wing
    Fu, Ada Wai-Chee
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2006, 13 (02) : 193 - 217
  • [8] Top-k-FCI: mining top-k frequent closed itemsets in data streams
    Li, Jun
    Gong, Sen
    [J]. Journal of Computational Information Systems, 2011, 7 (13) : 4819 - 4826