Discovering frequent itemsets in the presence of highly frequent items

Cited by: 0
Authors
Groth, DP [1]
Robertson, EL [1]
Affiliations
[1] Indiana Univ, Sch Informat, Bloomington, IN 47405 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents new techniques for focusing the discovery of frequent itemsets within large, dense datasets containing highly frequent items. The existence of highly frequent items adds significantly to the cost of computing the complete set of frequent itemsets. Our approach allows such items to be excluded during the candidate generation phase of the Apriori algorithm. Afterwards, the highly frequent items can be reintroduced via an inferencing framework, which makes it possible to generate frequent itemsets without counting their frequency. We demonstrate the use of these new techniques within the well-studied framework of the Apriori algorithm. Furthermore, we provide empirical results using our techniques on both synthetic and real datasets; both are relevant because the real datasets exhibit statistical characteristics different from the probabilistic assumptions behind the synthetic data. The real data come from the U.S. Census.
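The exclusion step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `frequent_itemsets`, the `max_item_support` cutoff, and the toy transactions are all assumptions made for illustration, and the inference-based reintroduction of the excluded items is left out because the abstract does not specify how it works.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_item_support=1.0):
    """Apriori sketch that sets aside items more frequent than max_item_support.

    Hypothetical helper for illustration only; the cutoff parameter is an
    assumption, not something taken from the paper.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    # Pass 1: count the support of every single item.
    counts = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1

    # Highly frequent items are excluded from candidate generation.
    excluded = {i for i, c in counts.items() if c / n > max_item_support}

    # Frequent 1-itemsets built from the remaining items.
    current = {
        frozenset([i]): c / n
        for i, c in counts.items()
        if c / n >= min_support and i not in excluded
    }
    result = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: scan the transactions for each surviving candidate.
        current = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                current[c] = support
        result.update(current)
        k += 1

    return result, excluded


# Toy data (made up): "x" occurs in every transaction.
toy = [{"a", "b", "x"}, {"a", "b", "x"}, {"a", "c", "x"}, {"b", "c", "x"}]
itemsets, skipped = frequent_itemsets(toy, min_support=0.5, max_item_support=0.9)
```

In this toy run the item `x` is set aside, so no candidate containing it is ever generated or counted, which is the cost saving the abstract points to.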
Pages: 251 - 264
Page count: 14
Related papers
50 in total
  • [1] Automatic discovery of locally frequent itemsets in the presence of highly frequent itemsets
    Bodon, Ferenc
    Kouris, Ioannis N.
    Makris, Christos H.
    Tsakalidis, Athanasios K.
    [J]. INTELLIGENT DATA ANALYSIS, 2005, 9 (01) : 83 - 104
  • [2] A Frequent Item Graph Approach for Discovering Frequent Itemsets
    Kumar, A. V. Senthil
    Wahidabanu, R. S. D.
    [J]. 2008 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER THEORY AND ENGINEERING, 2008, : 952 - +
  • [3] And code algorithm for discovering frequent itemsets
    Zhou, HY
    Zhang, Y
    [J]. THIRD INTERNATIONAL CONFERENCE ON ELECTRONIC COMMERCE ENGINEERING: DIGITAL ENTERPRISES AND NONTRADITIONAL INDUSTRIALIZATION, 2003, : 569 - 572
  • [4] CloseMiner: Discovering frequent closed itemsets using frequent closed tidsets
    Singh, NG
    Singh, SR
    Mahanta, AK
    [J]. Fifth IEEE International Conference on Data Mining, Proceedings, 2005, : 633 - 636
  • [5] Discovering frequent closed itemsets for association rules
    Pasquier, N
    Bastide, Y
    Taouil, R
    Lakhal, L
    [J]. DATABASE THEORY - ICDT'99, 1999, 1540 : 398 - 416
  • [6] A Dynamic Approach for Discovering Maximal Frequent itemsets
    Geetha, M.
    D'Souza, R. J.
    [J]. 2009 INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND TECHNOLOGY, VOL II, PROCEEDINGS, 2009, : 62 - +
  • [7] A Hybrid Method for Discovering Maximal Frequent Itemsets
    Chen, Fu-zan
    Li, Min-qiang
    [J]. FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 2, PROCEEDINGS, 2008, : 546 - 550
  • [8] Discovering frequent itemsets using transaction identifiers
    Chai, DJ
    Choi, HY
    Hwang, BY
    [J]. FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, PT 1, PROCEEDINGS, 2005, 3613 : 1175 - 1184
  • [9] Discovering frequent itemsets by support approximation and itemset clustering
    Jea, Kuen-Fang
    Chang, Ming-Yuan
    [J]. DATA & KNOWLEDGE ENGINEERING, 2008, 65 (01) : 90 - 107
  • [10] An algebraic semigroup method for discovering maximal frequent itemsets
    Liu, Jiang
    Li, Jing
    Ni, Feng
    Xia, Xiang
    Li, Shunlong
    Dong, Wenhui
    [J]. OPEN MATHEMATICS, 2022, 20 (01) : 1432 - 1443