Discovering frequent itemsets in the presence of highly frequent items

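Authors
Groth, DP [1]
Robertson, EL [1]
Affiliation
[1] Indiana Univ, Sch Informat, Bloomington, IN 47405 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract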
This paper presents new techniques for focusing the discovery of frequent itemsets within large, dense datasets containing highly frequent items. The presence of highly frequent items adds significantly to the cost of computing the complete set of frequent itemsets. Our approach allows such items to be excluded during the candidate generation phase of the Apriori algorithm. Afterwards, the highly frequent items can be reintroduced via an inferencing framework, which makes it possible to generate frequent itemsets without counting their frequency. We demonstrate these techniques within the well-studied framework of the Apriori algorithm. Furthermore, we provide empirical results for our techniques on both synthetic and real datasets; both are relevant because the real datasets exhibit statistical characteristics that differ from the probabilistic assumptions behind the synthetic data. The real data came from the U.S. Census.
Pages: 251-264
Number of pages: 14
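Illustrative sketch
The abstract describes two steps: leaving highly frequent items out of Apriori candidate generation, then reintroducing them by inference rather than by counting. The Python sketch below is one possible reading of that idea and is not the authors' inferencing framework, which the abstract does not spell out. As a stand-in it uses the standard lower bound count(X ∪ {h}) ≥ count(X) + count(h) − N for N transactions. The function names `apriori` and `reintroduce`, the thresholds `min_count` and `high_count`, and the toy transactions are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_count, excluded=frozenset()):
    """Plain Apriori over items not in `excluded`; returns {itemset: count}."""
    # Drop excluded items up front so they never enter candidate generation.
    rows = [frozenset(t) - excluded for t in transactions]
    counts = {}
    for row in rows:
        for item in row:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = {}
        for c in candidates:
            n = sum(1 for row in rows if c <= row)
            if n >= min_count:
                level[c] = n
        frequent.update(level)
        k += 1
    return frequent


def reintroduce(frequent, high_counts, n_rows, min_count):
    """Infer frequent supersets that add ONE highly frequent item, without counting.

    Assumption: uses the bound count(X | {h}) >= count(X) + count(h) - n_rows.
    Itemsets whose bound falls below min_count are left undecided, not rejected.
    """
    inferred = {}
    for itemset, count in frequent.items():
        for item, item_count in high_counts.items():
            lower = count + item_count - n_rows
            if lower >= min_count:
                inferred[itemset | {item}] = lower  # guaranteed lower bound
    return inferred


# Toy data (hypothetical): item "a" occurs in 9 of 10 transactions.
transactions = [
    {"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"a", "c"},
    {"a", "b", "c"}, {"a", "d"}, {"b", "c"}, {"a", "b", "c"}, {"a", "b", "c"},
]
n_rows = len(transactions)
min_count = 6    # minimum support, as an absolute count
high_count = 9   # items at or above this count are treated as highly frequent

single = {}
for t in transactions:
    for item in t:
        single[item] = single.get(item, 0) + 1
high_counts = {i: c for i, c in single.items() if c >= high_count}

frequent = apriori(transactions, min_count, excluded=frozenset(high_counts))
inferred = reintroduce(frequent, high_counts, n_rows, min_count)
```

In this toy run only "a" is excluded, so Apriori counts itemsets over "b", "c" and "d" alone. The bound then certifies {a, b}, {a, c} and {a, b, c} as frequent without another pass over the data. The values stored in `inferred` are lower bounds on support, not exact counts.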