Sampling large databases for association rules

Cited by: 0
Author
Toivonen, H
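Institution
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract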
Discovery of association rules is an important database mining problem. Current algorithms for finding association rules require several passes over the analyzed database, and obviously the role of I/O overhead is very significant for very large databases. We present new algorithms that reduce the database activity considerably. The idea is to pick a random sample, to find using this sample all association rules that probably hold in the whole database, and then to verify the results with the rest of the database. The algorithms thus produce exact association rules, not approximations based on a sample. The approach is, however, probabilistic, and in those rare cases where our sampling method does not produce all association rules, the missing rules can be found in a second pass. Our experiments show that the proposed algorithms can find association rules very efficiently in only one database pass.
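The sample-then-verify idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual algorithm: the function names (`frequent_itemsets`, `sample_then_verify`), the use of a lowered support threshold on the sample, and the simple Apriori-style miner are all hypothetical choices made for this example. The sketch only covers frequent itemset discovery (the step from which rules are derived) and does not detect itemsets that the sample missed, which the abstract says may require a second pass.

```python
import random
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search over a list of frozensets.

    Returns a dict mapping each frequent itemset to its relative support.
    """
    n = len(transactions)
    result = {}
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        result.update(level)
        # Build size-(k+1) candidates whose size-k subsets are all frequent.
        k += 1
        next_candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                next_candidates.add(union)
        candidates = list(next_candidates)
    return result


def sample_then_verify(database, min_support, sample_size, lowered_support, seed=0):
    """Mine a random sample, then verify the candidates in one full pass.

    lowered_support is an assumed safety margin: mining the sample with a
    threshold below min_support makes it less likely that an itemset
    frequent in the whole database is missed.
    """
    rng = random.Random(seed)
    sample = rng.sample(database, sample_size)
    candidates = frequent_itemsets(sample, lowered_support)

    # Single pass over the full database: exact counts for every candidate.
    counts = dict.fromkeys(candidates, 0)
    for t in database:
        for c in counts:
            if c <= t:
                counts[c] += 1
    n = len(database)
    return {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}


if __name__ == "__main__":
    db = [frozenset("ab"), frozenset("abc"), frozenset("ac"), frozenset("b")] * 25
    exact = sample_then_verify(db, min_support=0.5, sample_size=40, lowered_support=0.4)
    for itemset, support in sorted(exact.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), support)
```

Because every reported support is counted on the full database, the itemsets returned are exact rather than sample estimates; the sample only decides which candidates get counted.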
Pages: 134 - 145
Number of pages: 12
Related papers
50 in total
  • [1] An algorithm of association rules mining in large databases based on sampling
    Liu, Zhi
    Sun, Tianhong
    Sang, Guoming
    [J]. International Journal of Database Theory and Application, 2013, 6 (06) : 95 - 104
  • [2] An efficient sampling approach for mining all association rules in large databases
    Department of Computer Science and Engineering, Shiraz University, Shiraz, Iran
    [J]. Iran. J. Electr. Comput. Eng., 2008, 1 (73-78):
  • [3] Association rules in very large databases
    Unknown
    [J]. ASSOCIATION RULE MINING: MODELS AND ALGORITHMS, 2002, 2307 : 161 - 198
  • [4] Discovering Association Rules in Large, Dense Databases
    Teusan, Tudor
    Nachouki, Gilles
    Briand, Henri
    Philippe, Jacques
    [J]. LECTURE NOTES IN COMPUTER SCIENCE, 2000, 1910 : 638 - 645
  • [5] Discovering Association Rules Change from Large Databases
    Ye, Feiyue
    Liu, Jixue
    Qian, Jin
    Shi, Yuxi
    [J]. ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, PT I, 2011, 7002 : 388 - +
  • [6] Association rules with opposite items in large categorical databases
    Wei, Q
    Chen, GQ
    [J]. FLEXIBLE QUERY ANSWERING SYSTEMS: RECENT ADVANCES, 2001 : 507 - 514
  • [7] Parallel algorithms for mining association rules in large databases
    Kudo, T
    Ashihara, H
    Shimizu, K
    [J]. INTELLIGENT SYSTEMS, 1997 : 125 - 128
  • [8] Efficient mining of categorized association rules in large databases
    Tseng, SM
    [J]. SMC 2000 CONFERENCE PROCEEDINGS: 2000 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN & CYBERNETICS, VOL 1-5, 2000 : 3606 - 3610
  • [9] Mining multiple-level association rules in large databases
    Han, JW
    Fu, WJ
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1999, 11 (05) : 798 - 805
  • [10] PPCI Algorithm for Mining Temporal Association Rules in Large Databases
    Pandey, Anjana
    Pardasani, K.
    [J]. JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT, 2009, 8 (04) : 345 - 352