An information-theoretic approach to quantitative association rule mining

Authors
Yiping Ke
James Cheng
Wilfred Ng
Affiliations
[1] The Hong Kong University of Science and Technology, Department of Computer Science and Engineering
Source
Knowledge and Information Systems, 2008, 16 (02)
Keywords
Quantitative databases; Association rules; Information-theoretic approach; Mutual information
DOI
Not available
Abstract
Quantitative association rule (QAR) mining has been recognized as an influential research problem over the last decade, owing to the popularity of quantitative databases and the usefulness of association rules in real life. Unlike boolean association rules (BARs), which consider only boolean attributes, QARs consist of quantitative attributes, which carry much richer information than boolean attributes. However, combining these quantitative attributes with their value intervals always generates an explosively large number of itemsets, which severely degrades mining efficiency. In this paper, we propose an information-theoretic approach that avoids generating unrewarding combinations of attributes and their value intervals during the mining process. We study the mutual information between the attributes in a quantitative database and devise a normalization of the mutual information that makes it applicable in the context of QAR mining. To indicate strong informative relationships among the attributes, we construct a mutual information graph (MI graph), whose edges are the attribute pairs whose normalized mutual information is no less than a predefined information threshold. We find that the cliques in the MI graph represent a majority of the frequent itemsets. We also show that frequent itemsets that do not form a clique in the MI graph are those whose attributes are not informatively correlated with each other. By utilizing the cliques in the MI graph, we devise an efficient algorithm that significantly reduces the number of value intervals of the attribute sets to be joined during the mining process. Extensive experiments show that our algorithm speeds up the mining process by up to two orders of magnitude. Most importantly, we are able to obtain most of the high-confidence QARs, whereas the QARs not returned by our algorithm (MIC) are shown to be less interesting.
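The MI graph idea in the abstract can be illustrated with a small Python sketch. This is not the authors' MIC algorithm: the abstract does not give the normalization formula, so the sketch assumes mutual information divided by the entropy of the first attribute, and it assumes an undirected edge is added only when the normalized value meets the threshold in both directions. The function names, the toy table, and the threshold of 0.5 are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(column):
    """Shannon entropy (in bits) of a discretized attribute column."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())


def mutual_information(x, y):
    """Mutual information (in bits) between two equally long columns."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def normalized_mi(x, y):
    """ASSUMED normalization: I(x; y) / H(x). The abstract does not state it."""
    hx = entropy(x)
    return mutual_information(x, y) / hx if hx > 0 else 0.0


def build_mi_graph(table, threshold):
    """Adjacency sets; an edge requires both directions to meet the threshold
    (an assumption made here to keep the graph undirected)."""
    adj = {name: set() for name in table}
    for a, b in combinations(table, 2):
        score = min(normalized_mi(table[a], table[b]),
                    normalized_mi(table[b], table[a]))
        if score >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Hypothetical toy table with already-discretized value intervals.
table = {
    "age": ["young", "young", "old", "old", "young", "young", "old", "old"],
    "income": ["low", "low", "high", "high", "low", "low", "high", "high"],
    "noise": ["a", "b", "a", "b", "a", "b", "a", "b"],
}
graph = build_mi_graph(table, threshold=0.5)
cliques = maximal_cliques(graph)
print(cliques)
```

In this toy table, `age` and `income` determine each other while `noise` is independent of both, so only the `age`/`income` pair forms an edge. A clique-guided miner would then join value intervals only within attribute sets that form a clique, which is the kind of pruning the abstract describes.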
Pages: 213 - 244
Page count: 31
Related papers
  • [1] An information-theoretic approach to quantitative association rule mining
    Ke, Yiping
    Cheng, James
    Ng, Wilfred
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2008, 16 (02) : 213 - 244
  • [2] Using information-theoretic measures to assess association rule interestingness
    Blanchard, J
    Guillet, F
    Gras, R
    Briand, H
    [J]. FIFTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2005, : 66 - 73
  • [3] Information-theoretic fuzzy approach to data reliability and data mining
    Maimon, O
    Kandel, A
    Last, M
    [J]. FUZZY SETS AND SYSTEMS, 2001, 117 (02) : 183 - 194
  • [4] An Information-Theoretic Approach for Unsupervised Topic Mining in Large Text Collections
    Ramirez, Eduardo H.
    Brena, Ramon F.
    [J]. 2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 1, 2009, : 331 - 334
  • [5] An Information-theoretic Approach to Distribution Shifts
    Federici, Marco
    Tomioka, Ryota
    Forre, Patrick
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Information-Theoretic Approach to Bidirectional Scaling
    Boso, Francesca
    Tartakovsky, Daniel M.
    [J]. WATER RESOURCES RESEARCH, 2018, 54 (07) : 4916 - 4928
  • [7] An information-theoretic approach to band selection
    Ahlberg, J
    Renhorn, I
    [J]. Targets and Backgrounds XI: Characterization and Representation, 2005, 5811 : 15 - 23
  • [8] Information-theoretic approach to steganographic systems
    Ryabko, Boris
    Ryabko, Daniil
    [J]. 2007 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY PROCEEDINGS, VOLS 1-7, 2007, : 2461 - +
  • [9] An information-theoretic approach to steganography and watermarking
    Mittelholzer, T
    [J]. INFORMATION HIDING, PROCEEDINGS, 2000, 1768 : 1 - 16
  • [10] An information-theoretic approach to interactions in images
    Boccignone, G
    Ferraro, M
    [J]. SPATIAL VISION, 1999, 12 (03) : 345 - 362