An information-theoretic approach to quantitative association rule mining

Authors
Yiping Ke
James Cheng
Wilfred Ng
Affiliation
[1] The Hong Kong University of Science and Technology, Department of Computer Science and Engineering
Keywords
Quantitative databases; Association rules; Information-theoretic approach; Mutual information
DOI
Not available
Abstract
Quantitative association rule (QAR) mining has been recognized as an influential research problem over the last decade, owing to the popularity of quantitative databases and the usefulness of association rules in real life. Unlike boolean association rules (BARs), which consider only boolean attributes, QARs consist of quantitative attributes, which carry much richer information than boolean attributes. However, combining these quantitative attributes with their value intervals always generates an explosively large number of itemsets, thereby severely degrading mining efficiency. In this paper, we propose an information-theoretic approach that avoids generating unrewarding combinations of both the attributes and their value intervals during the mining process. We study the mutual information between the attributes in a quantitative database and devise a normalization of the mutual information to make it applicable in the context of QAR mining. To indicate the strong informative relationships among the attributes, we construct a mutual information graph (MI graph), whose edges are the attribute pairs with normalized mutual information no less than a predefined information threshold. We find that the cliques in the MI graph represent a majority of the frequent itemsets. We also show that frequent itemsets that do not form a clique in the MI graph are those whose attributes are not informatively correlated with each other. By utilizing the cliques in the MI graph, we devise an efficient algorithm that significantly reduces the number of value intervals of the attribute sets to be joined during the mining process. Extensive experiments show that our algorithm speeds up the mining process by up to two orders of magnitude. Most importantly, we are able to obtain most of the high-confidence QARs, whereas the QARs that are not returned by our algorithm, MIC, are shown to be less interesting.
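The MI-graph construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the normalization, so the sketch assumes I(X;Y) / min(H(X), H(Y)); it also assumes the quantitative attributes have already been discretized into interval labels. All function names, the toy columns, and the threshold value 0.5 are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned columns."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mi(xs, ys):
    """ASSUMED normalization (not from the paper): I(X;Y) / min(H(X), H(Y))."""
    denom = min(entropy(xs), entropy(ys))
    return mutual_information(xs, ys) / denom if denom > 0 else 0.0


def build_mi_graph(columns, threshold):
    """Edges are attribute pairs whose normalized MI is >= threshold."""
    edges = set()
    for a, b in combinations(sorted(columns), 2):
        if normalized_mi(columns[a], columns[b]) >= threshold:
            edges.add((a, b))
    return edges


def cliques_of_size(attributes, edges, k):
    """Brute-force k-clique enumeration; adequate only for few attributes."""
    return [
        group
        for group in combinations(sorted(attributes), k)
        if all(pair in edges for pair in combinations(group, 2))
    ]


# Hypothetical toy data: each value is an interval (bin) label.
# 'age_bin' and 'income_bin' move together; 'noise' is nearly independent.
columns = {
    "age_bin": [0, 0, 1, 1, 2, 2, 0, 1],
    "income_bin": [0, 0, 1, 1, 2, 2, 0, 1],
    "noise": [0, 1, 0, 1, 0, 1, 1, 0],
}
edges = build_mi_graph(columns, threshold=0.5)
print(edges)
print(cliques_of_size(columns, edges, 2))
```

In this toy example only the pair (age_bin, income_bin) clears the threshold, so a clique-guided miner would join value intervals for that pair and skip combinations involving the noise attribute.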
Pages: 213-244
Number of pages: 31