An information-theoretic approach to quantitative association rule mining

Authors
Yiping Ke
James Cheng
Wilfred Ng
Affiliation
[1] The Hong Kong University of Science and Technology, Department of Computer Science and Engineering
Keywords
Quantitative databases; Association rules; Information-theoretic approach; Mutual information
Abstract
Quantitative association rule (QAR) mining has been recognized as an influential research problem over the last decade, owing to the popularity of quantitative databases and the usefulness of association rules in real life. Unlike boolean association rules (BARs), which consider only boolean attributes, QARs involve quantitative attributes, which carry much richer information. However, combining these quantitative attributes with their value intervals always gives rise to an explosively large number of itemsets, which severely degrades mining efficiency. In this paper, we propose an information-theoretic approach that avoids generating unrewarding combinations of attributes and their value intervals during the mining process. We study the mutual information between the attributes in a quantitative database and devise a normalization of the mutual information that makes it applicable to QAR mining. To capture the strong informative relationships among the attributes, we construct a mutual information graph (MI graph), whose edges are the attribute pairs with normalized mutual information no less than a predefined information threshold. We find that the cliques in the MI graph represent a majority of the frequent itemsets. We also show that the frequent itemsets that do not form a clique in the MI graph are those whose attributes are not informatively correlated with each other. By exploiting the cliques in the MI graph, we devise an efficient algorithm that significantly reduces the number of value intervals of the attribute sets to be joined during the mining process. Extensive experiments show that our algorithm speeds up the mining process by up to two orders of magnitude. Most importantly, we are able to obtain most of the high-confidence QARs, whereas the QARs that are not returned by our algorithm (MIC) are shown to be less interesting.
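The MI graph construction described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the exact normalization, so dividing by the smaller of the two attribute entropies is an assumption, as are the function names, the toy data, and the premise that each attribute has already been discretized into interval labels. Clique enumeration uses the standard Bron-Kerbosch recursion.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a discretized attribute."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discretized attributes."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def normalized_mi(xs, ys):
    """ASSUMPTION: normalize by the smaller entropy; the paper's
    normalization may differ."""
    h = min(entropy(xs), entropy(ys))
    return mutual_information(xs, ys) / h if h > 0 else 0.0


def build_mi_graph(columns, threshold):
    """Connect attribute pairs whose normalized MI is >= threshold."""
    graph = {name: set() for name in columns}
    for a, b in combinations(columns, 2):
        if normalized_mi(columns[a], columns[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


# Hypothetical toy data: each list holds interval labels for one attribute.
columns = {
    "age":    [0, 0, 1, 1, 0, 0, 1, 1],
    "income": [0, 0, 1, 1, 0, 0, 1, 1],  # fully determined by "age"
    "noise":  [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the other two
}
graph = build_mi_graph(columns, threshold=0.5)
cliques = maximal_cliques(graph)
```

In this sketch, only attribute sets that form a clique in `graph` would be candidates for joining value intervals, which is the pruning idea the abstract describes.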
Pages: 213-244
Number of pages: 31