Correlated pattern mining in quantitative databases

Cited by: 15
Authors
Ke, Yiping [1]
Cheng, James [1]
Ng, Wilfred [1]
Affiliation
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China
Source
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2008 / Vol. 33 / Issue 03
Keywords
algorithms; quantitative databases; correlated patterns; information-theoretic approach; mutual information
DOI
10.1145/1386118.1386120
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We study mining correlations from quantitative databases and show that this is a more effective approach than mining associations to discover useful patterns. We propose the novel notion of quantitative correlated pattern (QCP), which is founded on two formal concepts, mutual information and all-confidence. We first devise a normalization on mutual information and apply it to the problem of QCP mining to capture the dependency between the attributes. We further adopt all-confidence as a quality measure to ensure, at a finer granularity, the dependency between the attributes with specific quantitative intervals. We also propose an effective supervised method that combines the consecutive intervals of the quantitative attributes based on mutual information, such that the interval-combining is guided by the dependency between the attributes. We develop an algorithm, QCoMine, to mine QCPs efficiently by utilizing normalized mutual information and all-confidence to perform bilevel pruning. We also identify the redundancy existing in the set of QCPs and propose effective techniques to eliminate the redundancy. Our extensive experiments on both real and synthetic datasets verify the efficiency of QCoMine and the quality of the QCPs. The experimental results also justify the effectiveness of our proposed techniques for redundancy elimination. To further demonstrate the usefulness and the quality of QCPs, we study an application of QCPs to classification. We demonstrate that the classifier built on the QCPs achieves higher classification accuracy than the state-of-the-art classifiers built on association rules.
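The two measures named in the abstract can be illustrated with a minimal sketch. The abstract does not state the paper's exact normalization, so this sketch assumes one plausible choice: dividing mutual information by the larger of the two attribute entropies, which bounds the result to [0, 1]. All-confidence is computed with its standard definition, the support of the whole pattern divided by the largest support of any single item in it. The function names, the dictionary-based row format, and the closed-interval encoding of a pattern are all hypothetical and are not taken from QCoMine.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def normalized_mi(xs, ys):
    """ASSUMED normalization: I(x; y) / max(H(x), H(y)), in [0, 1]."""
    denom = max(entropy(xs), entropy(ys))
    return mutual_information(xs, ys) / denom if denom > 0 else 0.0


def all_confidence(rows, pattern):
    """All-confidence of a pattern over quantitative rows.

    rows    -- list of dicts mapping attribute name to a numeric value
    pattern -- dict mapping attribute name to a closed interval (lo, hi);
               this encoding is a hypothetical choice for illustration
    """
    def matches(row, attr):
        lo, hi = pattern[attr]
        return lo <= row[attr] <= hi

    # Rows that satisfy every interval in the pattern.
    joint = sum(all(matches(r, a) for a in pattern) for r in rows)
    # Largest support among the single attribute-interval items.
    max_single = max(sum(matches(r, a) for r in rows) for a in pattern)
    return joint / max_single if max_single else 0.0


# Toy data (invented for illustration only).
rows = [
    {"age": 25, "income": 30},
    {"age": 28, "income": 35},
    {"age": 45, "income": 80},
    {"age": 27, "income": 90},
]
pattern = {"age": (20, 30), "income": (25, 40)}
score = all_confidence(rows, pattern)  # 2 joint matches / 3 age matches
```

In a bilevel scheme like the one the abstract describes, a measure of the first kind would filter attribute pairs and a measure of the second kind would filter interval-level patterns; the thresholds and pruning order are not given in this record.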
Pages: 45
Related papers
50 in total
  • [31] Fuzzy concept association rules in data mining of quantitative databases
    Liu, SY
    Chen, LC
    Liu, CY
    ISTM/2003: 5TH INTERNATIONAL SYMPOSIUM ON TEST AND MEASUREMENT, VOLS 1-6, CONFERENCE PROCEEDINGS, 2003, : 967 - 969
  • [32] An efficient algorithm for mining quantitative association rules in large databases
    Lee, HJ
    Park, WH
    Song, SJ
    Park, DS
    IKE'03: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE ENGINEERING, VOLS 1 AND 2, 2003, : 571 - 576
  • [33] Mining frequent weighted utility itemsets in hierarchical quantitative databases
    Nguyen, Ham
    Le, Tuong
    Nguyen, Minh
    Fournier-Viger, Philippe
    Tseng, Vincent S. S.
    Vo, Bay
    KNOWLEDGE-BASED SYSTEMS, 2022, 237
  • [34] Fuzzy Maximal Frequent Itemset Mining Over Quantitative Databases
    Li, Haifeng
    Wang, Yue
    Zhang, Ning
    Zhang, Yuejin
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2017, PT I, 2017, 10191 : 476 - 486
  • [35] Mining Top-K Frequent Correlated Subgraph Pairs in Graph Databases
    Shang, Li
    Jian, Yujiao
    INTELLIGENT INFORMATICS, 2013, 182 : 1 - 8
  • [36] Activity Recognition using Correlated Pattern Mining for People with Dementia
    Sim, Kelvin
    Phua, Clifton
    Yap, Ghim-Eng
    Biswas, Jit
    Mokhtari, Mounir
    2011 ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2011, : 7593 - 7597
  • [37] An Efficient Frequent Pattern Mining Method and its Parallelization in Transactional Databases
    Fakhrahmad, S. M.
    Dastghaibyfard, Gh.
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2011, 27 (02) : 511 - 525
  • [38] LCGMiner: Levelwise closed graph pattern mining from large databases
    Xu, AH
    Lei, HS
    16TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, PROCEEDINGS, 2004, : 421 - 422
  • [39] Sequential pattern mining in multi-databases via multiple alignment
    Kum, HC
    Chang, JH
    Wang, W
    DATA MINING AND KNOWLEDGE DISCOVERY, 2006, 12 (2-3) : 151 - 180
  • [40] Efficient Tree Structures for High Utility Pattern Mining in Incremental Databases
    Ahmed, Chowdhury Farhan
    Tanbeer, Syed Khairuzzaman
    Jeong, Byeong-Soo
    Lee, Young-Koo
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (12) : 1708 - 1721