Computing exact permutation p-values for association rules

Cited by: 9
Authors
Wu, Jun [1]
He, Zengyou [1, 2]
Gu, Feiyang [1]
Liu, Xiaoqing [1]
Zhou, Jianyu [1]
Yang, Can [3]
Affiliations
[1] Dalian Univ Technol, Sch Software, Dalian, Liaoning, Peoples R China
[2] Dalian Univ Technol, Key Lab Ubiquitous Network & Serv Software Liaoni, Dalian, Liaoning, Peoples R China
[3] Hong Kong Baptist Univ, Dept Math, Kowloon Tong, Hong Kong, Peoples R China
Keywords
Association rule mining; Statistical significance testing; Permutation testing; Exact permutation p-value
DOI
10.1016/j.ins.2016.01.094
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Association rule mining is an important task in the field of data mining, and many efficient algorithms have been proposed to address this problem. However, a large portion of the rules reported by these algorithms satisfy the user-defined constraints purely by accident, and those that are not statistically meaningful should be filtered out through statistical significance testing. In the context of association rule discovery, the permutation-based approach can achieve better performance than other competing methods, although several drawbacks of this effective approach narrow its usability. In this paper, we provide an analysis of these disadvantages and propose an algorithm called Exact Permutation p-values for Association Rules (EPAR) to calculate the exact p-values of all tested rules. Experiments on different types of data sets demonstrate that EPAR can successfully alleviate the disadvantages and outperform the direct permutation-based method over several performance measures. © 2016 Elsevier Inc. All rights reserved.
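The abstract contrasts a direct (sampled) permutation test with an exact permutation p-value. The sketch below illustrates that contrast for a single rule X → Y and is not the EPAR algorithm itself. It assumes the test statistic is the rule's support count, that the null hypothesis is generated by shuffling the consequent column, and that the test is one-sided. Under those assumptions the exact permutation distribution of the support is hypergeometric. The function names are hypothetical, and the `(hits + 1) / (n_perm + 1)` form is a common convention for sampled permutations rather than something stated in the abstract.

```python
import random
from math import comb


def rule_support(antecedent, consequent):
    """Count transactions in which both X and Y occur (0/1 indicator lists)."""
    return sum(1 for a, c in zip(antecedent, consequent) if a and c)


def monte_carlo_p_value(antecedent, consequent, n_perm=2000, seed=0):
    """Direct permutation test: shuffle Y and count supports >= observed."""
    rng = random.Random(seed)
    observed = rule_support(antecedent, consequent)
    shuffled = list(consequent)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if rule_support(antecedent, shuffled) >= observed:
            hits += 1
    # Adding 1 to both counts keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def exact_p_value(antecedent, consequent):
    """Exact one-sided p-value over all distinct placements of Y."""
    n = len(antecedent)
    n_x = sum(antecedent)   # transactions containing X
    n_y = sum(consequent)   # transactions containing Y
    observed = rule_support(antecedent, consequent)
    # Hypergeometric upper tail: P(support >= observed) when the n_y
    # occurrences of Y are placed uniformly at random among n transactions.
    tail = sum(
        comb(n_x, k) * comb(n - n_x, n_y - k)
        for k in range(observed, min(n_x, n_y) + 1)
    )
    return tail / comb(n, n_y)


# Toy data: 8 transactions, X in the first four, Y in three of those plus one other.
x = [1, 1, 1, 1, 0, 0, 0, 0]
y = [1, 1, 1, 0, 0, 0, 0, 1]
print(exact_p_value(x, y), monte_carlo_p_value(x, y))
```

On this toy data the exact value is 17/70, and the sampled estimate fluctuates around it depending on `n_perm` and `seed`. The sampled version also cannot resolve p-values below roughly `1 / (n_perm + 1)`, which is one practical limitation of direct permutation testing.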
Pages: 146-162
Number of pages: 17