Mining significant association rules from uncertain data

Cited by: 8
Authors
Zhang, Anshu [1]
Shi, Wenzhong [1]
Webb, Geoffrey I. [2]
Affiliations
[1] Hong Kong Polytech Univ, Dept Land Surveying & Geoinformat, Kowloon, Hong Kong, Peoples R China
[2] Monash Univ, Fac Informat Technol, Melbourne, Vic 3800, Australia
Keywords
Pattern discovery; Association rules; Statistical evaluation; Uncertain data; ACCURACY;
DOI
10.1007/s10618-015-0446-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In association rule mining, the trade-off between avoiding harmful spurious rules and preserving authentic ones is a persistent barrier to obtaining reliable and useful results. The statistically sound technique for evaluating the statistical significance of association rules is superior in preventing spurious rules, yet it can also cause severe loss of true rules in the presence of data error. This study presents an improved method for statistical testing of association rules on uncertain, erroneous data. An original mathematical model was established to describe how data error propagates through the computational procedures of the statistical test. Based on the error model, a scheme combining analytic and simulative processes was designed to correct the statistical test for distortions caused by data error. Experiments on both synthetic and real-world data show that the method significantly recovers the true rules lost to data error under the original statistically sound method (that is, it reduces type-2 error). Meanwhile, the new method maintains effective control over the familywise error rate, which is the distinctive advantage of the original statistically sound technique. Furthermore, the method is robust to inaccurate information about data error probabilities and to situations that violate the commonly accepted assumption that the error probabilities of different data items are independent. The method is particularly effective for rules that are the most practically meaningful yet sensitive to data error. The method shows promise for enhancing the value of association rule mining results and helping users make correct decisions.
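The problem the abstract describes, true rules being lost when a significance test is run on erroneous data, can be illustrated with a small sketch. This is not the authors' correction method. It assumes a one-sided Fisher exact test on a single rule X -> Y and a Bonferroni-adjusted threshold as one common way to control the familywise error rate. The function names, the contingency counts, the 20% error probability, and the figure of 1000 candidate rules are all hypothetical values chosen for illustration.

```python
import math
import random


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for positive association.

    The 2x2 table is [[a, b], [c, d]] with a = X and Y, b = X without Y,
    c = Y without X, d = neither. The p-value is the hypergeometric
    probability of seeing a or more joint occurrences.
    """
    n = a + b + c + d
    row1, col1 = a + b, a + c
    tail = sum(
        math.comb(row1, k) * math.comb(n - row1, col1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / math.comb(n, col1)


def table(transactions):
    """Count the 2x2 contingency table from (has_x, has_y) pairs."""
    a = sum(1 for x, y in transactions if x and y)
    b = sum(1 for x, y in transactions if x and not y)
    c = sum(1 for x, y in transactions if not x and y)
    d = sum(1 for x, y in transactions if not x and not y)
    return a, b, c, d


def add_error(transactions, p_err, rng):
    """Flip each item value independently with probability p_err."""
    return [
        (x != (rng.random() < p_err), y != (rng.random() < p_err))
        for x, y in transactions
    ]


# Hypothetical error-free data: 100 transactions with a strong X -> Y rule.
clean = (
    [(True, True)] * 30
    + [(True, False)] * 10
    + [(False, True)] * 10
    + [(False, False)] * 50
)

# Bonferroni threshold, assuming 1000 candidate rules were tested (made up).
threshold = 0.05 / 1000

rng = random.Random(0)
noisy = add_error(clean, p_err=0.2, rng=rng)

p_clean = fisher_one_sided(*table(clean))
p_noisy = fisher_one_sided(*table(noisy))
print(f"clean p={p_clean:.3g}, noisy p={p_noisy:.3g}, threshold={threshold:.3g}")
```

Random flips pull the observed association toward independence, so the same underlying rule yields a weaker p-value on the erroneous data and may fail the corrected threshold. That lost rule is a type-2 error of the kind the abstract says the proposed method recovers.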
Pages: 928-963
Number of pages: 36