A new efficient approach for mining uncertain frequent patterns using minimum data structure without false positives

Cited by: 58
Authors
Lee, Gangin [1 ]
Yun, Unil [1 ]
Affiliations
[1] Sejong Univ, Dept Comp Engn, Seoul, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Correctness; Data mining; Existential probability; Frequent pattern mining; Uncertain pattern; Sequential patterns; Algorithm
DOI
10.1016/j.future.2016.09.007
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Uncertain pattern mining was recently proposed to meet the demand for processing databases with uncertain data, and various methods have been devised. However, previous approaches have the following limitations. State-of-the-art tree-based methods can suffer severe runtime and memory problems, depending on the characteristics of the uncertain database and the threshold settings, because their tree data structures can become excessively large and complicated during mining. Various approximation approaches have been suggested to overcome such problems; however, they improve mining performance at the cost of the accuracy of the mining results. To solve these problems, we propose an exact, efficient algorithm for mining uncertain frequent patterns based on novel data structures and mining techniques, which also guarantees the correctness of the mining results without any false positives. The newly proposed list-based data structures and pruning techniques allow the complete set of uncertain frequent patterns to be mined more efficiently without pattern losses. We also demonstrate that the proposed algorithm outperforms previous state-of-the-art approaches in both theoretical and empirical aspects. In particular, we provide analytical results of performance evaluation on various types of datasets to show the efficiency of our method in terms of runtime, memory usage, and scalability. (C) 2016 Elsevier B.V. All rights reserved.
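To make the abstract's terms concrete, the following is a minimal Python sketch of exact uncertain frequent pattern mining under the commonly used expected-support model, where each item in a transaction carries an existential probability and a pattern's expected support is the sum, over transactions, of the product of its items' probabilities. This is not the list-based algorithm proposed in the paper; it is a simple level-wise baseline, and the names `expected_support`, `mine_uncertain_frequent_patterns`, and `min_expected_support`, as well as the dictionary-based database layout, are assumptions made for illustration.

```python
def expected_support(pattern, database):
    """Sum over transactions of the product of the items' existential probabilities."""
    total = 0.0
    for transaction in database:
        prob = 1.0
        for item in pattern:
            # An item absent from a transaction has existential probability 0.
            prob *= transaction.get(item, 0.0)
            if prob == 0.0:
                break
        total += prob
    return total


def mine_uncertain_frequent_patterns(database, min_expected_support):
    """Level-wise exact mining (hypothetical baseline, not the paper's method).

    Expected support never increases when a pattern is extended, because every
    probability is at most 1, so patterns below the threshold can be pruned
    without losing any frequent pattern.
    """
    items = sorted({item for transaction in database for item in transaction})
    frequent = {}
    level = [(item,) for item in items]
    while level:
        survivors = []
        for pattern in level:
            support = expected_support(pattern, database)
            if support >= min_expected_support:
                frequent[pattern] = support
                survivors.append(pattern)
        # Join surviving patterns that share all but their last item.
        level = [
            p + (q[-1],)
            for p in survivors
            for q in survivors
            if p[:-1] == q[:-1] and p[-1] < q[-1]
        ]
    return frequent


if __name__ == "__main__":
    # Each transaction maps an item to its existential probability (made-up data).
    example_db = [
        {"a": 0.9, "b": 0.8},
        {"a": 0.5, "c": 0.4},
        {"a": 0.6, "b": 0.5, "c": 0.9},
    ]
    for pattern, support in mine_uncertain_frequent_patterns(example_db, 1.0).items():
        print(pattern, round(support, 2))
```

Because every reported support is computed exactly rather than estimated, this baseline returns no false positives, which is the correctness property the abstract emphasizes; the efficiency gains the abstract attributes to list-based structures are not modeled here.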
Pages: 89-110
Number of pages: 22