Efficient Mining of Frequent Item Sets on Large Uncertain Databases

Cited by: 55
Authors
Wang, Liang [1]
Cheung, David Wai-Lok [1]
Cheng, Reynold [1]
Lee, Sau Dan [1]
Yang, Xuan S. [1]
Affiliations
[1] Univ Hong Kong, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
Keywords
Frequent item sets; uncertain data set; approximate algorithm; incremental mining;
DOI
10.1109/TKDE.2011.165
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The data handled in emerging applications such as location-based services, sensor monitoring systems, and data integration are often inexact in nature. In this paper, we study the important problem of extracting frequent item sets from a large uncertain database, interpreted under the Possible World Semantics (PWS). This issue is technically challenging, since an uncertain database contains an exponential number of possible worlds. By observing that the mining process can be modeled as a Poisson binomial distribution, we develop an approximate algorithm, which can efficiently and accurately discover frequent item sets in a large uncertain database. We also study the important issue of maintaining the mining result for a database that is evolving (e.g., by inserting a tuple). Specifically, we propose incremental mining algorithms, which enable Probabilistic Frequent Item set (PFI) results to be refreshed. This reduces the need to re-execute the whole mining algorithm on the new database, which is often more expensive and unnecessary. We examine how an existing algorithm that extracts exact item sets, as well as our approximate algorithm, can support incremental mining. All our approaches support both tuple and attribute uncertainty, which are two common uncertain database models. We also perform extensive evaluation on real and synthetic data sets to validate our approaches.
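To illustrate the Poisson binomial view mentioned in the abstract, here is a minimal sketch, not the paper's algorithm. It assumes a simple model in which each tuple independently contains a given item set with a known probability, so the support count is a sum of independent Bernoulli variables. The function names `exact_tail` and `poisson_tail`, the sample probabilities, and the choice of a plain Poisson approximation are all illustrative assumptions; the abstract does not specify how the approximation is actually carried out.

```python
import math


def exact_tail(probs, minsup):
    """P(support >= minsup) under a Poisson binomial model.

    probs[i] is the (assumed independent) probability that tuple i
    contains the item set. Runs in O(n^2) time via dynamic programming.
    """
    # dist[k] = probability that exactly k of the tuples seen so far
    # contain the item set
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1.0 - p)   # tuple does not contain the item set
            nxt[k + 1] += q * p       # tuple contains the item set
        dist = nxt
    return sum(dist[minsup:])


def poisson_tail(probs, minsup):
    """Approximate P(support >= minsup) with a Poisson distribution
    whose mean equals the expected support. Runs in O(n + minsup) time.
    """
    mu = sum(probs)
    term = math.exp(-mu)  # Poisson pmf at k = 0
    cdf = 0.0
    for k in range(minsup):
        cdf += term
        term *= mu / (k + 1)  # pmf(k + 1) from pmf(k)
    return 1.0 - cdf


if __name__ == "__main__":
    # Hypothetical per-tuple probabilities, for illustration only.
    probs = [0.9, 0.8, 0.1, 0.5, 0.3]
    print(exact_tail(probs, 2), poisson_tail(probs, 2))
```

Under these assumptions, an item set would be reported as a probabilistic frequent item set when the tail probability exceeds a user-chosen threshold. The approximate version needs only the expected support, which is cheap to keep up to date when a tuple is inserted.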
Pages: 2170-2183
Page count: 14
Related papers
50 in total
  • [31] Comprehensive mining of frequent itemsets for a combination of certain and uncertain databases
    Wazir S.
    Beg M.M.S.
    Ahmad T.
    [J]. International Journal of Information Technology, 2020, 12 (4) : 1205 - 1216
  • [32] Efficient Mining of Frequent Patterns on Uncertain Graphs
    Chen, Yifan
    Zhao, Xiang
    Lin, Xuemin
    Wang, Yang
    Guo, Deke
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, 31 (02) : 287 - 300
  • [33] Incremental Technique with Set of Frequent Word Item sets for Mining Large Indonesian Text Data
    Maylawati, Dian Sa'adillah
    Ramdhani, Muhammad Ali
    Rahman, Ali
    Darmalaksana, Wahyudin
    [J]. 2017 5TH INTERNATIONAL CONFERENCE ON CYBER AND IT SERVICE MANAGEMENT (CITSM 2017), 2017, : 12 - 17
  • [34] Design and Implementation of Improved Algorithm for Frequent Item Sets Mining
    Zhang Lin
    Zhang Jianli
    [J]. PROCEEDINGS OF 2012 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2012), 2012, : 1696 - 1698
  • [35] Method for Mining Frequent Item Sets Considering Average Utility
    Agarwal, Reshu
    Gautam, Arti
    Saksena, Ayush Kumar
    Rai, Amrita
    Karatangi, Shylaja VinayKumar
    [J]. 2021 INTERNATIONAL CONFERENCE ON EMERGING SMART COMPUTING AND INFORMATICS (ESCI), 2021, : 275 - 278
  • [36] Algorithm of Frequent Item Sets Mining Based on Index Table
    Zhang Lin
    Yao Nanzhen
    Zhang Jianli
    [J]. MECHATRONICS, ROBOTICS AND AUTOMATION, PTS 1-3, 2013, 373-375 : 1076 - +
  • [37] Image Classification Technology Based on Mining of Frequent Item sets
    Nie, Qing
    Zhan, Shou-yi
    Su, Jing-xia
    [J]. PROCEEDINGS OF THE 2008 CHINESE CONFERENCE ON PATTERN RECOGNITION (CCPR 2008), 2008, : 144 - +
  • [38] Parallel algorithm for mining frequent item sets based on Spark
    Mao, Yimin
    Wu, Bin
    Xu, Chundong
    Zhang, Maosheng
    [J]. Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2023, 29 (04): : 1267 - 1283
  • [39] An efficient algorithm for mining high utility itemsets with negative item values in large databases
    Chu, Chun-Jung
    Tseng, Vincent S.
    Liang, Tyne
    [J]. APPLIED MATHEMATICS AND COMPUTATION, 2009, 215 (02) : 767 - 778
  • [40] On Genetic Algorithms for Detecting Frequent Item Sets And Large Bite Sets
    Sizov, Roman A.
    Simovici, Dan A.
    [J]. MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION (MLDM 2016), 2016, 9729 : 435 - 445