Weighted frequent itemset mining over uncertain databases

Cited by: 46
Authors
Lin, Jerry Chun-Wei [1]
Gan, Wensheng [1]
Fournier-Viger, Philippe [2]
Hong, Tzung-Pei [3, 4]
Tseng, Vincent S. [5]
Affiliations
[1] Harbin Inst Technol, Shenzhen Grad Sch, Sch Comp Sci & Technol, Shenzhen 518055, Peoples R China
[2] Univ Moncton, Dept Comp Sci, Moncton, NB E1A 3E9, Canada
[3] Natl Univ Kaohsiung, Dept Comp Sci & Informat Engn, Kaohsiung, Taiwan
[4] Natl Sun Yat Sen Univ, Dept Comp Sci & Engn, Kaohsiung 80424, Taiwan
[5] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu, Taiwan
Keywords
Data mining; Uncertain databases; Weighted frequent itemsets; Two-phase; Upper-bound; Sequential patterns; Algorithm
DOI
10.1007/s10489-015-0703-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Frequent itemset mining (FIM) is a fundamental research topic that consists of discovering useful and meaningful relationships between items in transaction databases. However, FIM suffers from two important limitations. First, it assumes that all items have the same importance. Second, it ignores the fact that data collected in real-life environments is often inaccurate, imprecise, or incomplete. To address these issues and mine more useful and meaningful knowledge, the problems of weighted itemset mining and uncertain itemset mining have been proposed separately: in the former, a user assigns weights to items to specify their relative importance; in the latter, existential probabilities represent uncertainty in transactions. However, no work has addressed both issues at the same time. In this paper, we address this research problem by designing a new type of pattern, the high expected weighted itemset (HEWI), and the HEWI-Uapriori algorithm to efficiently discover HEWIs. HEWI-Uapriori finds HEWIs using an Apriori-like two-phase approach. The algorithm introduces a property named high upper-bound expected weighted downward closure (HUBEWDC) to prune unpromising itemsets early and reduce the search space. Substantial experiments on real-life and synthetic datasets are conducted to evaluate the performance of the proposed algorithm in terms of runtime, memory consumption, and number of patterns found. Results show that the proposed algorithm has excellent performance and scalability compared with traditional methods for weighted itemset mining and uncertain itemset mining.
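The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' HEWI-Uapriori implementation, and the abstract does not give the formal definitions, so the following are assumptions made for illustration: the weight of an itemset is the average of its item weights, its probability in a transaction is the product of its item probabilities, its expected weighted support is the itemset weight times the sum of those probabilities, and the per-transaction upper bound is the maximum item weight times the maximum item probability. The names mine_hewis, min_ratio, db and weights are hypothetical.

```python
from itertools import combinations
from math import prod

def mine_hewis(db, weights, min_ratio):
    """Sketch of a two-phase search for high expected weighted itemsets.

    db: list of transactions, each a dict {item: existential probability}.
    weights: dict {item: weight}. min_ratio: threshold as a fraction of len(db).
    All definitions below are assumptions stated in the lead-in, not the paper's text.
    """
    threshold = min_ratio * len(db)
    # Assumed per-transaction upper bound: max item weight * max item probability.
    tub = [max(weights[i] for i in t) * max(t.values()) if t else 0.0 for t in db]

    def upper_bound(itemset):
        # Sum of transaction bounds over transactions containing the itemset.
        # It never increases when the itemset grows (downward closure).
        return sum(b for t, b in zip(db, tub) if all(i in t for i in itemset))

    def expected_weighted_support(itemset):
        w = sum(weights[i] for i in itemset) / len(itemset)
        exp_sup = sum(prod(t[i] for i in itemset)
                      for t in db if all(i in t for i in itemset))
        return w * exp_sup

    # Phase 1: level-wise (Apriori-like) candidate generation, pruned by the bound.
    items = sorted({i for t in db for i in t})
    level = [frozenset([i]) for i in items
             if upper_bound(frozenset([i])) >= threshold]
    candidates = []
    while level:
        candidates.extend(level)
        level_set = set(level)
        k = len(level[0])
        next_level = set()
        for a, b in combinations(level, 2):
            u = a | b
            if len(u) != k + 1:
                continue
            # Keep u only if every k-subset survived the previous level.
            if all(frozenset(s) in level_set for s in combinations(u, k)):
                if upper_bound(u) >= threshold:
                    next_level.add(u)
        level = list(next_level)

    # Phase 2: compute the actual measure and keep only true results.
    result = {}
    for c in candidates:
        value = expected_weighted_support(c)
        if value >= threshold:
            result[c] = value
    return result

# Hypothetical toy data: four uncertain transactions and three weighted items.
db = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 0.6},
    {"b": 0.7, "c": 0.9},
    {"a": 0.8, "b": 0.9, "c": 0.4},
]
weights = {"a": 0.8, "b": 0.5, "c": 0.3}
hewis = mine_hewis(db, weights, min_ratio=0.25)
```

The upper bound is needed because, under these assumed definitions, the expected weighted support itself is not anti-monotone: adding a heavy item can raise the average weight of an itemset. Pruning on the bound in phase 1 keeps the search complete, and phase 2 discards candidates that pass the bound but miss the real threshold.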
Pages: 232-250
Number of pages: 19
Related papers
50 records in total
  • [41] AT-Mine: An Efficient Algorithm of Frequent Itemset Mining on Uncertain Dataset
    Wang, Le
    Feng, Lin
    Wu, Mingfei
    JOURNAL OF COMPUTERS, 2013, 8 (06) : 1417 - 1426
  • [42] An Improved Vertical Algorithm for Frequent Itemset Mining from Uncertain Database
    Yang, Junrui
    Zhang, Yingjie
    Wei, Yanjun
    2017 NINTH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC 2017), VOL 1, 2017, : 355 - 358
  • [43] Frequent Itemsets Mining on Weighted Uncertain Data
    Alharbi, Manal
    Pathak, Sudipta
    Rajasekaran, Sanguthevar
    2014 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT), 2014, : 201 - 206
  • [44] Mining Top-k Minimal Redundancy Frequent Patterns over Uncertain Databases
    Wang, Haishuai
    Zhang, Peng
    Wu, Jia
    Pan, Shirui
    NEURAL INFORMATION PROCESSING, ICONIP 2015, PT IV, 2015, 9492 : 111 - 119
  • [45] Frequent Sequence Mining with Weight Constraints in Uncertain Databases
    Rahman, Md Mahmudur
    Ahmed, Chowdhury F.
    Leung, Carson K.
    Pazdor, Adam G. M.
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON UBIQUITOUS INFORMATION MANAGEMENT AND COMMUNICATION (IMCOM 2018), 2018,
  • [46] Interactive Mining of Probabilistic Frequent Patterns in Uncertain Databases
    Lin, Ming-Yen
    Fu, Cheng-Tai
    Hsueh, Sue-Chen
    INTERNATIONAL JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, 2022, 30 (02) : 263 - 283
  • [47] Mining Probabilistic Frequent Closed Itemsets in Uncertain Databases
    Tang, Peiyi
    Peterson, Erich A.
    PROCEEDINGS OF THE 49TH ANNUAL ASSOCIATION FOR COMPUTING MACHINERY SOUTHEAST CONFERENCE (ACMSE '11), 2011, : 86 - 91
  • [48] Inverted Index Automata Frequent Itemset Mining for Large Dataset Frequent Itemset Mining
    Dai, Xin
    Hamed, Haza Nuzly Abdull
    Su, Qichen
    Hao, Xue
    IEEE ACCESS, 2024, 12 : 195111 - 195130
  • [49] WFIM: Weighted Frequent Itemset Mining with a weight range and a minimum weight
    Yun, Unil
    Leggett, John J.
    PROCEEDINGS OF THE FIFTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2005, : 636 - 640
  • [50] A Weighted Frequent Itemset Mining Algorithm for Intelligent Decision in Smart Systems
    Zhao, Xuejian
    Zhang, Xinhui
    Wang, Pan
    Chen, Songle
    Sun, Zhixin
    IEEE ACCESS, 2018, 6 : 29271 - 29282