Summarizing Uncertain Transaction Databases by Probabilistic Tiles

Authors
Liu, Chunyang [1]
Chen, Ling [1]
Affiliations
[1] Univ Technol, Ctr Quantum Computat & Intelligent Syst, Sydney, NSW, Australia
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Transaction data mining is ubiquitous in various domains and has been researched extensively. In recent years, because uncertainty is inherent in many real-world applications, uncertain data mining has attracted much research attention. Among the research problems, summarization is important because it produces concise and informative results, which facilitate further analysis. However, few works have explored how to effectively summarize uncertain transaction data. In this paper, we formulate the problem of summarizing uncertain transaction data as Minimal Probabilistic Tile Cover Mining, which aims to find a high-quality probabilistic tile set covering an uncertain database with minimal cost. We define the concepts of Probabilistic Price and Probabilistic Price Order to evaluate and compare the quality of tiles, and propose a framework to discover the minimal probabilistic tile cover. The bottleneck is checking whether one tile is better than another according to the Probabilistic Price Order, which involves the computation of a joint probability. We prove that this joint probability can be decomposed into independent terms and calculated efficiently. Several optimization techniques are devised to further improve performance. Experimental results on real-world datasets demonstrate the conciseness of the produced tiles and the effectiveness and efficiency of our approach.
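
To give a feel for how a probability over a tile can break into independent terms, the sketch below uses a toy model that is an assumption of this note, not the paper's definition: each (transaction, item) cell of an uncertain database carries its own existence probability, and cells are treated as mutually independent. The names `UNCERTAIN_DB`, `tile_full_probability`, and `tile_expected_area` are hypothetical, and the code does not implement Probabilistic Price or Probabilistic Price Order.

```python
from math import prod

# Hypothetical toy uncertain database:
# (transaction id, item) -> probability that the item occurs in that transaction.
UNCERTAIN_DB = {
    (1, "a"): 0.9, (1, "b"): 0.8,
    (2, "a"): 0.5, (2, "b"): 1.0,
    (3, "a"): 0.4,
}

def tile_cells(transactions, items):
    """A tile is a set of transactions crossed with a set of items."""
    return [(t, i) for t in transactions for i in items]

def tile_full_probability(db, transactions, items):
    """Probability that every cell of the tile is present.

    Assumes independent cells (an assumption of this sketch), so the joint
    probability is a product of per-cell terms. Cells absent from the
    database are treated as having probability 0.
    """
    return prod(db.get(cell, 0.0) for cell in tile_cells(transactions, items))

def tile_expected_area(db, transactions, items):
    """Expected number of present cells, by linearity of expectation."""
    return sum(db.get(cell, 0.0) for cell in tile_cells(transactions, items))

if __name__ == "__main__":
    print(tile_full_probability(UNCERTAIN_DB, [1, 2], ["a", "b"]))
    print(tile_expected_area(UNCERTAIN_DB, [1, 2], ["a", "b"]))
```

Under this independence assumption, the per-cell terms can be computed once and reused when comparing tiles that share cells, which is one reason a decomposition of this kind can make such comparisons cheaper than enumerating possible worlds.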
Pages: 4375-4382
Page count: 8