Summarizing Uncertain Transaction Databases by Probabilistic Tiles

Authors
Liu, Chunyang [1]
Chen, Ling [1]
Affiliation
[1] Univ Technol, Ctr Quantum Computat & Intelligent Syst, Sydney, NSW, Australia
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transaction data mining is ubiquitous in various domains and has been researched extensively. In recent years, with the observation that uncertainty is inherent in many real-world applications, uncertain data mining has attracted much research attention. Among the research problems, summarization is important because it produces concise and informative results, which facilitates further analysis. However, few works have explored how to effectively summarize uncertain transaction data. In this paper, we formulate the problem of summarizing uncertain transaction data as Minimal Probabilistic Tile Cover Mining, which aims to find a high-quality probabilistic tile set covering an uncertain database with minimal cost. We define the concepts of Probabilistic Price and Probabilistic Price Order to evaluate and compare the quality of tiles, and propose a framework to discover the minimal probabilistic tile cover. The bottleneck is checking whether one tile is better than another according to the Probabilistic Price Order, which involves the computation of a joint probability. We prove that this joint probability can be decomposed into independent terms and calculated efficiently. Several optimization techniques are devised to further improve the performance. Experimental results on real-world datasets demonstrate the conciseness of the produced tiles and the effectiveness and efficiency of our approach.
Pages: 4375 - 4382
Page count: 8