Differential Dependencies: Reasoning and Discovery

Cited by: 73
Authors
Song, Shaoxu [1]
Chen, Lei [1]
Affiliation
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China
Source
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2011 / Vol. 36 / Issue 03
Keywords
Theory; Algorithms; Performance; Data dependencies; differential dependencies; FUNCTIONAL-DEPENDENCIES; ALGORITHM; DATABASES; TRENDS
DOI
10.1145/2000824.2000826
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The importance of difference semantics (e.g., "similar" or "dissimilar") has recently been recognized for declaring dependencies among various types of data, such as numerical values or text values. We propose a novel form of dependency, Differential Dependencies (DDs), which specify constraints on difference, called differential functions, instead of the identification functions used in traditional dependency notations such as functional dependencies. Informally, a differential dependency states that if two tuples have distances on attributes X agreeing with a certain differential function, then their distances on attributes Y should also agree with the corresponding differential function on Y. For example, [date(<= 7)] -> [price(< 100)] states that the prices on any two days at most a week apart should differ by less than 100 dollars. Such differential dependencies are useful in various applications, for example, violation detection, data partition, query optimization, and record linkage. In this article, we first address several theoretical issues of differential dependencies, including formal definitions of DDs and differential keys, the subsumption order relation of differential functions, implication of DDs, the closure of a differential function, a sound and complete inference system, and minimal covers for DDs. Then, we investigate a practical problem, namely, how to discover DDs and differential keys from a given dataset. Due to the intrinsic hardness of the discovery problem, we develop several pruning methods to improve the discovery efficiency in practice. Finally, through an extensive experimental evaluation on real datasets, we demonstrate the discovery performance and the effectiveness of DDs in several real applications.
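The example dependency in the abstract, [date(<= 7)] -> [price(< 100)], can be illustrated with a minimal violation check. This is only a sketch under stated assumptions, not the article's algorithm: the toy rows, the function name `violations`, its parameter names, and the use of absolute difference as the distance metric on both attributes are all hypothetical choices made for illustration.

```python
from itertools import combinations

# Hypothetical toy relation: day number and price in dollars.
rows = [
    {"date": 1, "price": 500},
    {"date": 3, "price": 550},
    {"date": 5, "price": 620},
    {"date": 20, "price": 900},
]


def violations(rows, max_date_gap=7, price_bound=100):
    """Return index pairs that violate [date(<= 7)] -> [price(< 100)].

    Assumption: the distance on each attribute is the absolute difference.
    A pair violates the dependency when it satisfies the left-hand side
    (date distance <= max_date_gap) but not the right-hand side
    (price distance < price_bound).
    """
    bad = []
    for (i, r), (j, s) in combinations(enumerate(rows), 2):
        date_gap = abs(r["date"] - s["date"])
        price_gap = abs(r["price"] - s["price"])
        if date_gap <= max_date_gap and not price_gap < price_bound:
            bad.append((i, j))
    return bad


print(violations(rows))  # [(0, 2)]
```

Rows 0 and 2 are 4 days apart but differ by 120 dollars, so they are the only violating pair; pairs more than 7 days apart are not constrained by this dependency at all. The pairwise scan is quadratic in the number of rows and is meant only to make the semantics concrete.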
Pages: 41