Characterizing Schema Mappings via Data Examples

Cited by: 27
Authors
Alexe, Bogdan [2]
ten Cate, Balder [1]
Kolaitis, Phokion G. [1, 2]
Tan, Wang-Chiew [1, 2]
Affiliations
[1] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA
[2] IBM Res Almaden, San Jose, CA USA
Source
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2011 / Vol. 36 / No. 4
Funding
National Science Foundation (USA)
Keywords
Algorithms; Languages; Theory; Schema mappings; data examples; data exchange; data integration
DOI
10.1145/2043652.2043656
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Schema mappings are high-level specifications that describe the relationship between two database schemas; they are considered to be the essential building blocks in data exchange and data integration, and have been the object of extensive research investigations. Since in real-life applications schema mappings can be quite complex, it is important to develop methods and tools for understanding, explaining, and refining schema mappings. A promising approach to this effect is to use "good" data examples that illustrate the schema mapping at hand. We develop a foundation for the systematic investigation of data examples and obtain a number of results on both the capabilities and the limitations of data examples in explaining and understanding schema mappings. We focus on schema mappings specified by source-to-target tuple-generating dependencies (s-t tgds) and investigate the following problem: which classes of s-t tgds can be "uniquely characterized" by a finite set of data examples? Our investigation begins by considering finite sets of positive and negative examples, which are arguably the most natural choice of data examples. However, we show that they are not powerful enough to yield interesting unique characterizations. We then consider finite sets of universal examples, where a universal example is a pair consisting of a source instance and a universal solution for that source instance. We first show that the existence of unique characterizations via universal examples is, in a precise sense, equivalent to the existence of Armstrong bases (a relaxation of the classical notion of Armstrong databases). After this, we show that every schema mapping specified by LAV s-t tgds is uniquely characterized by a finite set of universal examples with respect to the class of LAV s-t tgds. Moreover, this positive result extends to the much broader classes of n-modular schema mappings, where n is a positive integer. Finally, we study the unique characterizability of GAV schema mappings. It turns out that some GAV schema mappings are uniquely characterizable by a finite set of universal examples with respect to the class of GAV s-t tgds, while others are not. By unveiling a tight connection with homomorphism dualities, we establish an effective, sound, and complete criterion for determining whether or not a GAV schema mapping is uniquely characterizable by a finite set of universal examples with respect to the class of GAV s-t tgds.
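To make the notion of a universal example concrete, the following is a minimal illustrative sketch, not taken from the paper. It assumes a toy encoding in which an instance is a set of `(relation, tuple)` facts, an s-t tgd is a `(body, head)` pair of atom lists whose arguments are all variables, and labeled nulls are strings of the hypothetical form `_N1`, `_N2`, and so on. Under those assumptions, a naive chase of a source instance with the s-t tgds yields a canonical universal solution, and the pair (source instance, chase result) is one universal example. The relation names `Emp` and `Reports` are invented for illustration.

```python
from itertools import count


def match(atoms, instance, binding=None):
    """Yield each variable assignment that maps every atom to a fact of the instance."""
    binding = binding or {}
    if not atoms:
        yield dict(binding)
        return
    (rel, args), rest = atoms[0], atoms[1:]
    # Iterate in sorted order so that null numbering is deterministic.
    for fact_rel, fact_args in sorted(instance):
        if fact_rel != rel or len(fact_args) != len(args):
            continue
        new = dict(binding)
        # setdefault binds a fresh variable or returns the value it already has.
        if all(new.setdefault(v, c) == c for v, c in zip(args, fact_args)):
            yield from match(rest, instance, new)


def chase(source, tgds):
    """Naive chase for s-t tgds: one fresh null per existential variable per match."""
    nulls = count(1)
    target = set()
    for body, head in tgds:
        for binding in match(body, source):
            full = dict(binding)
            for _, args in head:
                for v in args:
                    if v not in full:  # existentially quantified variable
                        full[v] = f"_N{next(nulls)}"
            for rel, args in head:
                target.add((rel, tuple(full[v] for v in args)))
    return target


# Hypothetical LAV s-t tgd (single source atom in the body):
#   Emp(x, d) -> exists m. Reports(x, m)
tgds = [([("Emp", ("x", "d"))], [("Reports", ("x", "m"))])]
source = {("Emp", ("alice", "sales")), ("Emp", ("bob", "sales"))}

solution = chase(source, tgds)
universal_example = (source, solution)
print(sorted(solution))
```

Each employee receives its own labeled null as manager, which reflects that the mapping asserts only that some manager exists. A finite set of such pairs is the kind of object the abstract asks about; whether a given set uniquely characterizes a mapping is the question the paper studies and is not decided by this sketch.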
Pages: 48