Functional Dependencies Unleashed for Scalable Data Exchange

Cited by: 3
Authors
Bonifati, Angela [1 ,3 ,4 ]
Ileana, Ioana [2 ]
Linardi, Michele [2 ,4 ]
Affiliations
[1] Univ Lyon 1, Villeurbanne, France
[2] Paris Descartes Univ, Paris, France
[3] Univ Lille, Lille, France
[4] INRIA, Le Chesnay, France
Source
28TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT (SSDBM 2016) | 2016
Keywords
Chase; functional dependencies; parallelization
DOI
10.1145/2949689.2949698
CLC number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We address the problem of efficiently evaluating target functional dependencies (fds) in the Data Exchange (DE) process. Target fds naturally occur in many DE scenarios, including the ones in Life Sciences in which multiple source relations need to be structured under a constrained target schema. However, despite their wide use, target fds' evaluation is still a bottleneck in the state-of-the-art DE engines. Systems relying on an all-SQL approach typically do not support target fds unless additional information is provided. Alternatively, DE engines that do include these dependencies typically pay the price of a significant drop in performance and scalability. In this paper, we present a novel chase-based algorithm that can efficiently handle arbitrary fds on the target. Our approach essentially relies on exploiting the interactions between source-to-target (s-t) tuple-generating dependencies (tgds) and target fds. This allows us to tame the size of the intermediate chase results, by playing on a careful ordering of chase steps interleaving fds and (chosen) tgds. As a direct consequence, we importantly diminish the fd application scope, often a central cause of the dramatic overhead induced by target fds. Moreover, reasoning on dependency interaction further leads us to interesting parallelization opportunities, yielding additional scalability gains. We provide a proof-of-concept implementation of our chase-based algorithm and an experimental study aimed at gauging its scalability and efficiency. Finally, we empirically compare with the latest DE engines, and show that our algorithm outperforms them.
Pages: 12
Related papers
50 in total
  • [31] Preserving logical and functional dependencies in synthetic tabular data
    Umesh, Chaithra
    Schultz, Kristian
    Mahendra, Manjunath
    Bej, Saptarshi
    Wolkenhauer, Olaf
    PATTERN RECOGNITION, 2025, 163
  • [32] Cardinality constraints and functional dependencies over possibilistic data
    Roblot, Tania
    Link, Sebastian
    DATA & KNOWLEDGE ENGINEERING, 2018, 117 : 339 - 358
  • [33] On the Existence of Armstrong Data Trees for XML Functional Dependencies
    Hartmann, Sven
    Koehler, Henning
    Trinh, Thu
    FOUNDATIONS OF INFORMATION AND KNOWLEDGE SYSTEMS, PROCEEDINGS, 2010, 5956 : 94 - +
  • [34] Discovering Quantitative Temporal Functional Dependencies on Clinical Data
    Combi, Carlo
    Mantovani, Matteo
    Sala, Pietro
    2017 IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI), 2017, : 248 - 257
  • [35] Chorus functional dependencies derived from CRRES data
    Spasojevic, M.
    Shprits, Y. Y.
    GEOPHYSICAL RESEARCH LETTERS, 2013, 40 (15) : 3793 - 3797
  • [36] Mining fuzzy functional dependencies from quantitative data
    Wang, SL
    Shen, JW
    Hong, TP
    SMC 2000 CONFERENCE PROCEEDINGS: 2000 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN & CYBERNETICS, VOL 1-5, 2000, : 3600 - 3605
  • [37] Functional dependencies are helpful for partial materialization of data cubes
    Garnaud, Eve
    Maabout, Sofian
    Mosbah, Mohamed
    ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE, 2015, 73 (1-2) : 245 - 274
  • [38] Repairs and consistent answers for XML data with functional dependencies
    Flesca, S
    Furfaro, F
    Greco, S
    Zumpano, E
    DATABASE AND XML TECHNOLOGIES, 2003, 2824 : 238 - 253
  • [39] A Statistical Perspective on Discovering Functional Dependencies in Noisy Data
    Zhang, Yunjia
    Guo, Zhihan
    Rekatsinas, Theodoros
    SIGMOD'20: PROCEEDINGS OF THE 2020 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2020, : 861 - 876
  • [40] Functional dependencies are helpful for partial materialization of data cubes
    Eve Garnaud
    Sofian Maabout
    Mohamed Mosbah
    Annals of Mathematics and Artificial Intelligence, 2015, 73 : 245 - 274