Functional Dependencies Unleashed for Scalable Data Exchange

Cited by: 3
Authors
Bonifati, Angela [1, 3, 4]
Ileana, Ioana [2]
Linardi, Michele [2, 4]
Affiliations
[1] Univ Lyon 1, Villeurbanne, France
[2] Paris Descartes Univ, Paris, France
[3] Univ Lille, Lille, France
[4] INRIA, Le Chesnay, France
Source
28TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT (SSDBM 2016) | 2016
Keywords
Chase; functional dependencies; parallelization
DOI
10.1145/2949689.2949698
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We address the problem of efficiently evaluating target functional dependencies (fds) in the Data Exchange (DE) process. Target fds occur naturally in many DE scenarios, including those in the life sciences, in which multiple source relations need to be structured under a constrained target schema. However, despite their wide use, the evaluation of target fds remains a bottleneck in state-of-the-art DE engines. Systems relying on an all-SQL approach typically do not support target fds unless additional information is provided. Alternatively, DE engines that do include these dependencies typically pay the price of a significant drop in performance and scalability. In this paper, we present a novel chase-based algorithm that can efficiently handle arbitrary fds on the target. Our approach essentially relies on exploiting the interactions between source-to-target (s-t) tuple-generating dependencies (tgds) and target fds. This allows us to tame the size of the intermediate chase results through a careful ordering of chase steps that interleaves fds and (chosen) tgds. As a direct consequence, we significantly reduce the fd application scope, which is often a central cause of the dramatic overhead induced by target fds. Moreover, reasoning on dependency interaction further leads us to interesting parallelization opportunities, yielding additional scalability gains. We provide a proof-of-concept implementation of our chase-based algorithm and an experimental study aimed at gauging its scalability and efficiency. Finally, we empirically compare our algorithm with the latest DE engines and show that it outperforms them.
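To make the terminology in the abstract concrete, the sketch below shows a textbook-style chase in Python: s-t tgds populate a target relation with labeled nulls, and a target fd is then enforced by equating values. It is only an illustration of the standard chase, not the optimized, interleaved, or parallel algorithm the paper proposes. The relations `Emp`, `Boss`, and `Works`, the null encoding `_N1`, `_N2`, ..., and all function names are assumptions made for this example.

```python
from itertools import count

_fresh = count(1)  # hypothetical generator of labeled-null identifiers


def fresh_null():
    """Return a new labeled null standing for an unknown target value."""
    return f"_N{next(_fresh)}"


def is_null(value):
    return isinstance(value, str) and value.startswith("_N")


def apply_tgds(emp, boss):
    """Apply two illustrative s-t tgds:
    Emp(n, d)  -> exists m. Works(n, d, m)
    Boss(d, m) -> exists n. Works(n, d, m)
    """
    works = [(name, dept, fresh_null()) for name, dept in emp]
    works += [(fresh_null(), dept, mgr) for dept, mgr in boss]
    return works


def apply_fd(rows, lhs, rhs):
    """Enforce the fd lhs -> rhs (column indices) until a fixpoint.

    Two rows agreeing on lhs must agree on rhs: a null is replaced
    everywhere by the other value; two distinct constants make the
    chase fail.
    """
    rows = list(rows)
    changed = True
    while changed:
        changed = False
        seen = {}
        for row in rows:
            key = tuple(row[i] for i in lhs)
            val = row[rhs]
            other = seen.setdefault(key, val)
            if other == val:
                continue
            if is_null(val):
                old, new = val, other
            elif is_null(other):
                old, new = other, val
            else:
                raise ValueError(f"fd violated by constants {other!r} and {val!r}")
            # Substitute the null in every row, then rescan from the start.
            rows = [tuple(new if v == old else v for v in r) for r in rows]
            changed = True
            break
    return list(dict.fromkeys(rows))  # drop duplicate rows, keep order


emp = [("alice", "sales"), ("carol", "sales")]
boss = [("sales", "bob")]
target = apply_fd(apply_tgds(emp, boss), lhs=[1], rhs=2)  # fd: dept -> mgr
print(target)
```

In this naive form the fd is checked against the entire target relation after all tgds have fired, which is the kind of cost the abstract attributes to a large fd application scope.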
Pages: 12
Related papers
50 in total
  • [41] Microsoft exchange server 2003 unleashed.
    Gordon, RS
    LIBRARY JOURNAL, 2006, 131 (01) : 146 - 146
  • [42] Scalable Communication Protocols for Dynamic Sparse Data Exchange
    Hoefler, Torsten
    Siebert, Christian
    Lumsdaine, Andrew
    PPOPP 2010: PROCEEDINGS OF THE 2010 ACM SIGPLAN SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING, 2010: 159 - 168
  • [43] Scalable Communication Protocols for Dynamic Sparse Data Exchange
    Hoefler, Torsten
    Siebert, Christian
    Lumsdaine, Andrew
    ACM SIGPLAN NOTICES, 2010, 45 (05) : 159 - 168
  • [44] Spatio-Temporal Functional Dependencies for Sensor Data Streams
    Charfi, Manel
    Gripay, Yann
    Petit, Jean-Marc
    ADVANCES IN SPATIAL AND TEMPORAL DATABASES, SSTD 2017, 2017, 10411 : 182 - 199
  • [45] Semandaq: A Data Quality System Based on Conditional Functional Dependencies
    Fan, Wenfei
    Geerts, Floris
    Jia, Xibei
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2008, 1 (02) : 1460 - 1463
  • [46] Automated Inference with Fuzzy Functional Dependencies over Graded Data
    Manuel Rodriguez-Jimenez, Jose
    Cordero, Pablo
    Enciso, Manuel
    Mora, Angel
    ADVANCES IN COMPUTATIONAL INTELLIGENCE, PT II, 2013, 7903 : 254 - 265
  • [47] Discovering Functional Dependencies from Mixed-Type Data
    Mandros, Panagiotis
    Kaltenpoth, David
    Boley, Mario
    Vreeken, Jilles
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020: 1404 - 1414
  • [48] Bayes Performance of Batch Data Mining Based on Functional Dependencies
    Xi, Haixu
    Ye, Feiyue
    He, Sheng
    Liu, Yijun
    Jiang, Hongfen
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2019, 33 (03)
  • [49] FOX: Inference of approximate functional dependencies from XML data
    Fassetti, Fabio
    Fazzinga, Bettina
    DEXA 2007: 18TH INTERNATIONAL CONFERENCE ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2007: 10 - +
  • [50] Privacy-Preserving Publishing Data with Full Functional Dependencies
    Wang, Hui
    Liu, Ruilin
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PT II, PROCEEDINGS, 2010, 5982 : 176 - 183