Towards Integrating Workflow and Database Provenance

Cited by: 0
Authors
Chirigati, Fernando [1]
Freire, Juliana [1]
Affiliations
[1] NYU, Polytech Inst, Comp Sci & Engn Dept, New York, NY 10003 USA
Keywords
Workflow Provenance; Database Provenance; Reproducibility
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
While there has been substantial work on both database and workflow provenance, the two problems have only been examined in isolation. It is widely accepted that the existing models are incompatible. Database provenance is fine-grained and captures changes to tuples in a database. In contrast, workflow provenance is represented at a coarser level and reflects the functional model of workflow systems, which is stateless: each computational step derives a new artifact. In this paper, we propose a new approach to combine database and workflow provenance. We address the mismatch between the different kinds of provenance by using a temporal model which explicitly represents the database states as updates are applied. We discuss how, under this model, reproducibility is obtained for workflows that manipulate databases, and how different queries that straddle the two provenance traces can be evaluated. We also describe a proof-of-concept implementation that integrates a workflow system and a commercial relational database.
Pages: 11-23
Page count: 13