Managing Provenance of Implicit Data Flows in Scientific Experiments

Authors
Neves, Vitor C. [1, 3]
De Oliveira, Daniel [1, 3]
Ocana, Kary A. C. S. [2]
Braganholo, Vanessa [1, 3]
Murta, Leonardo [1, 3]
Affiliations
[1] Univ Fed Fluminense, Niteroi, RJ, Brazil
[2] Lab Nacl Comp Cient, Av Getulio Vargas 333, BR-25651075 Petropolis, RJ, Brazil
[3] Inst Comp, Rua Passo da Patria 156, BR-24210240 Niteroi, RJ, Brazil
Keywords
Implicit data flows; implicit provenance; scientific experiments; workflows; SYSTEM
DOI
10.1145/3053372
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Scientific experiments modeled as scientific workflows may create, change, or access data products not explicitly referenced in the workflow specification, leading to implicit data flows. The lack of knowledge about implicit data flows makes the experiments hard to understand and reproduce. In this article, we present ProvMonitor, an approach that identifies the creation of, changes to, or access to data products even within implicit data flows. ProvMonitor links this information with the workflow activity that generated it, allowing scientists to compare data products within and across trials of the same workflow and to identify side effects on data evolution caused by implicit data flows. We evaluated ProvMonitor and observed that it could answer queries for scenarios that demand specific knowledge related to implicit provenance.
Pages: 22