Managing Provenance of Implicit Data Flows in Scientific Experiments

Cited by: 1
Authors
Neves, Vitor C. [1 ,3 ]
De Oliveira, Daniel [1 ,3 ]
Ocana, Kary A. C. S. [2 ]
Braganholo, Vanessa [1 ,3 ]
Murta, Leonardo [1 ,3 ]
Affiliations
[1] Univ Fed Fluminense, Niteroi, RJ, Brazil
[2] Lab Nacl Comp Cient, Av Getulio Vargas 333, BR-25651075 Petropolis, RJ, Brazil
[3] Inst Comp, Rua Passo da Patria 156, BR-24210240 Niteroi, RJ, Brazil
Keywords
Implicit data flows; implicit provenance; scientific experiments; workflows; SYSTEM
DOI
10.1145/3053372
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Scientific experiments modeled as scientific workflows may create, change, or access data products that are not explicitly referenced in the workflow specification, leading to implicit data flows. The lack of knowledge about implicit data flows makes the experiments hard to understand and reproduce. In this article, we present ProvMonitor, an approach that identifies the creation, change, or access of data products even within implicit data flows. ProvMonitor links this information with the workflow activity that generated it, allowing scientists to compare data products within and across trials of the same workflow and to identify side effects on data evolution caused by implicit data flows. We evaluated ProvMonitor and observed that it could answer queries for scenarios that demand specific knowledge related to implicit provenance.
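To make the idea of an implicit data flow concrete, the following is a minimal Python sketch, not ProvMonitor's actual implementation. It assumes a simple strategy of snapshotting a workspace directory before and after each activity and attributing the differences to that activity. All names (`snapshot`, `diff_snapshots`, `run_activity`, the `align` activity, and the file names) are hypothetical, and the sketch detects only created, changed, and removed files, not read accesses.

```python
import hashlib
import os
import tempfile


def snapshot(root):
    """Map each file under root (by relative path) to a SHA-256 digest of its contents."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            state[os.path.relpath(path, root)] = digest
    return state


def diff_snapshots(before, after):
    """Classify files as created, changed, or removed between two snapshots."""
    return {
        "created": sorted(set(after) - set(before)),
        "changed": sorted(p for p in before if p in after and before[p] != after[p]),
        "removed": sorted(set(before) - set(after)),
    }


def run_activity(name, func, workspace, log):
    """Run one workflow activity and record the file changes observed around it."""
    before = snapshot(workspace)
    func(workspace)
    after = snapshot(workspace)
    log.append({"activity": name, **diff_snapshots(before, after)})


def demo():
    """Run a hypothetical activity that has undeclared side effects on the workspace."""
    log = []
    with tempfile.TemporaryDirectory() as workspace:
        with open(os.path.join(workspace, "input.txt"), "w") as handle:
            handle.write("raw")

        def align(ws):
            # Declared output of the activity.
            with open(os.path.join(ws, "aligned.txt"), "w") as handle:
                handle.write("aligned")
            # Undeclared side effects: the input is rewritten and a cache file appears.
            with open(os.path.join(ws, "input.txt"), "w") as handle:
                handle.write("raw-normalized")
            with open(os.path.join(ws, "cache.tmp"), "w") as handle:
                handle.write("x")

        run_activity("align", align, workspace, log)
    return log


if __name__ == "__main__":
    for entry in demo():
        print(entry)
```

In this sketch, the rewritten `input.txt` and the extra `cache.tmp` are recorded against the `align` activity even though a workflow specification would list only `aligned.txt`. Comparing such log entries between trials is one simple way to see how data products evolved.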
Pages: 22