Towards an Adaptive and Distributed Architecture for Managing Workflow Provenance Data

Cited by: 3
Authors
Costa, Flavio [1]
de Oliveira, Daniel [2]
Mattoso, Marta [1]
Affiliations
[1] Univ Fed Rio de Janeiro, COPPE, Rio De Janeiro, Brazil
[2] Fluminense Fed Univ, Comp Inst, Niteroi, RJ, Brazil
Keywords
distributed provenance; scientific workflow; scientific workflow management system
DOI
10.1109/eScience.2014.59
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Workflow provenance data represents the workflow execution behavior, making it possible to trace how the scientific data-flow was generated. Provenance is an important asset for analyzing data and, through runtime monitoring, for identifying and handling errors that occur during workflow execution. The workflow execution engine can also use provenance data to set the initial amount of resources and to plan adaptive task scheduling. However, efficiently managing provenance data from distributed workflow execution poses several challenges. As the size of workflows increases (in terms of the number of activity executions or the volume of data to process), the amount of provenance data to be managed also grows, especially when it is captured at fine granularity. Thus, centralized approaches become unviable. In this work we propose an architecture that combines distributed workflow management techniques with distributed provenance data management.
Pages: 79-82
Page count: 4