Tracking provenance in a virtual data grid

Cited by: 32
Authors
Clifford, Ben [1]
Foster, Ian [1, 2]
Voeckler, Jens-S. [3]
Wilde, Michael [1, 2]
Zhao, Yong [1]
Affiliations
[1] Univ Chicago, Computat Inst, Chicago, IL 60637 USA
[2] Argonne Natl Lab, Div Math & Comp Sci, Argonne, IL 60439 USA
[3] USC Informat Sci Inst, Marina Del Rey, CA USA
Source
Keywords
grid computing; workflow; data provenance
DOI
10.1002/cpe.1256
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
The virtual data model allows data sets to be described prior to, and separately from, their physical materialization. We have implemented this model in a Virtual Data Language (VDL) and associated supporting tools, which provide for both the storage, query, and retrieval of virtual data set descriptions, and the automated, on-demand materialization of virtual data sets. We use a standardized data provenance challenge exercise to illustrate the powerful queries that can be performed on the data maintained by these tools, which for a single virtual data set can include three elements: the computational procedure(s) that must be executed to materialize the data set, the runtime log(s) produced by the execution of the computation(s), and optional metadata annotation(s) that associate application semantics with data and procedures. Copyright (C) 2007 John Wiley & Sons, Ltd.
Pages: 565-575
Page count: 11
Related papers (50 in total; first 10 shown)
  • [1] Provenance tracking in the ViroLab virtual laboratory
    Balis, Bartosz
    Bubak, Marian
    Wach, Jakub
    [J]. PARALLEL PROCESSING AND APPLIED MATHEMATICS, 2008, 4967 : 381 - +
  • [2] Provenance Tracking and Querying in the ViroLab Virtual Laboratory
    Balis, Bartosz
    Bubak, Marian
    Pelczar, Michal
    Wach, Jakub
    [J]. CCGRID 2008: EIGHTH IEEE INTERNATIONAL SYMPOSIUM ON CLUSTER COMPUTING AND THE GRID, VOLS 1 AND 2, PROCEEDINGS, 2008 : 675 - +
  • [3] Applying the virtual data provenance model
    Zhao, Yong
    Wilde, Michael
    Foster, Ian
    [J]. PROVENANCE AND ANNOTATION OF DATA, 2006, 4145 : 148 - 161
  • [4] Data Provenance Tracking for Concurrent Programs
    Lucia, Brandon
    Ceze, Luis
    [J]. 2015 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO), 2015 : 146 - 156
  • [5] A Provenance Tracking Model for Data Updates
    Ciobanu, Gabriel
    Horne, Ross
    [J]. ELECTRONIC PROCEEDINGS IN THEORETICAL COMPUTER SCIENCE, 2012, (91) : 31 - 44
  • [6] Tracking provenance of earth science data
    Curt Tilmes
    Yelena Yesha
    Milton Halem
    [J]. Earth Science Informatics, 2010, 3 : 59 - 65
  • [7] Tracking provenance of earth science data
    Tilmes, Curt
    Yesha, Yelena
    Halem, Milton
    [J]. EARTH SCIENCE INFORMATICS, 2010, 3 (1-2) : 59 - 65
  • [8] Storing, Tracking, and Querying Provenance in Linked Data
    Wylot, Marcin
    Cudre-Mauroux, Philippe
    Hauswirth, Manfred
    Groth, Paul
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2017, 29 (08) : 1751 - 1764
  • [9] A virtual data grid for LIGO
    Deelman, E
    Kesselman, C
    Williams, R
    Lazzarini, A
    Prince, TA
    Romano, J
    Allen, B
    [J]. HIGH-PERFORMANCE COMPUTING AND NETWORKING, 2001, 2110 : 3 - 12
  • [10] Oceanographic Data Provenance Tracking with the Shore Side Data System
    McCann, Michael
    Gomes, Kevin
    [J]. PROVENANCE AND ANNOTATION OF DATA AND PROCESSES, 2008, 5272 : 309 - 322