A survey of data provenance in e-science

Cited by: 31
Authors
Simmhan, YL [1]
Plale, B [1]
Gannon, D [1]
Affiliation
[1] Indiana Univ, Dept Comp Sci, Bloomington, IN 47405 USA
Keywords
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data management is growing in complexity as large-scale applications take advantage of the loosely coupled resources brought together by grid middleware and by abundant storage capacity. Metadata describing the data products used in and generated by these applications is essential to disambiguate the data and enable reuse. Data provenance, one kind of metadata, pertains to the derivation history of a data product starting from its original sources. In this paper we create a taxonomy of data provenance characteristics and apply it to current research efforts in e-science, focusing primarily on scientific workflow approaches. The main aspect of our taxonomy categorizes provenance systems based on why they record provenance, what they describe, how they represent and store provenance, and ways to disseminate it. The survey culminates with an identification of open research problems in the field.
Pages: 31-36
Page count: 6
Related papers
50 in total
  • [1] The requirements of using provenance in e-science experiments
    Miles S.
    Groth P.
    Branco M.
    Moreau L.
    [J]. Journal of Grid Computing, 2007, 5 (1) : 1 - 25
  • [2] Towards a Threat Model for Provenance in e-Science
    Gadelha, Luiz M. R., Jr.
    Mattoso, Marta
    Wilde, Michael
    Foster, Ian
    [J]. PROVENANCE AND ANNOTATION OF DATA AND PROCESSES, 2010, 6378 : 277 - +
  • [3] Towards Next Generation Provenance Systems for e-Science
    Khan, Fakhri Alam
    Hussain, Sardar
    Janciak, Ivan
    Brezany, Peter
    [J]. INTERNATIONAL JOURNAL OF INFORMATION SYSTEM MODELING AND DESIGN, 2011, 2 (03) : 24 - 48
  • [4] Enabling provenance on large scale e-science applications
    Branco, Miguel
    Moreau, Luc
    [J]. PROVENANCE AND ANNOTATION OF DATA, 2006, 4145 : 55 - 63
  • [5] Semantically linking and browsing provenance logs for e-science
    Zhao, J
    Goble, C
    Stevens, R
    Bechhofer, S
    [J]. SEMANTICS OF A NETWORKED WORLD: SEMANTICS FOR GRID DATABASES, 2004, 3226 : 158 - 176
  • [6] Provenance-based validation of E-science experiments
    Wong, SC
    Miles, S
    Fang, WJ
    Groth, P
    Moreau, L
    [J]. SEMANTIC WEB - ISWC 2005, PROCEEDINGS, 2005, 3729 : 801 - 815
  • [7] Provenance-based validation of e-science experiments
    Miles, Simon
    Wong, Sylvia C.
    Fang, Weijian
    Groth, Paul
    Zauner, Klaus-Peter
    Moreau, Luc
    [J]. JOURNAL OF WEB SEMANTICS, 2007, 5 (01) : 28 - 38
  • [8] E-Science and the data deluge
    Casacuberta, David
    Vallverdu, Jordi
    [J]. PHILOSOPHICAL PSYCHOLOGY, 2014, 27 (01) : 126 - 140
  • [9] A survey on semantic e-science applications
    Chen, Huajun
    Ma, Jun
    Wang, Yimin
    Wu, Zhaohui
    [J]. COMPUTING AND INFORMATICS, 2008, 27 (01) : 5 - 20
  • [10] Automatic capture and efficient storage of e-Science experiment provenance
    Barga, Roger S.
    Digiampietri, Luciano A.
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2008, 20 (05) : 419 - 429