LabelFlow: Exploiting Workflow Provenance to Surface Scientific Data Provenance

Cited by: 5
Authors
Alper, Pinar [1 ]
Belhajjame, Khalid [2 ]
Goble, Carole A. [1 ]
Karagoz, Pinar [3 ]
Affiliations
[1] Univ Manchester, Sch Comp Sci, Manchester, Lancs, England
[2] Univ Paris 09, Paris, France
[3] Middle E Tech Univ, Dept Comp Engn, TR-06531 Ankara, Turkey
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Provenance; Annotation; Scientific workflows; SEMANTIC PROVENANCE; WEB;
DOI
10.1007/978-3-319-16462-5_7
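Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract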
Provenance traces captured by scientific workflows can be useful for designing, debugging and maintenance. However, our experience suggests that they are of limited use for reporting results, in part because traces do not comprise the domain-specific annotations needed for explaining results, and in part because of the black-box nature of some workflow activities. We show that by basic mark-up of the data processing within activities and using a set of domain-specific label generation functions, standard workflow provenance can be utilised as a platform for the labelling of data artefacts. These labels can in turn aid selection of data subsets and proxy for data descriptors for shared datasets.
Pages: 84-96
Page count: 13
Related papers
50 in total
  • [41] Provenance Context Entity (PaCE): Scalable Provenance Tracking for Scientific RDF Data
    Sahoo, Satya S.
    Bodenreider, Olivier
    Hitzler, Pascal
    Sheth, Amit
    Thirunarayan, Krishnaprasad
    SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, 2010, 6187 : 461 - +
  • [42] Exploiting Provenance to Make Sense of Automated Decisions in Scientific Workflows
    Missier, Paolo
    Embury, Suzanne
    Stapenhurst, Richard
    PROVENANCE AND ANNOTATION OF DATA AND PROCESSES, 2008, 5272 : 174 - 185
  • [43] Bridging Workflow and Data Provenance Using Strong Links
    Koop, David
    Santos, Emanuele
    Bauer, Bela
    Troyer, Matthias
    Freire, Juliana
    Silva, Claudio T.
    SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, 2010, 6187 : 397 - +
  • [44] Exploring Scientific Workflow Provenance Using Hybrid Queries over Nested Data and Lineage Graphs
    Anand, Manish Kumar
    Bowers, Shawn
    McPhillips, Timothy
    Ludaescher, Bertram
    SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, PROCEEDINGS, 2009, 5566 : 237 - +
  • [45] Project histories: Managing data provenance across collection-oriented scientific workflow runs
    Bowers, Shawn
    McPhillips, Timothy
    Wu, Martin
    Ludaescher, Bertram
    DATA INTEGRATION IN THE LIFE SCIENCES, PROCEEDINGS, 2007, 4544 : 122 - +
  • [46] Permissioned Blockchain for Data Provenance in Scientific Data Management
    Moeller, Julius
    Froeschle, Sibylle
    Hahn, Axel
    INNOVATION THROUGH INFORMATION SYSTEMS, VOL III: A COLLECTION OF LATEST RESEARCH ON MANAGEMENT ISSUES, 2021, 48 : 22 - 38
  • [47] A Scientific Data Provenance API for Distributed Applications
    Raju, Bibi
    Elsethagen, Todd
    Stephan, Eric
    Van Dam, Kerstin Kleese
    2016 INTERNATIONAL CONFERENCE ON COLLABORATION TECHNOLOGIES AND SYSTEMS (CTS), 2016, : 104 - 111
  • [48] Cell-based Provenance for Scientific Data
    Park, Juyeong
    Yoshikawa, Masatoshi
    Kato, Hiroyuki
    2017 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL 2017), 2017, : 289 - 290
  • [49] Visualizing Large Scale Scientific Data Provenance
    Chen, Peng
    Plale, Beth
    2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC), 2012, : 1385 - 1386
  • [50] Toward the modeling of data provenance in scientific publications
    Mahmood, Tariq
    Jami, Syed Imran
    Shaikh, Zubair Ahmed
    Mughal, Muhammad Hussain
    COMPUTER STANDARDS & INTERFACES, 2013, 35 (01) : 6 - 29