Datatrack: An R package for managing data in a multi-stage experimental workflow

Authors
Eichinski, Philip [1]
Roe, Paul [1]
Affiliations
[1] Queensland Univ Technol, Sci & Engn Fac, Brisbane, Qld, Australia
Keywords
computational science; data provenance; R language; R package; workflow
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In experimental research using computation, a workflow is a sequence of data processing or analysis steps in which the output of one step may be used as the input of another. The processing steps may involve user-supplied parameters that, when modified, produce a new version of the input to the downstream steps, which in turn generate new versions of their own output. As more experimentation is done, the results of these steps can become numerous. It is important to keep track of which data output depends on which other generated data, and which parameters were used. In many situations, scientific workflow management systems solve this problem, but these systems are best suited to collaborative, distributed experiments using a variety of services, possibly with batch-processed parameter sweeps. This paper presents an R package for managing and navigating a network of interdependent data. It is intended as a lightweight tool that gives the experimenter some visual data provenance information, so that they can manage their generated data as they run experiments within their familiar scripting environment, where committing to a full-blown workflow manager may not be desirable. The package consists of wrapper functions for writing and reading output data that can be called from within R analysis scripts, as well as a visualization of the data-output dependency graph rendered within the RStudio console. It thus benefits the experimenter while requiring minimal commitment to integrate into their existing working environment.
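The idea described in the abstract, write and read wrappers that record each output's parameters and the outputs it depends on, can be illustrated with a minimal sketch. This is not the Datatrack API and is written in Python rather than R; the names `registry`, `write_output`, `read_output`, and `lineage` are hypothetical, and the in-memory dictionary stands in for whatever storage the package actually uses.

```python
import hashlib
import json

# Hypothetical in-memory registry (an assumption, not the Datatrack API):
# maps a version id to the metadata recorded for that output.
registry = {}


def write_output(step, params, data, depends_on=()):
    """Store one output version with its parameters and parent versions."""
    # The id is derived from the step name, its parameters, and its parents,
    # so changing an upstream parameter yields new downstream versions too.
    key = json.dumps([step, params, sorted(depends_on)], sort_keys=True)
    version = step + "-" + hashlib.sha1(key.encode()).hexdigest()[:8]
    registry[version] = {
        "step": step,
        "params": params,
        "parents": list(depends_on),
        "data": data,
    }
    return version


def read_output(version):
    """Return the data stored for a given version id."""
    return registry[version]["data"]


def lineage(version):
    """Return every ancestor version of an output, nearest first."""
    seen = []
    queue = list(registry[version]["parents"])
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(registry[current]["parents"])
    return seen


# Two runs of the same step with different parameters give two versions.
raw = write_output("load", {"file": "recordings.csv"}, [3, 1, 2])
a = write_output("filter", {"threshold": 1},
                 [x for x in read_output(raw) if x > 1], depends_on=[raw])
b = write_output("filter", {"threshold": 2},
                 [x for x in read_output(raw) if x > 2], depends_on=[raw])
s = write_output("summary", {}, sum(read_output(a)), depends_on=[a])
```

Here `lineage(s)` walks back through the filter step to the loaded data, which is the kind of dependency information a provenance graph would display.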
Pages: 147-154
Page count: 8