Reproducibility Analysis of Scientific Workflows

Cited by: 4
Authors
Banati, Anna [3]
Kacsuk, Peter [1, 2]
Kozlovszky, Miklos [1, 3]
Affiliations
[1] MTA SZTAKI, Pf 63, H-1518 Budapest, Hungary
[2] Univ Westminster, 115 New Cavendish St, London W1W 6UW, England
[3] Obuda Univ, John von Neumann Fac Informat, Becsi Ut 96-B, H-1034 Budapest, Hungary
Keywords
scientific workflows; reproducibility; analytical model; provenance; evaluation; gUSE
DOI
10.12700/APH.14.2.2017.2.11
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Scientific workflows are efficient tools for specifying and automating compute- and data-intensive in-silico experiments. An important challenge related to their usage is their reproducibility. To make a workflow reproducible, many factors that can influence or even prevent reproduction have to be investigated: missing descriptions and samples; missing provenance data about the environmental parameters and the data dependencies; and execution dependencies based on special hardware, changing or volatile third-party services, or randomly generated values. Some of these factors (called dependencies) can be eliminated by careful design or by substantial resource usage, but most of them cannot be bypassed. Our investigation deals with the critical dependencies of execution. In this paper we set up a mathematical model to evaluate the results of the workflow; in addition, we provide a mechanism to make the workflow reproducible based on provenance data and statistical tools.
Pages: 201-217
Number of pages: 17
Related papers
50 in total
  • [1] Classification of Scientific Workflows Based on Reproducibility Analysis
    Banati, A.
    Kacsuk, P.
    Kozlovszky, M.
    2016 39TH INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2016: 327-331
  • [2] Evaluating the Reproducibility cost of the scientific workflows
    Banati, Anna
    Kacsuk, Peter
    Kozlovszky, Miklos
    2016 IEEE 11TH INTERNATIONAL SYMPOSIUM ON APPLIED COMPUTATIONAL INTELLIGENCE AND INFORMATICS (SACI), 2016: 187-190
  • [3] Dealing with Reusability and Reproducibility for Scientific Workflows
    Lifschitz, Sergio
    Gomes, Luciana
    Rehen, Stevens K.
    2011 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS, 2011: 625-632
  • [4] Evaluating the Average Reproducibility Cost of the Scientific Workflows
    Banati, Anna
    Karasz, Peter
    Kacsuk, Peter
    Kozlovszky, Miklos
    2016 IEEE 14TH INTERNATIONAL SYMPOSIUM ON INTELLIGENT SYSTEMS AND INFORMATICS (SISY), 2016: 79-84
  • [5] Computational reproducibility of scientific workflows at extreme scales
    Pouchard, Line
    Baldwin, Sterling
    Elsethagen, Todd
    Jha, Shantenu
    Raju, Bibi
    Stephan, Eric
    Tang, Li
    Van Dam, Kerstin Kleese
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2019, 33 (05): 763-776
  • [6] Experiences with Reproducibility: Case Studies from Scientific Workflows
    Ghoshal, Devarshi
    Paine, Drew
    Pastorello, Gilberto
    Elbashandy, Abdelrahman
    Gunter, Dan
    Amusat, Oluwamayowa
    Ramakrishnan, Lavanya
    PROCEEDINGS OF THE 4TH INTERNATIONAL WORKSHOP ON PRACTICAL REPRODUCIBLE EVALUATION OF COMPUTER SYSTEMS, P-RECS 2021, 2021: 3-8
  • [7] Science Capsule: Towards Sharing and Reproducibility of Scientific Workflows
    Ghoshal, Devarshi
    Bianchi, Ludovico
    Essiari, Abdelilah
    Paine, Drew
    Poon, Sarah S.
    Beach, Michael
    N'Diaye, Alpha T.
    Huck, Patrick
    Ramakrishnan, Lavanya
    PROCEEDINGS OF 16TH WORKSHOP ON WORKFLOWS IN SUPPORT OF LARGE-SCALE SCIENCE (WORKS21), 2021: 66-73
  • [8] Facilitating the Reproducibility of Scientific Workflows with Execution Environment Specifications
    Meng, Haiyan
    Thain, Douglas
    INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE (ICCS 2017), 2017, 108: 705-714
  • [9] Data Provenance and Reproducibility in Grid Based Scientific Workflows
    Tylissanakis, G.
    Cotronis, Y.
    2009 4TH INTERNATIONAL CONFERENCE ON GRID AND PERVASIVE COMPUTING WORKSHOPS (GPC WORKSHOPS), 2009: 40-47
  • [10] Towards Reproducibility in Scientific Workflows: An Infrastructure-Based Approach
    Santana-Perez, Idafen
    Perez-Hernandez, Maria S.
    SCIENTIFIC PROGRAMMING, 2015, 2015