Reusability First: Toward FAIR Workflows

Cited by: 9
Authors
Wolf, Matthew [1]
Logan, Jeremy [1]
Mehta, Kshitij [1]
Jacobson, Daniel [1, 2]
Cashman, Mikaela [1]
Walker, Angelica M. [2]
Eisenhauer, Greg [4]
Widener, Patrick [3]
Cliff, Ashley [2]
Affiliations
[1] Oak Ridge Natl Lab, Oak Ridge, TN 37830 USA
[2] Univ Tennessee, Bredesen Ctr Interdisciplinary Res & Grad Educ, Knoxville, TN USA
[3] Sandia Natl Labs, POB 5800, Albuquerque, NM 87185 USA
[4] Georgia Inst Technol, Atlanta, GA 30332 USA
Keywords
Workflows; FAIR; Reusability; Distributed Information systems; Middleware; SCIENCE
DOI
10.1109/Cluster48925.2021.00053
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The FAIR principles of open science (Findable, Accessible, Interoperable, and Reusable) have had transformative effects on modern large-scale computational science. In particular, they have encouraged more open access to and use of data, an important consideration as collaboration among teams of researchers accelerates and the use of workflows by those teams to solve problems increases. How best to apply the FAIR principles to workflows themselves, and software more generally, is not yet well understood. We argue that the software engineering concept of technical debt management provides a useful guide for application of those principles to workflows, and in particular that it implies reusability should be considered as 'first among equals'. Moreover, our approach recognizes a continuum of reusability where we can make explicit and selectable the trade-offs required in workflows for both their users and developers. To this end, we propose a new abstraction approach for reusable workflows, with demonstrations for both synthetic workloads and real-world computational biology workflows. Through application of novel systems and tools that are based on this abstraction, these experimental workflows are refactored to right-size the granularity of workflow components to efficiently fill the gap between end-user simplicity and general customizability. Our work makes it easier to selectively reason about and automate the connections between trade-offs across user and developer concerns when exposing degrees of freedom for reuse. Additionally, by exposing fine-grained reusability abstractions we enable performance optimizations, as we demonstrate on both institutional-scale and leadership-class HPC resources.
Pages: 444-455
Page count: 12