Scientific workflow management and the Kepler system

Cited by: 832
Authors
Ludascher, Bertram [1]
Altintas, Ilkay
Berkley, Chad
Higgins, Dan
Jaeger, Efrat
Jones, Matthew
Lee, Edward A.
Tao, Jing
Zhao, Yang
Affiliations
[1] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
[2] Univ Calif Davis, Genome Ctr, Davis, CA 95616 USA
[3] Univ Calif San Diego, San Diego Supercomp Ctr, San Diego, CA 92093 USA
[4] Univ Calif Santa Barbara, Natl Ctr Ecol Anal & Synth, Santa Barbara, CA 93101 USA
[5] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
Source
Keywords
scientific workflows; Grid workflows; scientific data management; problem-solving environments; dataflow networks
DOI
10.1002/cpe.994
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202; 0835
Abstract
Many scientific disciplines are now data and information driven, and new scientific knowledge is often gained by scientists putting together data analysis and knowledge discovery 'pipelines'. A related trend is that more and more scientific communities realize the benefits of sharing their data and computational services, and are thus contributing to a distributed data and computational community infrastructure (a.k.a. 'the Grid'). However, this infrastructure is only a means to an end and ideally scientists should not be too concerned with its existence. The goal is for scientists to focus on development and use of what we call scientific workflows. These are networks of analytical steps that may involve, e.g., database access and querying steps, data analysis and mining steps, and many other steps including computationally intensive jobs on high-performance cluster computers. In this paper we describe characteristics of and requirements for scientific workflows as identified in a number of our application projects. We then elaborate on Kepler, a particular scientific workflow system, currently under development across a number of scientific data management projects. We describe some key features of Kepler and its underlying Ptolemy II system, planned extensions, and areas of future research. Kepler is a community-driven, open source project, and we always welcome related projects and new contributors to join. Copyright (c) 2005 John Wiley & Sons, Ltd.
Pages: 1039-1065
Page count: 27
Related papers
50 in total
  • [1] Integrated Machine Learning in the Kepler Scientific Workflow System
    Nguyen, Mai H.
    Crawl, Daniel
    Masoumi, Tahereh
    Altintas, Ilkay
    INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE 2016 (ICCS 2016), 2016, 80 : 2443 - 2448
  • [2] Provenance collection support in the Kepler Scientific Workflow System
    Altintas, Ilkay
    Barney, Oscar
    Jaeger-Frank, Efrat
    PROVENANCE AND ANNOTATION OF DATA, 2006, 4145 : 118 - 132
  • [3] Early Cloud Experiences with the Kepler Scientific Workflow System
    Wang, Jianwu
    Altintas, Ilkay
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2012, 2012, 9 : 1630 - 1634
  • [4] Application Scenarios Using Serpens Suite for Kepler Scientific Workflow System
    Plociennik, Marcin
    Owsiak, Michal
    Zok, Tomasz
    Palak, Bartek
    Gomez-Iglesias, Antonio
    Castejon, Francisco
    Lopez-Caniego, Marcos
    Campos Plasencia, Isabel
    Costantini, Alessandro
    Yadykin, Dimitriy
    Strand, Par
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2012, 2012, 9 : 1604 - 1613
  • [5] A Framework for Distributed Data-Parallel Execution in the Kepler Scientific Workflow System
    Wang, Jianwu
    Crawl, Daniel
    Altintas, Ilkay
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2012, 2012, 9 : 1620 - 1629
  • [6] Sliding Window Calculations on Streaming Data using the Kepler Scientific Workflow System
    Koehler, Sven
    Gulati, Supriya
    Cao, Gongjing
    Hart, Quinn
    Ludaescher, Bertram
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2012, 2012, 9 : 1639 - 1646
  • [7] Scientific Workflow: Modeling Methods and Management System
    Sun, Xiaoya
    Hu, Liang
    Che, Xilong
    2018 INTERNATIONAL CONFERENCE ON COMPUTER INFORMATION SCIENCE AND APPLICATION TECHNOLOGY, 2019, 1168
  • [8] Scientific Workflow Approach (Kepler) for Carbon flux data processing
    Liu, Min
    He, Honglin
    Sun, Xiaomin
    Yu, Guirui
    ICICTA: 2009 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTATION TECHNOLOGY AND AUTOMATION, VOL I, PROCEEDINGS, 2009, : 694 - 697
  • [9] The Design of Flexible Workflow in Scientific Research Management System
    Gao, Yongping
    Guan, Fenfen
    2ND INTERNATIONAL CONFERENCE ON DATA SCIENCE AND BUSINESS ANALYTICS (ICDSBA 2018), 2018, : 61 - 64
  • [10] Integrating existing scientific workflow systems: The Kepler/Pegasus example
    Mandal, Nandita
    Deelman, Ewa
    Mehta, Gaurang
    Su, Mei-Hui
    Vahi, Karan
    Proceedings of the 2nd Workshop on Workflows in Support of Large-scale Science, WORKS'07, 2007, : 21 - 28