Cloud MapReduce for Monte Carlo bootstrap applied to Metabolic Flux Analysis

Cited by: 12
Authors
Dalman, Tolga [1]
Doernemann, Tim [2, 3]
Juhnke, Ernst [2, 3]
Weitzel, Michael [1]
Wiechert, Wolfgang [1]
Noeh, Katharina [1]
Freisleben, Bernd [2, 3]
Affiliations
[1] Forschungszentrum Julich, Inst Bio & Geosci Biotechnol 2 1, D-52428 Julich, Germany
[2] Univ Marburg, Dept Math & Comp Sci, D-35032 Marburg, Germany
[3] Univ Marburg, Ctr Synthet Microbiol, D-35032 Marburg, Germany
Keywords
Metabolic Flux Analysis; Cloud computing; Scientific workflows; Hadoop; MapReduce; Monte Carlo bootstrap; WORKFLOWS
DOI
10.1016/j.future.2011.10.007
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The MapReduce architectural pattern popularized by Google has been used successfully in several scientific applications. Until now, however, MapReduce has rarely been employed in the field of Systems Biology. We investigate whether a MapReduce approach using on-demand resources from a Cloud is suitable for simulation tasks in the area of Metabolic Flux Analysis (MFA). An Amazon ElasticMapReduce Cloud implementation of the parallel, parametric Monte Carlo bootstrap in the context of 13C-MFA is presented. The seamless integration of the application into a service-oriented, BPEL-based scientific workflow framework is shown. A comparison of a straightforward MapReduce implementation using the Hadoop streaming interface on various Amazon ElasticMapReduce instance types with a single CPU core computation reveals a speedup of 17 on 64 Amazon cores. I/O operations on many small files within the Reduce step were identified as the limiting step. By exploiting the Hadoop Java API, making use of built-in data types, and tuning problem-specific Hadoop parameters, the I/O issues could be resolved. With the revised implementation, a speedup of up to 48 could be achieved on 64 Amazon cores. To investigate the runtimes of a realistic 13C-MFA analysis, 50,000 Monte Carlo samples with a typical metabolic network model were computed on 20 virtual nodes in 24 h and 23 min at a total cost of $384. Our work demonstrates that scalable Systems Biology applications can be run on Amazon's Cloud MapReduce service. © 2011 Elsevier B.V. All rights reserved.
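As a rough illustration of the pattern the abstract describes, the following minimal sketch expresses a parametric Monte Carlo bootstrap as a map step (perturb the fitted data with measurement noise and re-fit) and a reduce step (summarize the re-fitted parameters). It is not the authors' implementation: the straight-line model stands in for a metabolic network fit, the run is sequential rather than on Hadoop, and all function names (`fit_line`, `map_sample`, `reduce_estimates`, `bootstrap`, `parallel_efficiency`) are hypothetical. The last helper only restates the reported speedups as parallel efficiencies (speedup divided by core count).

```python
import random
import statistics
from collections import defaultdict


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (assumed stand-in for a flux fit)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def map_sample(seed, xs, fitted, sigma):
    """Map step: draw one synthetic data set, re-fit, emit (name, value) pairs."""
    rng = random.Random(seed)  # one independent, reproducible stream per sample
    noisy = [y + rng.gauss(0.0, sigma) for y in fitted]
    a, b = fit_line(xs, noisy)
    return [("a", a), ("b", b)]


def reduce_estimates(pairs):
    """Reduce step: group values by parameter name, return (mean, std) each."""
    groups = defaultdict(list)
    for name, value in pairs:
        groups[name].append(value)
    return {k: (statistics.mean(v), statistics.stdev(v)) for k, v in groups.items()}


def bootstrap(xs, ys, sigma, n_samples):
    """Sequential driver; on a cluster each map_sample call would be one task."""
    a, b = fit_line(xs, ys)
    fitted = [a * x + b for x in xs]
    pairs = []
    for seed in range(n_samples):
        pairs.extend(map_sample(seed, xs, fitted, sigma))
    return reduce_estimates(pairs)


def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores


if __name__ == "__main__":
    stats = bootstrap([0, 1, 2, 3, 4], [1.0, 3.0, 5.0, 7.0, 9.0], 0.1, 200)
    print(stats)
    print(parallel_efficiency(17, 64), parallel_efficiency(48, 64))
```

Because each sample is independent, the map calls can run in any order on any node. Under this reading, the reported speedups of 17 and 48 on 64 cores correspond to parallel efficiencies of roughly 27% and 75%.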
Pages: 582-590
Number of pages: 9