Aeromancer: A Workflow Manager for Large-Scale MapReduce-Based Scientific Workflows

Cited by: 1
Authors
Mohamed, Nabeel [1]
Maji, Nabanita [1]
Zhang, Jing [1]
Timoshevskaya, Nataliya [1]
Feng, Wu-Chun [1]
Affiliations
[1] Virginia Tech, Dept Comp Sci, Blacksburg, VA 24061 USA
Keywords
PLATFORM; GALAXY; CLOUDMAN; TAVERNA; TOOL
DOI
10.1109/TrustCom.2014.97
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The Hadoop framework has gained significant attention from the scientific community due to its applicability to large-scale data analysis in many areas. This analysis often involves multiple stages of processing, which, in turn, constitute a workflow. While some stages of a workflow are mandatory, others depend on the type of analysis to be done. In addition, a workflow may possess data dependencies between stages that must be enforced, and it may exhibit varying levels of sensitivity. The resources needed for such data analysis can range from a laptop to in-house clusters (or a private cloud) to a public cloud. Managing such workflows across such a gamut of computing resources is an unnecessarily arduous task for domain scientists. To address these challenges, we present Aeromancer, a feature-rich workflow manager for running MapReduce-based workflows that utilizes both client and cloud resources. Aeromancer offers an ensemble of features, including the simultaneous use of client resources (e.g., on-premises clusters) and public cloud resources; automatic data-dependency and data-transfer handling; intra-flow, on-demand cluster provisioning; and support for directed acyclic graphs (DAGs). To demonstrate its functionality, we apply Aeromancer to several bioinformatics pipelines, as part of a "big data" case study in the life sciences, which seeks to increase the adoption of hybrid computing environments, including the emerging "client+cloud" computing model, for running data-intensive workflows.
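Illustrative sketch (not from the paper): the abstract mentions data dependencies between stages and DAG support. The snippet below shows, in generic terms, how a workflow expressed as a DAG can be ordered so that each stage runs only after the stages it depends on. The stage names and the `execution_order` helper are hypothetical and are not Aeromancer's API; the ordering relies on Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each stage maps to the set of stages it depends on.
# These names are placeholders, not stages defined by Aeromancer.
workflow = {
    "align": {"fetch_reads"},
    "call_variants": {"align"},
    "quality_report": {"fetch_reads"},
    "summarize": {"call_variants", "quality_report"},
}


def execution_order(dag):
    """Return a stage order in which every stage follows its dependencies.

    Raises graphlib.CycleError if the graph contains a cycle, i.e. if the
    workflow is not a valid DAG.
    """
    return list(TopologicalSorter(dag).static_order())


if __name__ == "__main__":
    for stage in execution_order(workflow):
        # A real manager would submit a MapReduce job here; this only prints.
        print("run", stage)
```

Several valid orders may exist for the same graph; the only guarantee is that no stage appears before a stage it depends on.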
Pages: 739-746
Page count: 8