Personal adaptive clusters as containers for scientific jobs

Cited by: 0
Authors
Edward Walker
Jeffrey P. Gardner
Vladimir Litvin
Evan L. Turner
Affiliations
[1] University of Texas at Austin, Texas Advanced Computing Center
[2] Pittsburgh Supercomputing Center
[3] California Institute of Technology, High Energy Physics Group
[4] University of Texas at Austin, Texas Advanced Computing Center
Keywords
Cooperative systems; Distributed computing; Resource management
DOI
Not available
Abstract
We describe a system for creating personal clusters in user-space to support the submission and management of thousands of compute-intensive serial jobs to the network-connected compute resources on the NSF TeraGrid. The system implements a robust infrastructure that submits and manages job proxies across a distributed computing environment. These job proxies contribute resources to personal clusters created dynamically for a user on-demand. The personal clusters then adapt to the prevailing job load conditions at the distributed sites by migrating job proxies to sites expected to provide resources more quickly. Furthermore, the system allows multiple instances of these personal clusters to be created as containers for individual scientific experiments, allowing the submission environment to be customized for each instance. The version of the system described in this paper allows users to build large personal Condor and Sun Grid Engine clusters on the TeraGrid. Users then manage their scientific jobs, within each personal cluster, with a single uniform interface using the feature-rich functionality found in these job management environments.
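The adaptive step mentioned in the abstract, migrating job proxies toward sites expected to provide resources sooner, can be pictured with a small sketch. This is only an illustration under stated assumptions, not the system's actual algorithm: the `Site` record, the `expected_wait_s` estimate, the `min_gain_s` threshold, and the `rebalance` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical view of one remote site (not from the paper)."""
    name: str
    expected_wait_s: float    # assumed estimate of queue wait, in seconds
    pending_proxies: int = 0  # job proxies still waiting in the site's queue


def rebalance(sites, min_gain_s=60.0):
    """Move pending proxies to the site with the shortest expected wait.

    A proxy is moved only when the expected saving is at least
    min_gain_s, so small differences do not cause churn.
    Returns a list of (from_site, to_site, count) moves.
    """
    best = min(sites, key=lambda s: s.expected_wait_s)
    moves = []
    for site in sites:
        if site is best or site.pending_proxies == 0:
            continue
        if site.expected_wait_s - best.expected_wait_s >= min_gain_s:
            moves.append((site.name, best.name, site.pending_proxies))
            best.pending_proxies += site.pending_proxies
            site.pending_proxies = 0
    return moves


if __name__ == "__main__":
    sites = [Site("a", 600.0, 3), Site("b", 30.0, 1), Site("c", 50.0, 2)]
    print(rebalance(sites))
```

In this sketch, proxies queued at site `a` are moved to site `b`, while those at site `c` stay put because the expected saving is below the threshold. A real implementation would also need to cancel and resubmit the proxies through each site's batch system, which is omitted here.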
Pages: 339-350
Number of pages: 12
Related papers
50 in total
  • [1] Personal adaptive clusters as containers for scientific jobs
    Walker, Edward
    Gardner, Jeffrey P.
    Litvin, Vladimir
    Turner, Evan L.
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2007, 10 (03): 339-350
  • [2] Creating personal adaptive clusters for managing scientific jobs in a distributed computing environment
    Walker, Edward
    Gardner, Jeffrey P.
    Litvin, Vladimir
    Turner, Evan L.
    CHALLENGES OF LARGE APPLICATIONS IN DISTRIBUTED ENVIRONMENTS, PROCEEDINGS, 2006, : 95 - +
  • [3] Self-Scaling Clusters and Reproducible Containers to Enable Scientific Computing
    Vaillancourt, Peter Z.
    Coulter, J. Eric
    Knepper, Richard
    Barker, Brandon
    2020 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2020
  • [4] PERSONAL NITROGLYCERIN CONTAINERS
    GUERNSEY, BG
    DOUTRE, WH
    INGRIM, NB
    HOKANSON, JA
    DUNN, JK
    BRYANT, SG
    DRUG INTELLIGENCE & CLINICAL PHARMACY, 1983, 17 (10): 754-755
  • [5] Personal care jobs
    Anon
    Chemical and Engineering News, 2001, 79 (16)
  • [6] INTERVIEWING FOR SCIENTIFIC JOBS
    ARELLANO, S
    CULVER, B
    DAHLQUIST, R
    YAGO, L
    SCIENTIST, 1987, 1 (22): 29
  • [7] Containers and Reproducibility in Scientific Research
    Apostal, Sara Faraji Jalal
    Apostal, David
    Marsh, Ronald
    2018 IEEE INTERNATIONAL CONFERENCE ON ELECTRO/INFORMATION TECHNOLOGY (EIT), 2018: 525-530
  • [8] Adaptive memory allocations in clusters to handle unexpectedly large data-intensive jobs
    Xiao, L
    Chen, SQ
    Zhang, XD
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2004, 15 (07): 577-592
  • [9] Network-Adaptive Scheduling of Data-Intensive Parallel Jobs with Dependencies in Clusters
    Wang, Shaoqi
    Zhou, Xiaobo
    Zhang, Liqiang
    Jiang, Changjun
    2017 IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC COMPUTING (ICAC), 2017: 155-160
  • [10] Scientific Jobs - an industrial perspective
    Concha, Nestor Omar
    FASEB JOURNAL, 2013, 27