The Hector distributed run-time environment

Cited by: 12
Authors
Russ, SH
Robinson, J
Flachs, BK
Heckel, B
Affiliations
[1] Mississippi State Univ, Engn Res Ctr, Mississippi State, MS 39762 USA
[2] Adv Microelect, Ridgeland, MS 39157 USA
[3] IBM Corp, Austin Res Lab, Austin, TX 78758 USA
[4] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
Keywords
parallel computing; load balancing; fault tolerance; resource allocation; task migration
DOI
10.1109/71.735957
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Harnessing the computational capabilities of a network of workstations promises to off-load work from overloaded supercomputers onto largely idle resources overnight. Several capabilities are needed to do this, including support for an architecture-independent parallel programming environment, task migration, automatic resource allocation, and fault tolerance. The Hector distributed run-time environment is designed to present these capabilities transparently to programmers. MPI programs can be run under this environment on homogeneous clusters with no modifications to their source code. The design of Hector, its internal structure, and several benchmarks and tests are presented.
Pages: 1102-1114
Page count: 13