FINE-GRAINED MULTITHREADING SUPPORT FOR HYBRID THREADED MPI PROGRAMMING

Cited by: 34
Authors
Balaji, Pavan [1]
Buntinas, Darius [1]
Goodell, David [1]
Gropp, William [2]
Thakur, Rajeev [1]
Affiliations
[1] Argonne Natl Lab, Math & Comp Sci Div, Argonne, IL 60439 USA
[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Keywords
MPI; threads; hybrid programming; fine-grained locks;
DOI
10.1177/1094342009360206
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
As high-end computing systems continue to grow in scale, recent advances in multi- and many-core architectures have pushed such growth toward denser architectures, that is, more processing elements per physical node, rather than more physical nodes themselves. Although a large number of scientific applications have so far relied on an MPI-everywhere model for programming high-end parallel systems, this model may not be sufficient for future machines, given their physical constraints such as decreasing amounts of memory per processing element and shared caches. As a result, application and computer scientists are exploring alternative programming models that involve using MPI between address spaces and some other threaded model, such as OpenMP, Pthreads, or Intel TBB, within an address space. Such hybrid models require efficient support from an MPI implementation for MPI messages sent from multiple threads simultaneously. In this paper, we explore the issues involved in designing such an implementation. We present four approaches to building a fully thread-safe MPI implementation, with decreasing levels of critical-section granularity (from coarse-grain locks to fine-grain locks to lock-free operations) and correspondingly increasing levels of complexity. We present performance results that demonstrate the performance implications of the different approaches.
Pages: 49-57
Page count: 9