LOAD BALANCING AND DATA LOCALITY IN ADAPTIVE HIERARCHICAL N-BODY METHODS - BARNES-HUT, FAST MULTIPOLE, AND RADIOSITY

Cited by: 90
Authors
SINGH, JP
HOLT, C
TOTSUKA, T
GUPTA, A
HENNESSY, J
Affiliation
[1] Computer Systems Laboratory, Stanford University, Stanford
DOI
10.1006/jpdc.1995.1077
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Hierarchical N-body methods, which are based on a fundamental insight into the nature of many physical processes, are increasingly being used to solve large-scale problems in a variety of scientific/engineering domains. Applications that use these methods are challenging to parallelize effectively, however, owing to their nonuniform, dynamically changing characteristics and their need for long-range communication. In this paper, we study the partitioning and scheduling techniques required to obtain effective parallel performance on applications that use a range of hierarchical N-body methods. To obtain representative coverage, we first examine applications that use the two best methods known for classical N-body problems: the Barnes-Hut method and the fast multipole method. Then, we examine a recent hierarchical method for radiosity calculations in computer graphics, which applies the hierarchical N-body approach to a problem with very different characteristics. We find that straightforward decomposition techniques which an automatic scheduler might implement do not scale well, because they are unable to simultaneously provide load balancing and data locality. However, all the applications yield very good parallel performance if appropriate partitioning and scheduling techniques are implemented by the programmer. For the applications that use the Barnes-Hut and fast multipole methods, simple yet effective partitioning techniques can be developed by exploiting some key insights into both the methods and the classical problems that they solve. Using a novel partitioning technique, even relatively small problems achieve 45-fold speedups on a 48-processor Stanford DASH machine (a cache-coherent, shared address space multiprocessor) and 118-fold speedups on a 128-processor simulated architecture. 
The very different characteristics of the radiosity application require a different partitioning/scheduling approach to be used for it; however, it too yields very good parallel performance. © 1995 Academic Press, Inc.
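As an illustration only, the tension the abstract describes between load balancing and data locality can be sketched as follows. This is not the partitioning technique of the paper, which the abstract does not specify. It is a minimal sketch that assumes each body carries a known work estimate and that bodies lie on an integer 2D grid. The names `morton_key_2d` and `partition_by_cost` and the `(x, y, cost)` layout are hypothetical. Bodies are ordered along a Morton (Z-order) curve so that contiguous chunks stay spatially compact, and the ordered sequence is then cut into pieces of roughly equal total cost.

```python
from typing import List, Sequence, Tuple


def morton_key_2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of integer grid coordinates x and y."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return key


def partition_by_cost(
    bodies: Sequence[Tuple[int, int, float]], num_procs: int
) -> List[List[int]]:
    """Assign body indices to processors in contiguous, cost-balanced chunks.

    `bodies` holds hypothetical (x, y, cost) triples, where cost is an assumed
    per-body work estimate (for example, interactions counted in the previous
    time step).
    """
    # Locality: sort along the space-filling curve so neighbours stay together.
    order = sorted(
        range(len(bodies)),
        key=lambda i: morton_key_2d(bodies[i][0], bodies[i][1]),
    )
    total = sum(b[2] for b in bodies)
    parts: List[List[int]] = [[] for _ in range(num_procs)]
    running = 0.0
    for i in order:
        # Load balance: a body goes to the processor whose share of the total
        # cost contains the start of this body's cost interval.
        if total > 0:
            proc = min(int(running * num_procs / total), num_procs - 1)
        else:
            proc = 0
        parts[proc].append(i)
        running += bodies[i][2]
    return parts


if __name__ == "__main__":
    sample = [(0, 0, 3.0), (1, 0, 1.0), (0, 1, 1.0), (1, 1, 1.0)]
    print(partition_by_cost(sample, 2))
```

Splitting by body count alone would give each processor two bodies in the sample above but unequal work; splitting by cost keeps the work even while the curve ordering keeps each chunk spatially contiguous.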
Pages: 118-141
Number of pages: 24