Locality optimizations for Jacobi iteration on distributed parallel systems

Cited by: 0
Authors
Che, YG [1]
Wang, ZH
Li, XM
Yang, LT
Affiliations
[1] Natl Univ Def Technol, Sch Comp, Changsha 410073, Peoples R China
[2] St Francis Xavier Univ, Dept Comp Sci, Antigonish, NS B2G 2W5, Canada
[3] Inst Equipment & Command Technol, Beijing, Peoples R China
Source
PARALLEL AND DISTRIBUTED PROCESSING AND APPLICATIONS, PROCEEDINGS | 2004 / Vol. 3358
Keywords
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
In this paper, we propose an inter-nest cache reuse optimization method for Jacobi codes. The method is easy to apply yet effective: it enhances the cache locality of Jacobi codes while preserving their coarse-grain parallelism. We compare our method with two previous locality enhancement techniques that can be applied to Jacobi codes: time skewing and new tiling. We quantitatively calculate the main factors contributing to the runtime of the different Jacobi codes, and we perform experiments on a PC cluster to verify our analysis. The results show that our method performs worse than time skewing and new tiling on a uniprocessor, but better on a distributed parallel system.
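For readers unfamiliar with the term, the following is a minimal sketch of what reuse between loop nests can look like in a 2D Jacobi code. It is an illustration under stated assumptions, not the algorithm from the paper: the abstract gives no implementation details, and the function names `jacobi_two_nests` and `jacobi_fused`, the four-point averaging stencil, and the row-shifted copy are all hypothetical choices made here. The first function uses the textbook form with a separate compute nest and copy nest per time step; the second folds the copy into the compute nest, shifted by one row, so each row of `b` is copied back shortly after it is produced.

```python
def jacobi_two_nests(grid, steps):
    """Textbook Jacobi: one compute nest and one copy nest per time step."""
    n, m = len(grid), len(grid[0])
    a = [row[:] for row in grid]
    b = [row[:] for row in grid]
    for _ in range(steps):
        # Nest 1: compute the new interior values into b.
        for i in range(1, n - 1):
            for j in range(1, m - 1):
                b[i][j] = 0.25 * (a[i - 1][j] + a[i + 1][j]
                                  + a[i][j - 1] + a[i][j + 1])
        # Nest 2: copy b back into a (a second full pass over the data).
        for i in range(1, n - 1):
            for j in range(1, m - 1):
                a[i][j] = b[i][j]
    return a


def jacobi_fused(grid, steps):
    """Illustrative variant (an assumption, not the paper's method):
    the copy of row i-1 is done right after row i is computed."""
    n, m = len(grid), len(grid[0])
    a = [row[:] for row in grid]
    b = [row[:] for row in grid]
    for _ in range(steps):
        for i in range(1, n - 1):
            for j in range(1, m - 1):
                b[i][j] = 0.25 * (a[i - 1][j] + a[i + 1][j]
                                  + a[i][j - 1] + a[i][j + 1])
            # Row i-1 of a is no longer read by later rows in this step,
            # so it can be overwritten while b[i-1] is likely still cached.
            if i > 1:
                for j in range(1, m - 1):
                    a[i - 1][j] = b[i - 1][j]
        # The last interior row has no successor row to trigger its copy.
        if n > 2:
            for j in range(1, m - 1):
                a[n - 2][j] = b[n - 2][j]
    return a


if __name__ == "__main__":
    # Small grid with a hot top boundary; both variants should agree.
    start = [[100.0] * 6] + [[0.0] * 6 for _ in range(5)]
    print(jacobi_two_nests(start, 4) == jacobi_fused(start, 4))
```

Both functions perform the same arithmetic in the same order, so they should return identical grids; only the order of memory accesses differs. The time loop stays outermost in both, so a row-wise domain decomposition across processes would remain possible in this sketch.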
Pages: 91-104
Page count: 14