Evaluation of Flash-based Out-of-core Stencil Computation Algorithms for SSD-Equipped Clusters

Authors
Midorikawa, Hiroko [1]
Tan, Hideyuki [1]
Affiliations
[1] Seikei Univ, Dept Comp & Informat Sci, JST CREST, Tokyo, Japan
Keywords
Non-volatile memory; flash memory; memory hierarchy; temporal blocking; stencil; out-of-core; asynchronous I/O; access locality; NUMA; auto-tuning; SSD-equipped cluster
DOI
10.1109/ICPADS.2016.135
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a new scheme for large-scale stencil computations whose data size exceeds the total main memory of the nodes in a cluster. It uses the flash SSDs distributed across the cluster nodes as an extension of main memory, together with a locality-aware algorithm. Three algorithms, each with a different hierarchical blocking scheme for the three memory tiers (flash SSD, DRAM, and cache), are proposed and evaluated on different platforms and flash devices. They exploit highly parallel asynchronous input/output on flash SSDs and select appropriate blocking parameters with an auto-tuning system named Blk-Tune. They also overcome the performance degradation caused by the non-uniform memory architecture (NUMA). The algorithms optimized for single nodes are extended to multiple nodes and evaluated on a cluster with traditional SATA SSDs as well as with state-of-the-art flash devices such as low-power, cost-effective M.2 NVMe flash SSDs. With this scheme and distributed flash devices in a cluster, large-scale stencil problems can be solved with a limited number of nodes and moderately sized main memories.
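The temporal blocking named in the keywords can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a 1D three-point averaging stencil with fixed end cells, and it uses an in-memory Python list as a stand-in for the slow tier (the flash SSD). All function and parameter names (`step`, `naive`, `temporally_blocked`, `block`, `t_block`) are hypothetical. The point it shows is that loading each block with a halo of `t_block` cells per side lets several time steps run per load, so the slow tier is read fewer times.

```python
def step(u):
    """One 3-point averaging step; the first and last cells are held fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return new


def naive(u, n_steps):
    """Reference: apply the stencil to the whole domain, one step at a time."""
    for _ in range(n_steps):
        u = step(u)
    return u


def temporally_blocked(storage, n_steps, block, t_block):
    """Advance n_steps using spatial blocks that each run t_block steps per load.

    `storage` stands in for the slow tier (assumption: a plain list).
    Returns the final domain and the number of block loads performed.
    """
    n = len(storage)
    src = storage[:]
    loads = 0
    done = 0
    while done < n_steps:
        t = min(t_block, n_steps - done)
        dst = [0.0] * n  # separate output buffer so neighbours still see old data
        for s in range(0, n, block):
            e = min(s + block, n)
            # Halo of t cells per side, clipped at the real domain boundary.
            lo, hi = max(0, s - t), min(n, e + t)
            local = src[lo:hi]  # one read from the slow tier
            loads += 1
            for _ in range(t):
                local = step(local)
            # Stale values creep inward one cell per step from each halo edge,
            # so after t steps only the halo is wrong; keep the interior.
            dst[s:e] = local[s - lo:e - lo]
        src = dst
        done += t
    return src, loads


if __name__ == "__main__":
    u0 = [float((i * 7) % 11) for i in range(64)]
    ref = naive(u0, 8)
    out, loads = temporally_blocked(u0, 8, block=16, t_block=4)
    print(max(abs(a - b) for a, b in zip(out, ref)), loads)
```

In this toy setting, raising `t_block` trades redundant halo computation for fewer loads from the slow tier. Choosing such blocking parameters well is the kind of decision the abstract attributes to the Blk-Tune auto-tuner.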
Pages: 1031-1040
Page count: 10