Acceleration of large-scale CGH generation using multi-GPU cluster

Cited by: 1
Authors
Watanabe, Shinpei [1 ]
Jackin, Boaz Jessie [2 ]
Ohkawa, Takeshi [1 ]
Ootsu, Kanemitsu [1 ]
Yokota, Takashi [1 ]
Hayasaki, Yoshio [3 ]
Yatagai, Toyohiko [3 ]
Baba, Takanobu [3 ]
Affiliations
[1] Utsunomiya Univ, Grad Sch Engn, Dept Informat Syst Sci, 7-1-2 Yoto, Utsunomiya, Tochigi 3218585, Japan
[2] Natl Inst Informat & Commun Technol, 4-2-1 Nukuikitamachi, Koganei, Tokyo 1848795, Japan
[3] Utsunomiya Univ, Ctr Opt Res & Educ, 7-1-2 Yoto, Utsunomiya, Tochigi 3218585, Japan
Keywords
CGH; multi-GPU; cluster; object decomposition method; optimization
DOI
10.1109/CANDAR.2017.53
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Computer-generated hologram (CGH) is a promising technology for realizing 3D displays. Large-scale CGH has the advantage of resolving problems of existing 3D displays. However, generating a large-scale CGH requires memory space and computation time in proportion to the number of pixels. Furthermore, to be used as a display, a CGH must be generated in real time, which is why CGH is not yet suited to practical use. CGH computation consists of data-independent operations, and current GPUs have thousands of processing cores, so GPUs can be expected to accelerate CGH generation. To accelerate CGH generation, we apply several parallelization and optimization techniques to the CGH program, both on a single node and across multiple nodes. The single-node optimizations include the object decomposition scheme, reduction of the amount of data transferred between CPU and GPU, kernel integration, stream processing, and the use of multi-GPU parallelism. The multi-node optimization includes an inter-node data distribution method. The results show a 134.7-fold speed-up over sequential program execution on a CPU.
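The abstract does not spell out the object decomposition method, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes a point-source hologram model in which the object is a list of points, the points are split into subsets, each subset's complex field is computed independently (the part that could be assigned to one GPU or node), and the partial fields are summed. The function names (`partial_field`, `decompose`, `hologram`) and all parameters are hypothetical, and plain Python stands in for GPU kernels.

```python
import cmath
import math


def partial_field(points, width, height, pitch, wavelength):
    """Complex field on the hologram plane from one subset of point sources.

    Each point is (x, y, z, amplitude) in metres; the point-source model
    is an assumption made for this sketch.
    """
    k = 2.0 * math.pi / wavelength  # wavenumber
    field = [[0j] * width for _ in range(height)]
    for px, py, pz, amp in points:
        for row in range(height):
            y = (row - height / 2) * pitch
            for col in range(width):
                x = (col - width / 2) * pitch
                r = math.sqrt((x - px) ** 2 + (y - py) ** 2 + pz ** 2)
                # Every pixel is computed independently of the others,
                # which is what makes the work data-parallel.
                field[row][col] += amp * cmath.exp(1j * k * r) / r
    return field


def decompose(points, n_parts):
    """Split the object points into n_parts interleaved subsets."""
    return [points[i::n_parts] for i in range(n_parts)]


def hologram(points, n_parts, width, height, pitch, wavelength):
    """Sum the partial fields of all subsets (one subset per worker)."""
    total = [[0j] * width for _ in range(height)]
    for subset in decompose(points, n_parts):
        part = partial_field(subset, width, height, pitch, wavelength)
        total = [[a + b for a, b in zip(ra, rb)]
                 for ra, rb in zip(total, part)]
    return total
```

Because the partial fields simply add, the result does not depend on how many subsets are used; in a real multi-GPU setting each call to `partial_field` would run on a separate device and the final sum would be a reduction step.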
Pages: 589-593
Number of pages: 5