Population-Based MCMC on Multi-Core CPUs, GPUs and FPGAs

Cited by: 10
Authors
Mingas, Grigorios [1]
Bouganis, Christos-Savvas [1]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Elect & Elect Engn, London SW7 2AZ, England
Keywords
Field programmable gate array; graphics processing unit; Markov Chain Monte Carlo; parallel tempering; custom arithmetic precision; PARALLEL; SIMULATION; PRECISION
DOI
10.1109/TC.2015.2439256
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Markov Chain Monte Carlo (MCMC) is a method for drawing samples from a given probability distribution. Because it is frequently used to solve probabilistic inference problems in which large-scale data are processed repeatedly, MCMC runtimes can be unacceptably long. This paper focuses on population-based MCMC, a popular family of computationally intensive MCMC samplers. We propose novel, highly optimized accelerators on three parallel hardware platforms (multi-core CPUs, GPUs and FPGAs) to address the performance limitations of sequential software implementations. For each platform, we jointly exploit the nature of the underlying hardware and the special characteristics of population-based MCMC. We focus particularly on custom arithmetic precision, introducing two novel methods that employ custom precision in the largest part of the algorithm to reduce runtime without causing sampling errors. We apply these methods to all platforms. The FPGA accelerators are up to 114x faster than multi-core CPUs and up to 53x faster than GPUs when performing inference on mixture models.
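For readers unfamiliar with population-based MCMC, the following is a minimal sequential sketch of parallel tempering, one sampler of this family that appears in the keywords. It is not the paper's implementation. The target density, temperature ladder, step size, and the names `log_target` and `parallel_tempering` are illustrative assumptions, and the sketch uses ordinary double precision rather than any custom-precision scheme.

```python
import math
import random

def log_target(x):
    # Assumed toy target: unnormalised log-density of a two-component
    # Gaussian mixture with modes at -3 and +3 (not from the paper).
    a = -0.5 * (x + 3.0) ** 2
    b = -0.5 * (x - 3.0) ** 2
    m = max(a, b)
    return m + math.log(0.5 * math.exp(a - m) + 0.5 * math.exp(b - m))

def parallel_tempering(n_iters, temps, step=1.0, seed=0):
    # temps[0] must be 1.0 so that chain 0 samples the actual target.
    rng = random.Random(seed)
    chains = [0.0] * len(temps)
    samples = []
    tried = accepted = 0
    for _ in range(n_iters):
        # Local updates: one random-walk Metropolis step per chain,
        # each targeting pi(x)^(1/T). Chains are independent here.
        for i, t in enumerate(temps):
            prop = chains[i] + rng.gauss(0.0, step)
            log_ratio = (log_target(prop) - log_target(chains[i])) / t
            if rng.random() < math.exp(min(0.0, log_ratio)):
                chains[i] = prop
        # Global update: propose swapping a random pair of adjacent chains.
        j = rng.randrange(len(temps) - 1)
        log_ratio = (1.0 / temps[j] - 1.0 / temps[j + 1]) * (
            log_target(chains[j + 1]) - log_target(chains[j])
        )
        tried += 1
        if rng.random() < math.exp(min(0.0, log_ratio)):
            chains[j], chains[j + 1] = chains[j + 1], chains[j]
            accepted += 1
        samples.append(chains[0])  # keep only the T = 1 chain
    return samples, accepted / tried

if __name__ == "__main__":
    draws, swap_rate = parallel_tempering(5000, [1.0, 2.0, 4.0, 8.0])
    print(len(draws), round(swap_rate, 3))
```

In this sketch the per-chain local updates do not depend on one another within an iteration, which is the kind of structure a parallel platform could plausibly exploit. How the paper's accelerators actually map the algorithm to hardware is not described in this record.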
Pages: 1283-1296
Number of pages: 14