Population-Based MCMC on Multi-Core CPUs, GPUs and FPGAs

Cited by: 10
Authors
Mingas, Grigorios [1]
Bouganis, Christos-Savvas [1]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Elect & Elect Engn, London SW7 2AZ, England
Keywords
Field programmable gate array; graphics processing unit; Markov Chain Monte Carlo; parallel tempering; custom arithmetic precision; PARALLEL; SIMULATION; PRECISION;
DOI
10.1109/TC.2015.2439256
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Markov Chain Monte Carlo (MCMC) is a method for drawing samples from a given probability distribution. Its frequent use for solving probabilistic inference problems, where large-scale data are repeatedly processed, means that MCMC runtimes can be unacceptably long. This paper focuses on population-based MCMC, a popular family of computationally intensive MCMC samplers; we propose novel, highly optimized accelerators on three parallel hardware platforms (multi-core CPUs, GPUs and FPGAs) in order to address the performance limitations of sequential software implementations. For each platform, we jointly exploit the nature of the underlying hardware and the special characteristics of population-based MCMC. We focus particularly on the use of custom arithmetic precision, introducing two novel methods which employ custom precision in the largest part of the algorithm in order to reduce runtime without causing sampling errors. We apply these methods to all platforms. The FPGA accelerators are up to 114x faster than multi-core CPUs and up to 53x faster than GPUs when doing inference on mixture models.
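For readers unfamiliar with the sampler family named in the abstract and keywords, the following is a minimal, sequential sketch of parallel tempering, one common population-based MCMC scheme. It is not the authors' accelerator design and does not use custom precision. The target (a two-component 1-D Gaussian mixture), the temperature ladder, the step size and the names `log_target`, `accept` and `parallel_tempering` are all assumptions chosen for illustration.

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of an illustrative equal-weight
    mixture of N(-3, 1) and N(3, 1). This target is an assumption."""
    a = -0.5 * (x + 3.0) ** 2
    b = -0.5 * (x - 3.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def accept(log_ratio, rng):
    """Metropolis acceptance test for a log acceptance ratio."""
    return log_ratio >= 0.0 or rng.random() < math.exp(log_ratio)


def parallel_tempering(n_iters, temperatures, step=1.0, seed=0):
    """Run one chain per temperature; chain i targets pi(x)^(1/T_i).

    Returns the samples of the first chain, which should have T = 1
    so that it targets the untempered distribution.
    """
    rng = random.Random(seed)
    n_chains = len(temperatures)
    states = [0.0] * n_chains
    logp = [log_target(x) for x in states]
    samples = []

    for _ in range(n_iters):
        # Local update: independent random-walk Metropolis step per chain.
        # These steps do not depend on each other, which is the part that
        # parallel hardware can spread across cores or processing elements.
        for i, temp in enumerate(temperatures):
            proposal = states[i] + rng.gauss(0.0, step)
            lp = log_target(proposal)
            if accept((lp - logp[i]) / temp, rng):
                states[i], logp[i] = proposal, lp

        # Global update: propose swapping the states of adjacent chains.
        for i in range(n_chains - 1):
            log_ratio = (logp[i + 1] - logp[i]) * (
                1.0 / temperatures[i] - 1.0 / temperatures[i + 1]
            )
            if accept(log_ratio, rng):
                states[i], states[i + 1] = states[i + 1], states[i]
                logp[i], logp[i + 1] = logp[i + 1], logp[i]

        samples.append(states[0])

    return samples


if __name__ == "__main__":
    draws = parallel_tempering(5000, [1.0, 2.0, 4.0, 8.0])
    print(len(draws), min(draws), max(draws))
```

The hotter chains (larger T) see a flattened target and move between modes more easily; the swap step lets those moves propagate down to the T = 1 chain.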
Pages: 1283-1296
Number of pages: 14