Zero-Overhead Parallel Scans for Multi-Core CPUs

Cited by: 1
Authors
de Wolff, Ivo Gabe [1]
van Balen, David P. [1]
Keller, Gabriele K. [1]
McDonell, Trevor L. [1]
Affiliations
[1] Univ Utrecht, Utrecht, Netherlands
Keywords
DOI
10.1145/3649169.3649248
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
We present three novel parallel scan algorithms for multi-core CPUs which do not need to fix the number of available cores at the start, and have zero overhead compared to sequential scans when executed on a single core. These two properties are in contrast with most existing parallel scan algorithms, which are asymptotically optimal, but have a constant factor overhead compared to sequential scans when executed on a single core. We achieve these properties by adapting the classic three-phase scan algorithms. The resulting algorithms also exhibit better performance than the original ones on multiple cores. Furthermore, we adapt the chained scan with decoupled look-back algorithm to also have these two properties. While this algorithm was originally designed for GPUs, we show it is also suitable for multi-core CPUs, outperforming the classic three-phase scans in our benchmarks, by better using the caches of the processor at the cost of more synchronisation. In general our adaptive chained scan is the fastest parallel scan, but in specific situations our assisted reduce-then-scan is better.
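For orientation, the classic three-phase (reduce-then-scan) scheme that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the textbook baseline, not the authors' algorithms: the function name `three_phase_scan`, the `num_blocks` parameter, and the use of a thread pool are assumptions made for this example. Python threads will not give a real multi-core speedup here; the code only shows the structure of the three phases.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate
import operator


def three_phase_scan(xs, num_blocks=4, op=operator.add):
    """Inclusive scan via classic reduce-then-scan (illustrative sketch only).

    `op` must be associative; it does not need to be commutative.
    """
    n = len(xs)
    if n == 0:
        return []
    # The block count is fixed up front, which is typical of the classic scheme.
    num_blocks = max(1, min(num_blocks, n))
    # Split the input into contiguous, non-empty blocks of near-equal size.
    bounds = [(i * n // num_blocks, (i + 1) * n // num_blocks)
              for i in range(num_blocks)]

    def reduce_block(bound):
        lo, hi = bound
        acc = xs[lo]
        for x in xs[lo + 1:hi]:
            acc = op(acc, x)
        return acc

    def scan_block(args):
        (lo, hi), prefix = args
        out = []
        acc = prefix  # None marks the first block, which has no incoming prefix
        for x in xs[lo:hi]:
            acc = x if acc is None else op(acc, x)
            out.append(acc)
        return out

    with ThreadPoolExecutor(max_workers=num_blocks) as pool:
        # Phase 1 (parallel): reduce every block to a single total.
        totals = list(pool.map(reduce_block, bounds))
        # Phase 2 (sequential): scan the totals to get each block's prefix.
        prefixes = [None] + list(accumulate(totals[:-1], op))
        # Phase 3 (parallel): scan every block again, seeded with its prefix.
        parts = list(pool.map(scan_block, zip(bounds, prefixes)))

    return [y for part in parts for y in part]


if __name__ == "__main__":
    print(three_phase_scan(list(range(1, 11)), num_blocks=3))
```

In this baseline every element is visited twice, once in phase 1 and once in phase 3, even when only one worker is available, and the number of blocks has to be chosen before the scan starts. These are the kinds of costs the abstract says the adapted algorithms avoid.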
Pages: 52-61
Page count: 10