Generation of permutations for SIMD processors

Cited by: 4
Authors
Kudriavtsev, A [1]
Kogge, P [1]
Affiliation
[1] Univ Notre Dame, Dept Comp Sci & Engn, Notre Dame, IN 46556 USA
Keywords
SIMD; permutations
DOI
10.1145/1070891.1065931
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Short vector (SIMD) instructions are useful in signal processing, multimedia, and scientific applications. They offer higher performance, lower energy consumption, and better resource utilization. However, compilers still lack good support for SIMD instructions, and code often has to be written by hand in assembly language or with compiler built-in functions. In addition, some applications could achieve higher parallelism if compilers inserted permutation instructions that reorder the data in registers. In this paper we describe how we create SIMD instructions from regular code and determine the ordering of individual operations within the SIMD instructions so as to minimize the number of permutation instructions. Individual memory operations are grouped into SIMD operations based on their effective addresses. The SIMD data flow graph is then constructed by following data dependences from the SIMD memory operations, and the orderings of operations are propagated from the SIMD memory operations into the graph. We also describe our approach to computing a decomposition of a given permutation into the permutation instructions of the target architecture. Experiments with our prototype compiler show that this approach scales well with the number of operations in SIMD instructions (the SIMD width) and can be used to compile a number of important kernels, achieving up to 35% speedup.
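The abstract mentions decomposing a given permutation into the permutation instructions of the target architecture but does not say how. As a rough illustration only, and not the authors' algorithm, the sketch below does a breadth-first search over lane orderings for a hypothetical 4-lane machine. The primitive names (`swap_pairs`, `swap_halves`, `rotate_left`) and the helpers `apply_perm` and `decompose` are assumptions made for this example and do not correspond to any real instruction set.

```python
from collections import deque

# Hypothetical 4-lane permutation primitives (not a real ISA).
# Each tuple p means: output lane i takes its value from input lane p[i].
PRIMITIVES = {
    "swap_pairs": (1, 0, 3, 2),
    "swap_halves": (2, 3, 0, 1),
    "rotate_left": (1, 2, 3, 0),
}


def apply_perm(perm, lanes):
    """Apply one primitive to a tuple of lane contents."""
    return tuple(lanes[i] for i in perm)


def decompose(target, primitives=PRIMITIVES):
    """Return a shortest list of primitive names that turns the identity
    ordering into `target`, or None if the primitives cannot produce it."""
    target = tuple(target)
    start = tuple(range(len(target)))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        lanes, path = queue.popleft()
        if lanes == target:
            return path
        for name, perm in primitives.items():
            nxt = apply_perm(perm, lanes)
            if nxt not in seen:  # visit each lane ordering only once
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None
```

Calling `decompose((3, 2, 1, 0))` asks for a full lane reversal. With the assumed primitives the search returns a two-step sequence, while an ordering outside the set the primitives can generate, such as `(1, 0, 2, 3)`, yields `None`. Exhaustive search like this is only practical for small SIMD widths, since the number of lane orderings grows factorially.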
Pages: 147-156
Number of pages: 10
Related papers
50 in total
  • [1] Generating SIMD vectorized permutations
    Franchetti, Franz
    Pueschel, Markus
    COMPILER CONSTRUCTION, 2008, 4959 : 116 - 131
  • [2] DC-SIMD: Dynamic communication for SIMD processors
    Frijns, Raymond
    Fatemi, Hamed
    Mesman, Bart
    Corporaal, Henk
    2008 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-8, 2008, : 1436 - 1445
  • [3] SYNTHESIS OF DEDICATED SIMD PROCESSORS
    AUGUIN, M
    BOERI, F
    CARRIERE, C
    MENEZ, G
    JOURNAL OF VLSI SIGNAL PROCESSING, 1995, 9 (03): : 167 - 179
  • [4] Optimizing data permutations for SIMD devices
    Ren, Gang
    Wu, Peng
    Padua, David
    ACM SIGPLAN NOTICES, 2006, 41 (06) : 118 - 131
  • [5] Design and automatic code generation of the LMS algorithm for SIMD signal processors.
    Robelly, JP
    Cichon, G
    Seidel, H
    Fettweis, G
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 81 - 84
  • [6] Efficient SIMD optimization for media processors
    Zhou, Jian-Peng
    Shi, Ce
    JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE A, 2008, 9 (04): : 524 - 530
  • [7] On dependence analysis for SIMD enhanced processors
    Bulic, P
    Gustin, V
    HIGH PERFORMANCE COMPUTING FOR COMPUTATIONAL SCIENCE - VECPAR 2004, 2005, 3402 : 527 - 540
  • [8] Efficient SIMD optimization for media processors
    Jian-peng Zhou
    Ce Shi
    Journal of Zhejiang University-SCIENCE A, 2008, 9 : 524 - 530
  • [9] Compiler optimizations for processors with SIMD instructions
    Pryanishnikov, Ivan
    Krall, Andreas
    Horspool, Nigel
    SOFTWARE-PRACTICE & EXPERIENCE, 2007, 37 (01): : 93 - 113