Exploiting vector parallelism in software pipelined loops

Authors
Larsen, S [1]
Rabbah, R [1]
Amarasinghe, S [1]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
Source
MICRO-38: PROCEEDINGS OF THE 38TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE | 2005
Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditional vectorization technology first developed for supercomputers. In contrast, scalar hardware is typically targeted using ILP techniques such as software pipelining. This paper presents a novel approach for exploiting vector parallelism in software pipelined loops. The proposed methodology (i) lowers the burden on the scalar resources by offloading computation to the vector functional units, (ii) explicitly manages communication of operands between scalar and vector instructions, (iii) naturally handles misaligned vector memory operations, and (iv) partially (or fully) inhibits the optimization when vectorization will decrease performance. Our approach results in better resource utilization and allows for software pipelining with shorter initiation intervals. The proposed optimization is applied in the compiler backend, where vectorization decisions are more amenable to cost analysis. This is unique in that traditional vectorization optimizations are usually carried out at the statement level. Although our technique most naturally complements statically scheduled machines, we believe it is applicable to any architecture that tightly integrates support for instruction and data level parallelism. We evaluate our methodology using nine SPEC FP benchmarks. In comparison to software pipelining, our approach achieves a maximum speedup of 1.38x, with an average of 1.11x.
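The abstract's claim that offloading work to the vector units permits shorter initiation intervals can be illustrated with the standard resource-constrained lower bound from modulo scheduling (the largest, over resource classes, of operations divided by available units, rounded up). The sketch below is a toy model and not the paper's algorithm: the function names, the machine description, and the exhaustive search are all assumptions made for illustration, and it ignores the scalar-vector operand communication and misalignment costs that the paper accounts for.

```python
from math import ceil


def res_mii(op_counts, unit_counts):
    """Resource-constrained lower bound on the initiation interval.

    op_counts:   operations issued per loop-body instance, per resource class
    unit_counts: functional units available per resource class
    """
    return max(
        (ceil(n / unit_counts[r]) for r, n in op_counts.items()),
        default=0,
    )


def best_split(ops_per_iter, vector_length, unit_counts):
    """Hypothetical exhaustive search over how many ops to vectorize.

    The loop body is assumed to be unrolled by `vector_length`. An op that
    is vectorized issues once on a vector unit; an op left scalar issues
    `vector_length` times on the scalar units. Every op is assumed to be
    vectorizable, which real loops do not guarantee.
    Returns (ops_vectorized, cycles per unrolled body).
    """
    best = None
    for k in range(ops_per_iter + 1):
        counts = {
            "scalar": vector_length * (ops_per_iter - k),
            "vector": k,
        }
        ii = res_mii(counts, unit_counts)
        if best is None or ii < best[1]:
            best = (k, ii)
    return best


if __name__ == "__main__":
    # Assumed machine: 2 scalar FP units, 1 vector unit, vector length 2.
    units = {"scalar": 2, "vector": 1}
    print("scalar only     :", res_mii({"scalar": 16, "vector": 0}, units))
    print("fully vectorized:", res_mii({"scalar": 0, "vector": 8}, units))
    print("best split      :", best_split(8, 2, units))
```

Under these assumed numbers, leaving all eight operations scalar or vectorizing all eight both give a bound of 8 cycles per two original iterations, while vectorizing four of them balances the load at 4 cycles. This is the intuition behind partially inhibiting vectorization rather than treating it as all-or-nothing.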
Pages: 119-129
Page count: 11