Exploiting vector parallelism in software pipelined loops

Cited by: 0

Authors:

Larsen, S [1]

Rabbah, R [1]

Amarasinghe, S [1]

Affiliation:

[1] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA

Source:

MICRO-38: PROCEEDINGS OF THE 38TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE | 2005

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditional vectorization technology first developed for supercomputers. In contrast, scalar hardware is typically targeted using ILP techniques such as software pipelining. This paper presents a novel approach for exploiting vector parallelism in software pipelined loops. The proposed methodology (i) lowers the burden on the scalar resources by offloading computation to the vector functional units, (ii) explicitly manages communication of operands between scalar and vector instructions, (iii) naturally handles misaligned vector memory operations, and (iv) partially (or fully) inhibits the optimization when vectorization will decrease performance. Our approach results in better resource utilization and allows for software pipelining with shorter initiation intervals. The proposed optimization is applied in the compiler backend, where vectorization decisions are more amenable to cost analysis. This is unique in that traditional vectorization optimizations are usually carried out at the statement level. Although our technique most naturally complements statically scheduled machines, we believe it is applicable to any architecture that tightly integrates support for instruction- and data-level parallelism. We evaluate our methodology using nine SPEC FP benchmarks. In comparison to software pipelining, our approach achieves a maximum speedup of 1.38x, with an average of 1.11x.
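To illustrate why offloading only part of a loop body to the vector units can shorten the initiation interval, the following is a minimal sketch and not the paper's algorithm. It uses the standard resource-constrained lower bound on the initiation interval (the busiest resource class divided by its unit count). The machine (2 scalar units, 1 vector unit), the vector length of 2, the 6-operation loop body, and the names `res_mii` and `selective_ii` are all hypothetical, and the sketch ignores the operand communication and misalignment costs that the abstract says the real method accounts for.

```python
from math import ceil

def res_mii(ops, units):
    """Resource-constrained lower bound on the initiation interval (II):
    the most heavily loaded resource class limits how often a new
    (unrolled) iteration can start."""
    return max(ceil(ops[r] / units[r]) for r in ops)

def selective_ii(n_ops, k_vector, vl, units):
    """Lower bound on II for `vl` original iterations when `k_vector` of the
    `n_ops` operations run as vector instructions and the remaining ones
    run as `vl` scalar copies each. Transfer costs are ignored (assumption)."""
    ops = {"scalar": vl * (n_ops - k_vector), "vector": k_vector}
    return res_mii(ops, units)

# Hypothetical machine and loop, chosen only for illustration.
UNITS = {"scalar": 2, "vector": 1}
N_OPS, VL = 6, 2

# Try every split from "all scalar" (k = 0) to "fully vectorized" (k = N_OPS).
results = {k: selective_ii(N_OPS, k, VL, UNITS) for k in range(N_OPS + 1)}
best_k = min(results, key=results.get)

for k, ii in results.items():
    print(f"{k} vector ops -> II >= {ii} cycles per {VL} iterations")
print("best split:", best_k, "vector ops")
```

Under these assumed numbers, both the all-scalar and the fully vectorized schedules are bounded at 6 cycles per 2 iterations, while moving 3 of the 6 operations to the vector unit balances the load and lowers the bound to 3 cycles. This is the kind of trade-off that motivates partial vectorization of software pipelined loops.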


Pages: 119-129

Number of pages: 11

50 entries in total

[1] EXPLOITING THE PARALLELISM AVAILABLE IN LOOPS
LILJA, DJ
COMPUTER, 1994, 27 (02) : 13 - 26
[2] Software prefetching for software pipelined loops
Sanchez, FJ
Gonzalez, A
PROCEEDINGS OF THE THIRTY-FIRST HAWAII INTERNATIONAL CONFERENCE ON SYSTEM SCIENCES, VOL VII: SOFTWARE TECHNOLOGY TRACK, 1998, : 778 - 779
[3] Software data prefetching for software pipelined loops
Sánchez, J
González, A
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1999, 58 (02) : 236 - 259
[4] REGISTER ALLOCATION FOR SOFTWARE PIPELINED LOOPS
RAU, BR
LEE, M
TIRUMALAI, PP
SCHLANSKER, MS
SIGPLAN NOTICES, 1992, 27 (07) : 283 - 299
[5] Exploiting Task-based Parallelism in Application Loops
Cui, Han
Dahnoun, Naim
2019 8TH MEDITERRANEAN CONFERENCE ON EMBEDDED COMPUTING (MECO), 2019, : 717 - 721
[6] Register allocation for software pipelined multidimensional loops
Rong, Hongbo
Douillet, Alban
Gao, Guang R.
ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS, 2008, 30 (04):
[7] Control Flow Regeneration for Software Pipelined Loops with Conditions
Dragan Milicev
Zoran Jovanovic
International Journal of Parallel Programming, 2002, 30 : 149 - 179
[8] Quantitative evaluation of register pressure on software pipelined loops
Llosa, J
Ayguade, E
Valero, M
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 1998, 26 (02) : 121 - 142
[9] Control flow regeneration for software pipelined loops with conditions
Milicev, D
Jovanovic, Z
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2002, 30 (03) : 149 - 179
[10] Improved spill code generation for software pipelined loops
Zalamea, J
Llosa, J
Ayguadé, E
Valero, M
ACM SIGPLAN NOTICES, 2000, 35 (05) : 134 - 144
