Avoiding Conversion and Rearrangement Overhead in SIMD Architectures

Cited by: 0

Authors:

Asadollah Shahbahrami

Ben Juurlink

Demid Borodin

Stamatis Vassiliadis

Affiliations:

[1] Delft University of Technology, Computer Engineering Laboratory, Faculty of Electrical Engineering, Mathematics, and Computer Science

[2] Guilan University, Department of Electrical and Computer Engineering, Faculty of Engineering

Source:

International Journal of Parallel Programming | 2006 / Vol. 34

Keywords:

Embedded media processors; multimedia kernels; register file; subword parallelism

DOI:

Not available

Abstract:

Single-Instruction Multiple-Data (SIMD) instructions provide an inexpensive way to exploit the data-level parallelism in multimedia applications. However, the performance improvement obtained by employing SIMD instructions is often limited because many overhead instructions are frequently required to bring data into a form amenable to SIMD processing. In this paper, we employ two techniques to overcome this limitation. The first technique, extended subwords, uses four extra bits for every byte in a media register. This allows many SIMD operations to be performed without overflow and avoids packing/unpacking conversion overhead. The second technique, the Matrix Register File (MRF), allows flexible row-wise as well as column-wise access to the register file. It is useful for many two-dimensional multimedia algorithms such as the (Inverse) Discrete Cosine Transform, the 2 × 2 Haar Transform, and pixel padding. In addition, we propose a few new media instructions. Experimental results obtained by extending the SimpleScalar toolset show that these techniques improve performance by up to a factor of 4.5 compared to a conventional SIMD instruction set extension.
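
The two ideas in the abstract can be illustrated with a small software model. This is a minimal sketch, not the authors' hardware design or instruction set: the function and class names (`add_sat_u8`, `add_ext`, `MatrixRegisterFile`) are hypothetical, lanes are assumed unsigned, and the only detail taken from the abstract is that each byte gains four extra bits (12 bits per subword) and that the register file can be read by row or by column.

```python
# Hypothetical software model; all names here are illustrative assumptions,
# not instructions or structures defined in the paper.

EXT_BITS = 8 + 4                 # one byte plus four extra bits (from the abstract)
EXT_MASK = (1 << EXT_BITS) - 1   # 0xFFF: largest value a 12-bit lane can hold


def add_sat_u8(a, b):
    """Conventional packed add on unsigned 8-bit lanes: clamps at 255."""
    return [min(x + y, 255) for x, y in zip(a, b)]


def add_ext(a, b):
    """Packed add on 12-bit extended subwords: exact while sums fit in 12 bits."""
    return [(x + y) & EXT_MASK for x, y in zip(a, b)]


class MatrixRegisterFile:
    """Toy register file whose contents can be read row-wise or column-wise."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def read_row(self, i):
        return list(self.rows[i])

    def read_col(self, j):
        # A column read replaces an explicit transpose/shuffle sequence.
        return [r[j] for r in self.rows]


pixels_a = [200, 100, 255, 17]
pixels_b = [100, 100, 255, 3]

sat = add_sat_u8(pixels_a, pixels_b)  # 8-bit lanes lose information on overflow
ext = add_ext(pixels_a, pixels_b)     # 12-bit lanes keep the exact sums here

mrf = MatrixRegisterFile([[1, 2], [3, 4]])
col0 = mrf.read_col(0)
row1 = mrf.read_row(1)
```

In this model, the 8-bit add clamps 200 + 100 to 255, whereas the 12-bit lanes hold 300 exactly, which is the kind of headroom that would otherwise require unpacking to wider subwords first. The column read stands in for the rearrangement instructions a row-only register file would need.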

Pages: 237-260

Page count: 23
