Enhancing the performance of 16-bit code using augmenting instructions

Cited by: 3

Authors:

Krishnaswamy, A [1]

Gupta, R [1]

Affiliations:

[1] Univ Arizona, Dept Comp Sci, Tucson, AZ 85721 USA

Source:

ACM SIGPLAN NOTICES | 2003 / Vol. 38 / No. 07

Keywords:

algorithms; measurement; performance; embedded processor; 32-bit ARM ISA; 16-bit Thumb ISA; code size; AX instructions; instruction coalescing

DOI:

10.1145/780731.780767

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification code:

081202; 0835

Abstract:

In the embedded domain, memory usage and energy consumption are critical constraints. Dual-width instruction set embedded processors such as the ARM provide a 16-bit instruction set in addition to the 32-bit instruction set to address these concerns. Using 16-bit instructions, one can achieve code size reduction and I-cache energy savings at the cost of performance. We have observed that throughout 16-bit Thumb code there exist Thumb instruction pairs that are equivalent to a single ARM instruction. We have developed an approach that uses a combination of compiler and architectural support to exploit this property to improve the performance of 16-bit code. We enhance the Thumb instruction set by incorporating Augmenting eXtensions (AX). The task of the compiler is to identify pairs of Thumb instructions that can be safely combined and executed as single ARM instructions. The compiler replaces such pairs of Thumb instructions with AX+Thumb instruction pairs. The AX instruction is coalesced with the immediately following Thumb instruction to generate a single ARM instruction at decode time. Thus, using AX instructions, the compiler can both generate compact 16-bit code and provide hardware with the information needed to produce better-performing 32-bit code.
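
The pairing and coalescing described in the abstract can be illustrated with a toy model. The sketch below is not the authors' implementation: the instruction representation (`Insn`), the `ax_shift` mnemonic, and the single rewrite pattern (a left shift feeding an add, folded into one add with a shifted operand) are all hypothetical choices made for illustration, and condition flags are ignored. It shows a compiler-side pass that rewrites a Thumb pair into an AX+Thumb pair, a decode-side step that coalesces the AX with the next instruction, and a tiny interpreter to check that the results match.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

MASK = 0xFFFFFFFF  # registers are modeled as 32-bit values


class Insn(NamedTuple):
    """Toy instruction record (hypothetical, not a real encoding)."""
    op: str
    dst: Optional[str] = None
    srcs: Tuple[str, ...] = ()
    imm: Optional[int] = None
    shift: Optional[int] = None  # left shift applied to the last source


def insert_ax(code: List[Insn]) -> List[Insn]:
    """Compiler side: rewrite `lsl t, m, #k ; add t, n, t` (n != t)
    as `ax_shift #k ; add t, n, m`. The code stays two 16-bit slots."""
    out: List[Insn] = []
    i = 0
    while i < len(code):
        a = code[i]
        b = code[i + 1] if i + 1 < len(code) else None
        safe = (
            b is not None and a.op == "lsl" and b.op == "add"
            and b.dst == a.dst and len(b.srcs) == 2
            and b.srcs[1] == a.dst and b.srcs[0] != a.dst
        )
        if safe:
            out.append(Insn("ax_shift", imm=a.imm))
            out.append(Insn("add", b.dst, (b.srcs[0], a.srcs[0])))
            i += 2
        else:
            out.append(a)
            i += 1
    return out


def decode(code: List[Insn]) -> List[Insn]:
    """Decode side: fold each ax_shift into the instruction that
    immediately follows it, yielding one ARM-like instruction."""
    out: List[Insn] = []
    pending: Optional[int] = None
    for insn in code:
        if insn.op == "ax_shift":
            pending = insn.imm
            continue
        if pending is not None:
            insn = insn._replace(shift=pending)
            pending = None
        out.append(insn)
    return out


def run(code: List[Insn], regs: Dict[str, int]) -> Dict[str, int]:
    """Minimal interpreter for the two opcodes used in this sketch."""
    regs = dict(regs)
    for insn in code:
        if insn.op == "lsl":
            regs[insn.dst] = (regs[insn.srcs[0]] << insn.imm) & MASK
        elif insn.op == "add":
            lhs = regs[insn.srcs[0]]
            rhs = regs[insn.srcs[1]]
            if insn.shift is not None:
                rhs = (rhs << insn.shift) & MASK
            regs[insn.dst] = (lhs + rhs) & MASK
        else:
            raise ValueError(f"unsupported opcode: {insn.op}")
    return regs


thumb = [Insn("lsl", "r2", ("r1",), 2), Insn("add", "r2", ("r0", "r2"))]
ax_code = insert_ax(thumb)   # still two 16-bit instructions
arm_like = decode(ax_code)   # one instruction after coalescing
start = {"r0": 10, "r1": 3, "r2": 0}
same = run(thumb, start) == run(arm_like, start)
```

In this model the rewritten code occupies the same two 16-bit slots as the original pair, while the decoder issues a single operation. The safety condition here is deliberately narrow (the shifted temporary is overwritten by the add), which avoids needing a liveness analysis in the sketch.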

Pages: 254-264

Page count: 11

