Enhancing the performance of 16-bit code using augmenting instructions

Cited by: 3
Authors
Krishnaswamy, A [1]
Gupta, R [1]
Affiliation
[1] Univ Arizona, Dept Comp Sci, Tucson, AZ 85721 USA
Keywords
algorithms; measurement; performance; embedded processor; 32-bit ARM ISA; 16-bit Thumb ISA; code size; AX instructions; instruction coalescing
DOI
10.1145/780731.780767
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
In the embedded domain, memory usage and energy consumption are critical constraints. Dual-width instruction set embedded processors such as the ARM provide a 16-bit instruction set in addition to the 32-bit instruction set to address these concerns. Using 16-bit instructions, one can achieve code size reduction and I-cache energy savings at the cost of performance. We have observed that throughout 16-bit Thumb code there exist Thumb instruction pairs that are equivalent to a single ARM instruction. We have developed an approach that uses a combination of compiler and architectural support to exploit this property to improve the performance of 16-bit code. We enhance the Thumb instruction set by incorporating Augmenting eXtensions (AX). The task of the compiler is to identify pairs of Thumb instructions that can be safely combined and executed as single ARM instructions. The compiler replaces such pairs of Thumb instructions with AX+Thumb instruction pairs. The AX instruction is coalesced with the immediately following Thumb instruction to generate a single ARM instruction at decode time. Thus, using AX instructions, the compiler can both generate compact 16-bit code and provide hardware with the information needed to produce better-performing 32-bit code.
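The pair-replacement idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `Insn` record, the `ax_shift` pseudo-opcode, the single shift-then-add pattern, and the `temp_dead` liveness input are all hypothetical. The sketch marks a Thumb shift followed by an add as an AX+Thumb pair only when the shifted temporary is not needed afterwards, then coalesces each pair into one ARM-style instruction, as decode hardware might.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Insn:
    """Toy instruction record (hypothetical, not a real Thumb encoding)."""
    op: str
    dst: str
    srcs: Tuple[str, ...] = ()
    imm: Optional[int] = None


def mark_ax_pairs(code: List[Insn], temp_dead: Set[int]) -> List[Insn]:
    """Compiler side: rewrite 'lsl t, s, #n; add d, a, t' as an AX+Thumb pair.

    temp_dead holds indices i where the result of code[i] is unused after
    code[i + 1]; in a real compiler this would come from liveness analysis.
    """
    out, i = [], 0
    while i < len(code):
        cur = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if (nxt is not None and cur.op == "lsl" and cur.imm is not None
                and len(cur.srcs) == 1 and nxt.op == "add"
                and len(nxt.srcs) == 2 and nxt.srcs[1] == cur.dst
                and nxt.srcs[0] != cur.dst
                and (i in temp_dead or nxt.dst == cur.dst)):
            # The AX instruction carries the shift amount; the Thumb add
            # now reads the unshifted source register directly.
            out.append(Insn("ax_shift", "", (), cur.imm))
            out.append(Insn("add", nxt.dst, (nxt.srcs[0], cur.srcs[0])))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


def fmt(insn: Insn) -> str:
    parts = [insn.dst, *insn.srcs]
    if insn.imm is not None:
        parts.append(f"#{insn.imm}")
    return f"{insn.op} " + ", ".join(parts)


def coalesce_at_decode(code: List[Insn]) -> List[str]:
    """Decode side: fuse each AX with the Thumb instruction that follows it."""
    out, it = [], iter(code)
    for insn in it:
        if insn.op == "ax_shift":
            nxt = next(it)  # an AX is always followed by its Thumb partner
            out.append(f"{fmt(nxt)}, lsl #{insn.imm}")
        else:
            out.append(fmt(insn))
    return out
```

For example, with `lsl r3, r1, #2` followed by `add r0, r0, r3` and `r3` dead afterwards, the pass emits an `ax_shift` marker plus a rewritten add, and the decode step yields the single ARM-style form `add r0, r0, r1, lsl #2`. If the temporary is still live, the pair is left unchanged.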
Pages: 254-264
Number of pages: 11