Scalar Processing Overhead on SIMD-Only Architectures

Authors
Azevedo, Arnaldo [1]
Juurlink, Ben [1]
Affiliations
[1] Delft Univ Technol, Fac Elect Engn Math & Comp Sci, Comp Engn Grp, Delft, Netherlands
Keywords
Computer architecture; Datapath; SIMD processing; SIMD overhead
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The Cell processor consists of a general-purpose core and eight cores with a complete SIMD instruction set. Although originally designed for multimedia and gaming, it is currently being used for a much broader range of applications. In this paper we evaluate whether the Cell SPEs could benefit significantly from a scalar processing unit, using two methodologies. In the first methodology, the scalar processing overhead is eliminated by replacing all scalar data types with the quadword data type. This methodology is feasible only for relatively small kernels. In the second methodology, SPE performance is compared with the performance of a similarly configured PPU, which supports scalar operations. Experimental results show that the scalar processing overhead ranges from 19% to 57% for small kernels and from 12% to 39% for large kernels. Solutions to eliminate this overhead are also discussed.
Pages: 183-190
Page count: 8