Multiloop Parallelisation Using Unrolling and Fission

Cited by: 3
Authors
Lam, Yuet Ming [1]
Coutinho, Jose Gabriel F. [2]
Ho, Chun Hok [2]
Leong, Philip Heng Wai [3]
Luk, Wayne [2]
Affiliations
[1] Macau Univ Sci & Technol, Fac Informat Technol, Taipa, Peoples R China
[2] Imperial Coll London, Dept Comp, London, England
[3] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW, Australia
Funding
Engineering and Physical Sciences Research Council (UK)
Keywords
All Open Access; Gold; Green
DOI
10.1155/2010/475620
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
A technique for parallelising multiple loops in a heterogeneous computing system is presented. Loops are first unrolled and then broken up into multiple tasks, which are mapped to reconfigurable hardware. A performance-driven optimisation is applied to find the best unrolling factor for each loop under hardware size constraints. The approach is demonstrated using three applications: speech recognition, image processing, and the N-Body problem. Experimental results show that a maximum speedup of 34 times over a 2.6 GHz microprocessor is achieved for the N-Body application on a 274 MHz FPGA, which is 4.1 times higher than that of an approach without unrolling.
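The idea summarised in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the cost model (loops run one after another, the tasks of each loop run in parallel, area grows linearly with the unrolling factor), the dictionary keys `trips`, `cycles`, and `area`, and the function names are all assumptions made for illustration, and the exhaustive search simply stands in for whatever optimisation the paper uses.

```python
from itertools import product
from math import ceil


def split_into_tasks(trip_count, unroll):
    """Unroll a loop by `unroll`, then fission it into `unroll` tasks.

    Task k receives iterations k, k + unroll, k + 2 * unroll, ...
    (assumes the iterations are independent of each other).
    """
    return [list(range(k, trip_count, unroll)) for k in range(unroll)]


def estimated_time(loops, factors):
    """Hypothetical timing model: loops execute in sequence, and the
    tasks of one loop run in parallel, so a loop costs
    ceil(trips / factor) * cycles."""
    return sum(ceil(l["trips"] / u) * l["cycles"] for l, u in zip(loops, factors))


def estimated_area(loops, factors):
    """Hypothetical area model: each task is one copy of the loop body."""
    return sum(l["area"] * u for l, u in zip(loops, factors))


def best_factors(loops, area_budget, max_unroll=8):
    """Exhaustively try every combination of unrolling factors and keep
    the fastest one that fits the area budget (ties go to smaller area)."""
    best = None
    for factors in product(range(1, max_unroll + 1), repeat=len(loops)):
        area = estimated_area(loops, factors)
        if area > area_budget:
            continue  # violates the hardware size constraint
        candidate = (estimated_time(loops, factors), area, factors)
        if best is None or candidate < best:
            best = candidate
    return best  # (time, area, factors), or None if nothing fits


if __name__ == "__main__":
    # Two made-up loops; the numbers are illustrative only.
    loops = [
        {"trips": 100, "cycles": 4, "area": 10},
        {"trips": 60, "cycles": 2, "area": 30},
    ]
    print(split_into_tasks(10, 4))
    print(best_factors(loops, area_budget=100))
```

Under this toy model, the search trades area between the two loops rather than maximising one factor, which is the kind of per-loop balance the abstract describes. An exhaustive search is only practical for a handful of loops and small factor ranges.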
Pages: 10