Exploiting Parallelism of Imperfect Nested Loops on Coarse-Grained Reconfigurable Architectures

Cited by: 7
Authors
Yin, Shouyi [1]
Lin, Xinhan [1]
Liu, Leibo [2]
Wei, Shaojun [1]
Affiliations
[1] Tsinghua Univ, Inst Microelect, Beijing, Peoples R China
[2] Tsinghua Univ, Inst Microelect, Natl Lab Informat Sci & Technol, Beijing, Peoples R China
Keywords
CGRA; software pipelining; imperfect nested loop; sibling inner loops; outer-level pipelining; kernel compression;
DOI
10.1109/TPDS.2016.2531678
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Coarse-grained reconfigurable architecture (CGRA) is a promising parallel computing platform that provides high performance, high power efficiency, and flexibility. However, for imperfect nested loops, existing loop mapping methods often result in low execution performance and poor hardware utilization. To tackle this problem, this paper makes three contributions: 1) a highly effective and general approach to mapping imperfect nested loops onto CGRAs; 2) a global optimization strategy that searches for the optimal initiation intervals (IIs); and 3) a powerful kernel compression method that reduces oversized kernels. Experimental results show that our approach reduces the total computing latency by 20.5, 58.5, and 73.2 percent compared with state-of-the-art approaches on 2 × 2, 4 × 4, and 8 × 8 CGRAs, respectively. Moreover, the compilation time and configuration context size are acceptable in practice.
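For readers unfamiliar with the term, the following is a minimal illustrative sketch of the loop shape the abstract refers to: an imperfect nested loop whose outer body contains statements outside the inner loops, plus two sibling inner loops. It is not taken from the paper; the function name `imperfect_nest` and the arithmetic are hypothetical and chosen only to make the structure visible.

```python
def imperfect_nest(a, b):
    """Hypothetical example of an imperfect nested loop with sibling inner loops.

    A "perfect" nest keeps every statement inside the innermost loop.
    Here the outer body also holds its own statements, and it contains
    two inner loops at the same nesting level (sibling inner loops).
    """
    out = []
    for i in range(len(a)):
        acc = 0                       # outer-level statement before the inner loops
        for j in range(len(a[i])):    # first inner loop
            acc += a[i][j]
        scale = acc + 1               # outer-level statement between the siblings
        prod = 0
        for k in range(len(b[i])):    # second (sibling) inner loop
            prod += b[i][k] * scale
        out.append(prod)              # outer-level statement after the inner loops
    return out
```

The outer-level statements and the dependence of the second inner loop on the result of the first are what prevent such a nest from being treated as a single innermost loop.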
Pages: 3199-3213
Number of pages: 15