Exploiting Parallelism of Imperfect Nested Loops on Coarse-Grained Reconfigurable Architectures

Cited by: 7
Authors
Yin, Shouyi [1]
Lin, Xinhan [1]
Liu, Leibo [2]
Wei, Shaojun [1]
Affiliations
[1] Tsinghua Univ, Inst Microelect, Beijing, Peoples R China
[2] Tsinghua Univ, Inst Microelect, Natl Lab Informat Sci & Technol, Beijing, Peoples R China
Keywords
CGRA; software pipelining; imperfect nested loop; sibling inner loops; outer-level pipelining; kernel compression
DOI
10.1109/TPDS.2016.2531678
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Coarse-grained reconfigurable architecture (CGRA) is a promising parallel computing platform that provides high performance, high power efficiency, and flexibility. However, for imperfect nested loops, existing loop mapping methods often result in low execution performance and poor hardware utilization. To tackle this problem, this paper makes three contributions: 1) a highly effective and general approach to mapping imperfect loops onto CGRAs; 2) a global optimization strategy to search for the optimal initiation intervals (IIs); 3) a powerful kernel compression method to reduce oversized kernels. Experimental results show that our approach reduces the total computing latency by 20.5, 58.5, and 73.2 percent compared to state-of-the-art approaches on 2 × 2, 4 × 4, and 8 × 8 CGRAs, respectively. Moreover, the compilation time and configuration context size are acceptable in practice.
Pages: 3199-3213
Page count: 15