Improving Energy Efficiency of CGRAs with Low-Overhead Fine-Grained Power Domains

Cited by: 1
Authors
Nayak, Ankita [1 ]
Zhang, Keyi [1 ]
Setaluri, Rajsekhar [1 ]
Carsello, Alex [1 ]
Mann, Makai [1 ]
Torng, Christopher [1 ]
Richardson, Stephen [1 ]
Bahr, Rick [1 ]
Hanrahan, Pat [1 ]
Horowitz, Mark [1 ]
Raina, Priyanka [1 ]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
Keywords
Reconfigurable computing; coarse-grained reconfigurable arrays; power domains; hardware generators; ACCELERATOR;
DOI
10.1145/3558394
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
To effectively minimize static power for a wide range of applications, power domains for coarse-grained reconfigurable array (CGRA) architectures need to be more fine-grained than those found in a typical application-specific integrated circuit. However, the special isolation logic needed to ensure electrical protection between off and on domains makes fine-grained power domains area- and timing-inefficient. We propose a novel design of the CGRA routing fabric that reduces the area overhead of power domain boundary protection from around 9% to less than 1% without incurring any extra timing delay from the isolation cells. The conventional Unified Power Format-based flow for power domain boundary protection does not support this design choice. Therefore, we create our own compiler-like passes that iteratively introduce the needed design changes, and formally verify the transformations using methods based on satisfiability modulo theories. These passes also let us optimize how we handle test and debug signals through the off tiles in the CGRA. Using our framework, we add power domains to a CGRA that we designed and taped out. The CGRA has 32x16 processing element and memory tiles and 4-MB secondary memory. We address the implementation challenges encountered due to the introduction of fine-grained power domains, including the addressing of the CGRA tiles, the power grid design, well substrate connections, and distribution of global signals. Our CGRA achieves up to 83% reduction in leakage power and 26% reduction in total power versus an identical CGRA without multiple power domains, for a range of image processing and machine learning applications.
Pages: 28
Related papers
50 in total
  • [1] A Framework for Adding Low-Overhead, Fine-Grained Power Domains to CGRAs
    Nayak, Ankita
    Zhang, Keyi
    Setaluri, Raj
    Carsello, Alex
    Mann, Makai
    Richardson, Stephen
    Bahr, Rick
    Hanrahan, Pat
    Horowitz, Mark
    Raina, Priyanka
    [J]. PROCEEDINGS OF THE 2020 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2020), 2020, : 846 - 851
  • [2] GMProf: A Low-Overhead, Fine-Grained Profiling Approach for GPU Programs
    Zheng, Mai
    Ravi, Vignesh T.
    Ma, Wenjing
    Qin, Feng
    Agrawal, Gagan
    [J]. 2012 19TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2012,
  • [3] A Low-overhead Scheduling Methodology for Fine-grained Acceleration of Signal Processing Systems
    Boutellier, Jani
    Bhattacharyya, Shuvra S.
    Silven, Olli
    [J]. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2010, 60 (03): : 333 - 343
  • [4] A Low-overhead Scheduling Methodology for Fine-grained Acceleration of Signal Processing Systems
    Jani Boutellier
    Shuvra S. Bhattacharyya
    Olli Silvén
    [J]. Journal of Signal Processing Systems, 2010, 60 : 333 - 343
  • [5] Nonlinear Code-Based Low-Overhead Fine-Grained Control Flow Checking
    Dar, Gilad
    Di Natale, Giorgio
    Keren, Osnat
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2022, 71 (03) : 658 - 669
  • [6] Low-overhead run-time scheduling for fine-grained acceleration of signal processing systems
    Boutellier, Jani
    Bhattacharyya, Shuvra S.
    Silven, Olli
    [J]. 2007 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS, VOLS 1 AND 2, 2007, : 457 - +
  • [7] Fine-Grained FSMD Power Gating Considering Power Overhead
    Shin, Chi-Hoon
    Oh, Myeong-Hoon
    Sim, Jae-Woo
    Jeong, Jae-Chan
    Kim, Seong Woon
    [J]. ETRI JOURNAL, 2011, 33 (03) : 466 - 469
  • [8] Research on Low-overhead Dual-output XOR Gate True Random Number Generator Utilizing Fine-grained Sampling
    Yao L.
    Huang Z.
    Liang H.
    Lu Y.
    [J]. Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2023, 45 (09): : 3295 - 3301
  • [9] Performance modeling for MPI applications with low overhead fine-grained profiling
    Lu, Gangzhao
    Zhang, Weizhe
    He, Hui
    Yang, Laurence T.
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2019, 90 : 317 - 326
  • [10] Disjointness Domains for Fine-Grained Aliasing
    Brandauer, Stephan
    Clarke, Dave
    Wrigstad, Tobias
    [J]. ACM SIGPLAN NOTICES, 2015, 50 (10) : 898 - 916