An Energy-Efficient GAN Accelerator With On-Chip Training for Domain-Specific Optimization

Cited by: 3
Authors
Kim, Soyeon [1]
Kang, Sanghoon [1]
Han, Donghyeon [1]
Kim, Sangjin [1]
Kim, Sangyeob [1]
Yoo, Hoi-Jun [1]
Affiliation
[1] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon 34141, South Korea
Keywords
Deep learning; generative adversarial network (GAN); instance normalization (IN); local learning
DOI
10.1109/JSSC.2021.3094469
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
Generative adversarial networks (GANs) consist of multiple deep neural networks cooperating and competing with each other. Due to their complex architectures and large feature map sizes, training GANs requires a huge amount of computation. Moreover, instance normalization (IN) layers in GANs dramatically increase the external memory access (EMA). However, retraining GANs with user-specific data is critical on mobile devices because the pre-trained model outputs distorted images under user-specific conditions. This article proposes a GAN training accelerator to enable energy-efficient domain-specific optimization of GANs with the user's local data. Selective layer retraining (SELRET) picks out layers that are effective in enhancing the quality of the retrained model. Without image quality degradation, SELRET reduces the required computation by 69%. Moreover, reordering layers for instance normalization (ROLIN) is proposed to reduce the EMA of intermediate data. Through the implementation of the proposed architecture, which splits and reorders the IN layers, overall EMA is reduced by 38.7% in the forward propagation (FP) stage and by 32.2% in the error propagation (EP) stage. The proposed processor is fabricated in a 65-nm CMOS process, showing 0.38-TFLOPS/W energy efficiency. The chip can retrain a face modification GAN with a custom dataset of 256 × 256 images over 100 epochs in under 30 s while consuming only 274 mW. Compared to the previous FPGA implementation, this work improves the retraining performance and energy efficiency by 2x and 39x, respectively. As a result, the proposed accelerator enables domain-specific optimization of GANs on a mobile platform.
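For readers unfamiliar with instance normalization, the following is a minimal sketch of the standard textbook definition, not the accelerator's ROLIN scheme. It illustrates that the per-channel mean and variance depend on every element of a feature map, so normalization can only start after a full pass over that map. The function name `instance_norm`, the `eps` value, and the toy 2 × 2 input are assumptions made for illustration.

```python
import math

def instance_norm(feature_map, eps=1e-5):
    """Normalize one channel of one image (a 2-D list of floats).

    Standard instance normalization without learned scale/shift;
    `eps` is an assumed small constant for numerical stability.
    """
    values = [v for row in feature_map for v in row]
    n = len(values)
    # Pass 1: the statistics need every element of the channel.
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    # Pass 2: only now can each element be normalized.
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in feature_map]

# Hypothetical toy channel; real feature maps are far larger.
channel = [[1.0, 2.0], [3.0, 4.0]]
normalized = instance_norm(channel)
```

In this sketch the whole channel is read once for the statistics and again for the normalization, which hints at why large intermediate feature maps can become costly when they do not fit in on-chip memory.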
Pages: 2968-2980
Page count: 13
Related papers
50 in total
  • [41] Hybrid, Asymmetric and Reconfigurable Input Unit Designs for Energy-Efficient On-Chip Networks
    Liu, Xiaoman
    Gao, Yujie
    He, Yuan
    Yue, Xiaohan
    Jiang, Haiyan
    Wang, Xibo
    [J]. IEICE TRANSACTIONS ON ELECTRONICS, 2023, E106C (10) : 570 - 579
  • [42] An Energy-Efficient Virtual Channel Power-Gating Mechanism for On-Chip Networks
    Mirhosseini, Amirhossein
    Sadrosadati, Mohammad
    Fakhrzadehgan, Ali
    Modarressi, Mehdi
    Sarbazi-Azad, Hamid
    [J]. 2015 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2015, : 1527 - 1532
  • [43] Designing Energy-Efficient Low-Diameter On-chip Networks with Equalized Interconnects
    Joshi, Ajay
    Kim, Byungsub
    Stojanovic, Vladimir
    [J]. 2009 17TH IEEE SYMPOSIUM ON HIGH-PERFORMANCE INTERCONNECTS (HOTI 2009), 2009, : 3 - 12
  • [44] Local Traffic-Based Energy-Efficient Hybrid Switching for On-Chip Networks
    He, Yuan
    Jiao, Jinyu
    Kondo, Masaaki
    [J]. 2021 29TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING (PDP 2021), 2021, : 198 - 206
  • [45] A case for hierarchical rings with deflection routing: An energy-efficient on-chip communication substrate
    Ausavarungnirun, Rachata
    Fallin, Chris
    Yu, Xiangyao
    Chang, Kevin Kai-Wei
    Nazario, Greg
    Das, Reetuparna
    Loh, Gabriel H.
    Mutlu, Onur
    [J]. PARALLEL COMPUTING, 2016, 54 : 29 - 45
  • [46] Self-Calibrated Energy-Efficient and Reliable Channels for On-Chip Interconnection Networks
    Huang, Po-Tsang
    Hwang, Wei
    [J]. JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING, 2012, 2012
  • [47] INDENT: Incremental Online Decision Tree Training for Domain-Specific Systems-on-Chip
    Krishnakumar, Anish
    Marculescu, Radu
    Ogras, Umit
    [J]. 2022 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2022,
  • [48] Guaranteed optimization for domain-specific programming
    Veldhuizen, TL
    [J]. DOMAIN-SPECIFIC PROGRAM GENERATION, 2003, 3016 : 307 - 324
  • [49] Domain-Specific Quantum Architecture Optimization
    Lin, Wan-Hsuan
    Tan, Bochen
    Niu, Murphy Yuezhen
    Kimko, Jason
    Cong, Jason
    [J]. IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2022, 12 (03) : 624 - 637
  • [50] Domain-specific optimization in automata learning
    Hungar, H
    Niese, O
    Steffen, B
    [J]. COMPUTER AIDED VERIFICATION, 2003, 2725 : 315 - 327