An Energy-Efficient GAN Accelerator With On-Chip Training for Domain-Specific Optimization

Cited by: 3
Authors
Kim, Soyeon [1]
Kang, Sanghoon [1]
Han, Donghyeon [1]
Kim, Sangjin [1]
Kim, Sangyeob [1]
Yoo, Hoi-Jun [1]
Affiliations
[1] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon 34141, South Korea
Keywords
Deep learning; generative adversarial network (GAN); instance normalization (IN); local learning;
DOI
10.1109/JSSC.2021.3094469
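Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract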
Generative adversarial networks (GANs) consist of multiple deep neural networks that cooperate and compete with each other. Because of their complex architectures and large feature maps, training GANs requires a huge amount of computation. Moreover, instance normalization (IN) layers in GANs dramatically increase external memory access (EMA). Nevertheless, retraining GANs with user-specific data is critical on mobile devices, because a pre-trained model outputs distorted images under user-specific conditions. This article proposes a GAN training accelerator that enables energy-efficient, domain-specific optimization of a GAN with the user's local data. Selective layer retraining (SELRET) picks out the layers that are effective in enhancing the quality of the retrained model, reducing the required computation by 69% without degrading image quality. In addition, reordering layers for instance normalization (ROLIN) is proposed to reduce the EMA of intermediate data. The proposed architecture, which splits and reorders the IN layers, reduces overall EMA by 38.7% in the forward propagation (FP) stage and by 32.2% in the error propagation (EP) stage. The processor is fabricated in a 65-nm CMOS process and shows an energy efficiency of 0.38 TFLOPS/W. The chip can retrain a face modification GAN with a custom dataset of 256 × 256 images over 100 epochs in under 30 s while consuming only 274 mW. Compared with the previous FPGA implementation, this work improves retraining performance by 2× and energy efficiency by 39×. As a result, the proposed accelerator enables domain-specific optimization of GANs on a mobile platform.
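As background for the abstract's remark about instance normalization, the sketch below shows the standard IN computation for a single channel of a single image. It is only an illustration of the general technique, not the paper's ROLIN scheme or hardware dataflow; the function name `instance_norm`, the `eps` value, and the omission of learnable scale and shift parameters are assumptions made here for brevity. It shows that the mean and variance depend on every value in the feature map, so the map has to be read once for the statistics and again to normalize, which is one plausible reason large feature maps lead to extra memory traffic.

```python
import math

def instance_norm(feature_map, eps=1e-5):
    """Normalize one channel of one image, given as a 2-D list of floats.

    Hypothetical helper for illustration; no learnable scale/shift is applied.
    """
    values = [v for row in feature_map for v in row]
    n = len(values)

    # Pass 1: statistics over the whole spatial map.
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    inv_std = 1.0 / math.sqrt(var + eps)

    # Pass 2: every value is read again to produce the normalized output.
    return [[(v - mean) * inv_std for v in row] for row in feature_map]

channel = [[1.0, 2.0], [3.0, 4.0]]
normalized = instance_norm(channel)
```

After the call, the values in `normalized` have approximately zero mean and unit variance across the channel.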
Pages: 2968-2980
Page count: 13