HDSuper: Algorithm-Hardware Co-design for Light-weight High-quality Super-Resolution Accelerator

Cited by: 1
Authors
Chang, Liang [1]
Zhao, Xin [1]
Fan, Dongqi [1]
Hu, Zhicheng [1]
Zhou, Jun [1]
Affiliation
[1] Univ Elect Sci & Technol China, Chengdu 611731, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Super-Resolution; Co-design; Efficient Mapping; High-quality Image; FPGA; NEURAL-NETWORK
DOI
10.1109/DAC56929.2023.10247683
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Super-resolution (SR) networks have been gradually applied to embedded devices with good-quality image reconstruction. However, the hardware performance and power efficiency are limited by a large number of algorithm parameters, computation complexity, and hardware resources, obstructing the development of a high-quality SR accelerator. This paper proposes an end-to-end platform with a lightweight super-resolution network (LSR) and an efficient, high-quality super-resolution architecture HDSuper, to perform algorithm-hardware co-design for the SR accelerator. For algorithm design, we employ depth-wise separable convolution and pixelshuffle to reduce network size and computation complexity by considering the hardware constraints. For hardware design, we provide a unified computing core (UCC) combined with an efficient flattening-and-allocation (F-A) mapping strategy to support various operators with high computational utilization. We adopt the patch training method to reduce the external memory access of the hardware architecture. Based on the evaluation, the proposed algorithm achieves high-quality image reconstruction with 37.44dB PSNR. Finally, we implement the image reconstruction in FPGA demonstration, achieving high-quality image reconstruction with 2.08W power consumption under the lowest hardware resources compared to the state-of-the-art works.
Pages: 6
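
The abstract credits depth-wise separable convolution and pixelshuffle with reducing network size and computation. The sketch below illustrates that general idea only; it is not the paper's LSR network. The channel counts (64), the kernel size (3), and the function names are assumptions chosen for illustration, bias terms are ignored, and the pixel-shuffle channel ordering follows the common sub-pixel convention rather than anything stated in the record.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dw_separable_params(c_in, c_out, k):
    """Weight count of a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Assumed ordering: input channel c*r*r + i*r + j supplies the output
    pixel at row offset i and column offset j inside each r x r block.
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    standard = conv_params(64, 64, 3)
    separable = dw_separable_params(64, 64, 3)
    print(standard, separable, round(standard / separable, 2))

    # Four 1x1 channels become one 2x2 channel for upscale factor r = 2.
    print(pixel_shuffle([[[0]], [[1]], [[2]], [[3]]], 2))
```

For these assumed sizes the separable layer needs roughly one eighth of the weights of the standard layer, and pixel shuffle performs the upscaling as a pure data rearrangement with no arithmetic.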