HDSuper: Algorithm-Hardware Co-design for Light-weight High-quality Super-Resolution Accelerator

Cited by: 1

Authors:

Chang, Liang [1]

Zhao, Xin [1]

Fan, Dongqi [1]

Hu, Zhicheng [1]

Zhou, Jun [1]

Affiliation:

[1] Univ Elect Sci & Technol China, Chengdu 611731, Peoples R China

Source:

2023 60TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC | 2023

Funding:

National Natural Science Foundation of China

Keywords:

Super-Resolution; Co-design; Efficient Mapping; High-quality Image; FPGA; NEURAL-NETWORK

DOI:

10.1109/DAC56929.2023.10247683

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Super-resolution (SR) networks are increasingly deployed on embedded devices, where they deliver good-quality image reconstruction. However, hardware performance and power efficiency are limited by the large number of algorithm parameters, high computational complexity, and constrained hardware resources, which hinders the development of high-quality SR accelerators. This paper proposes an end-to-end platform that combines a lightweight super-resolution network (LSR) with HDSuper, an efficient, high-quality super-resolution architecture, to perform algorithm-hardware co-design for the SR accelerator. On the algorithm side, we employ depth-wise separable convolution and pixel shuffle to reduce network size and computational complexity while accounting for hardware constraints. On the hardware side, we provide a unified computing core (UCC) combined with an efficient flattening-and-allocation (F-A) mapping strategy to support various operators with high computational utilization. We also adopt a patch training method to reduce the external memory access of the hardware architecture. In our evaluation, the proposed algorithm achieves high-quality image reconstruction with a PSNR of 37.44 dB. Finally, we demonstrate image reconstruction on an FPGA, achieving high-quality reconstruction at 2.08 W power consumption with the lowest hardware resource usage among the state-of-the-art works compared.
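
The abstract names depth-wise separable convolution, pixel shuffle, and PSNR without giving details. The sketch below is only an illustration of those three generic building blocks in plain Python, not the authors' LSR network or the HDSuper mapping. The function names are hypothetical, and the example layer size (64 input channels, 64 output channels, 3 x 3 kernel) is an assumption chosen for illustration; the record does not state the paper's layer shapes. The pixel-shuffle channel ordering follows the common convention in which output pixel (y, x) of channel c reads input channel c*r*r + (y mod r)*r + (x mod r).

```python
import math


def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def dw_separable_params(c_in, c_out, k):
    """Weight count of a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    return c_in * k * k + c_in * c_out


def pixel_shuffle(x, r):
    """Rearrange a nested list shaped [C*r*r][H][W] into [C][H*r][W*r]."""
    c_total, h, w = len(x), len(x[0]), len(x[0][0])
    assert c_total % (r * r) == 0, "channel count must be divisible by r*r"
    c = c_total // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c)]
    for ch in range(c):
        for y in range(h * r):
            for col in range(w * r):
                # Sub-pixel offset (y % r, col % r) selects the source channel.
                src = ch * r * r + (y % r) * r + (col % r)
                out[ch][y][col] = x[src][y // r][col // r]
    return out


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    # Illustrative layer size only (assumed, not taken from the paper).
    std = conv_params(64, 64, 3)
    sep = dw_separable_params(64, 64, 3)
    print(std, sep, round(std / sep, 2))
    # Four 1 x 1 channels become one 2 x 2 channel with upscale factor 2.
    print(pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2))
```

For the assumed 64-to-64-channel 3 x 3 layer, the separable form needs 4,672 weights instead of 36,864, which shows why this substitution shrinks a network; pixel shuffle then trades channels for spatial resolution without any additional weights.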

Pages: 6
