A Unified Hardware Architecture for Convolutions and Deconvolutions in CNN

Authors
Bai, Lin [1]
Lyu, Yecheng [1]
Huang, Xinming [1]
Affiliation
[1] Worcester Polytech Inst, Worcester, MA 01609 USA
Keywords
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Subject classification codes
0808; 0809
Abstract
Deconvolution plays an important role in state-of-the-art convolutional neural networks (CNNs) for tasks such as semantic segmentation and image super-resolution. This paper proposes a scalable neural network hardware architecture for image segmentation. Convolution and deconvolution operations share the same computing resources and are handled by the same processing element array. In addition, access to on-chip and off-chip memories is optimized to alleviate the burden introduced by partial sums. As an example, SegNet-Basic is implemented with the proposed unified architecture on a Xilinx ZC706 FPGA, achieving 151.5 GOPS for convolution and 94.3 GOPS for deconvolution. The unified convolution/deconvolution design is applicable to other CNNs that use deconvolution.
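The abstract does not say how the shared array maps deconvolution onto convolution hardware. As a minimal sketch of one textbook equivalence (not necessarily the authors' scheme), a strided deconvolution (transposed convolution) can be computed by inserting zeros between input pixels, padding the border, and running an ordinary stride-1 convolution with the kernel rotated by 180 degrees. The function names `conv2d`, `deconv2d_via_conv`, and `deconv2d_direct`, and the stride of 2, are illustrative assumptions.

```python
def conv2d(x, k):
    """Stride-1 'valid' multiply-accumulate over a 2-D input (no kernel flip)."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)]
            for i in range(oh)]


def deconv2d_via_conv(x, k, stride=2):
    """Transposed convolution that reuses conv2d (assumed mapping, for illustration)."""
    kh, kw = len(k), len(k[0])
    h, w = len(x), len(x[0])
    # Zero-inserted input: (stride - 1) zeros between pixels, (kernel - 1) border padding.
    zh = (h - 1) * stride + 1 + 2 * (kh - 1)
    zw = (w - 1) * stride + 1 + 2 * (kw - 1)
    z = [[0] * zw for _ in range(zh)]
    for i in range(h):
        for j in range(w):
            z[kh - 1 + i * stride][kw - 1 + j * stride] = x[i][j]
    # Rotate the kernel by 180 degrees so the MAC loop matches the scatter form.
    k_rot = [row[::-1] for row in k[::-1]]
    return conv2d(z, k_rot)


def deconv2d_direct(x, k, stride=2):
    """Reference scatter form: each input pixel adds a scaled kernel to the output."""
    kh, kw = len(k), len(k[0])
    h, w = len(x), len(x[0])
    out = [[0] * ((w - 1) * stride + kw) for _ in range((h - 1) * stride + kh)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    out[i * stride + a][j * stride + b] += x[i][j] * k[a][b]
    return out


x = [[1, 2], [3, 4]]
k = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
same = deconv2d_via_conv(x, k) == deconv2d_direct(x, k)
```

In this sketch both paths end in the same multiply-accumulate loop, which is the property a shared processing element array would rely on. The zero-insertion form also performs many multiplications by zero, one plausible reason deconvolution throughput can trail convolution throughput on the same hardware.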
Pages: 5