CANNA: Neural Network Acceleration using Configurable Approximation on GPGPU

Cited by: 0
|
Authors
Imani, Mohsen [1 ]
Masich, Max [1 ]
Peroni, Daniel [1 ]
Wang, Pushen [1 ]
Rosing, Tajana [1 ]
Affiliations
[1] Univ Calif San Diego, CSE Dept, La Jolla, CA 92093 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Neural networks have been successfully used in many applications, but their computational complexity makes them difficult to implement on embedded devices. Neural networks are inherently approximate and thus can be simplified. In this paper, CANNA proposes a gradual training approximation which adaptively sets the level of hardware approximation depending on the neural network's internal error, instead of applying uniform hardware approximation. To accelerate inference, CANNA's layer-based approximation approach selectively relaxes the computation in each layer of the neural network as a function of its sensitivity to approximation. For hardware support, we use a configurable floating point unit that dynamically identifies the inputs which produce the largest approximation error and processes them in precise mode instead. We evaluate the accuracy and efficiency of our design by integrating configurable FPUs into AMD's Southern Islands GPU architecture. Our experimental evaluation shows that CANNA achieves up to 4.84x (7.13x) energy savings and a 3.22x (4.64x) speedup when training four different neural network applications with 0% (2%) quality loss compared to the implementation on the baseline GPU. During the inference phase, our layer-based approach improves energy efficiency by 4.42x (6.06x) and yields a 2.96x (3.98x) speedup while ensuring 0% (2%) quality loss.
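The layer-based idea in the abstract can be illustrated with a small hypothetical sketch. Everything here is an assumption, not the paper's implementation: mantissa truncation stands in for the configurable approximate FPU, a tiny ReLU MLP stands in for the network, and a layer's "sensitivity" is estimated as the relative output error introduced when that layer alone runs approximately. The function and parameter names (`truncate_mantissa`, `choose_levels`, `budget`) are invented for illustration.

```python
# Hypothetical sketch of layer-wise approximation selection (not the paper's code).
# Assumption: zeroing low float32 mantissa bits models an approximate FPU mode.
import numpy as np

def truncate_mantissa(x, kept_bits):
    """Approximate float32 values by zeroing all but `kept_bits` mantissa bits
    (kept_bits=23 means exact float32)."""
    xi = np.asarray(x, dtype=np.float32).view(np.uint32)
    mask = np.uint32(0xFFFFFFFF) << np.uint32(23 - kept_bits)
    return (xi & mask).view(np.float32)

def forward(x, weights, approx_bits):
    """Tiny ReLU MLP forward pass; approx_bits[i] is the mantissa precision
    used for layer i's matrix product."""
    h = x
    for w, bits in zip(weights, approx_bits):
        h = np.maximum(truncate_mantissa(h @ w, bits), 0.0)
    return h

def choose_levels(x, weights, levels, budget):
    """For each layer, pick the coarsest approximation level whose solo use
    keeps the relative output error within `budget` (a sensitivity test)."""
    exact = forward(x, weights, [23] * len(weights))
    chosen = []
    for i in range(len(weights)):
        pick = 23  # fall back to precise mode for very sensitive layers
        for bits in sorted(levels):  # fewest mantissa bits (coarsest) first
            cfg = [23] * len(weights)
            cfg[i] = bits
            out = forward(x, weights, cfg)
            err = np.linalg.norm(out - exact) / (np.linalg.norm(exact) + 1e-12)
            if err <= budget:
                pick = bits
                break
        chosen.append(pick)
    return chosen
```

Under this toy model, early or robust layers tend to tolerate coarse mantissa precision while sensitive layers are kept exact, mirroring the abstract's claim that approximation should be set per layer rather than uniformly.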
Pages: 682-689
Page count: 8