DoubleQExt: Hardware and Memory Efficient CNN Through Two Levels of Quantization

Cited by: 0
Authors
See, Jin-Chuan [1 ]
Ng, Hui-Fuang [1 ]
Tan, Hung-Khoon [1 ]
Chang, Jing-Jing [1 ]
Lee, Wai-Kong [2 ]
Hwang, Seong Oun [2 ]
Affiliations
[1] Faculty of Information and Communication Technology (FICT), Universiti Tunku Abdul Rahman, Kampar, 31900, Malaysia
[2] Department of Computer Engineering, Gachon University, Seongnam, 13120, Republic of Korea
Keywords
Convolutional neural network - Deep learning - Hardware - Memory consumption - Memory storage - Memory management - Power-of-two - Quantization (signal)
DOI: Not available
Abstract
To fulfil the tight area and memory constraints in IoT applications, the design of efficient Convolutional Neural Network (CNN) hardware becomes crucial. Quantization is one of the promising approaches that compresses a large CNN into a much smaller one, making it well suited for IoT applications. Among the various quantization schemes, power-of-two (PoT) quantization enables efficient hardware implementation and low memory consumption for CNN accelerators, but requires retraining of the CNN to retain its accuracy. This paper proposes a two-level post-training static quantization technique (DoubleQ) that combines 8-bit and PoT weight quantization. The CNN weights are first quantized to 8-bit (level one) and then further quantized to PoT (level two). Expressing the weights in their PoT exponent form allows multiplications to be carried out with shifters. DoubleQ also reduces the memory storage requirement of the CNN, as only the exponents of the weights need to be stored. However, DoubleQ trades network accuracy for the reduced memory storage. To recover the accuracy, a selection process (DoubleQExt) is proposed that strategically selects the less informative layers of the network to be quantized with PoT at the second level. On ResNet-20, the proposed DoubleQ reduces memory consumption by 37.50% with a 7.28% accuracy degradation compared to 8-bit quantization. By applying DoubleQExt, accuracy is degraded by only 1.19% compared to the 8-bit version while achieving a memory reduction of 23.05%. This result is also 1% more accurate than the state-of-the-art work (SegLog). The proposed DoubleQExt also allows flexible configuration to trade off memory consumption against accuracy, which is not found in other state-of-the-art works. With the proposed two-level weight quantization, one can achieve a more efficient hardware architecture for CNN with minimal impact on accuracy, which is crucial for IoT applications. © 2013 IEEE.
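As a rough illustration of the two-level scheme described in the abstract, the sketch below quantizes a weight tensor to 8-bit, then snaps each non-zero value to the nearest power of two and keeps only the sign and exponent, so that multiplication by an integer activation reduces to a left shift. This is a minimal NumPy sketch under assumed details (symmetric per-tensor scaling, round-to-nearest exponent); the helper names quantize_8bit, quantize_pot, and pot_multiply are illustrative and not taken from the paper.

```python
import numpy as np

def quantize_8bit(w, scale):
    """Level one: symmetric per-tensor 8-bit quantization of a weight tensor."""
    return np.clip(np.round(w / scale), -127, 127).astype(np.int32)

def quantize_pot(q8):
    """Level two: map each non-zero 8-bit weight to the nearest power of two.
    Only the sign and the exponent need to be stored."""
    sign = np.sign(q8).astype(np.int32)
    mag = np.abs(q8)
    nz = mag > 0
    exp = np.zeros_like(q8)
    exp[nz] = np.round(np.log2(mag[nz])).astype(np.int32)
    return sign, exp, nz

def pot_multiply(x, sign, exp, nz):
    """Multiply an integer activation x by PoT weights using left shifts
    (x * 2**exp) instead of hardware multipliers."""
    return np.where(nz, sign * (x << exp), 0)

# Toy usage: quantize a few float weights, then multiply by an activation.
w = np.array([0.11, -0.052, 0.007, -0.09])   # example float weights
scale = np.abs(w).max() / 127                 # per-tensor scale (assumption)
sign, exp, nz = quantize_pot(quantize_8bit(w, scale))
x = 5                                         # example integer activation
print(sign, exp)                              # stored form: signs and exponents
print(pot_multiply(x, sign, exp, nz))         # shift-based products
```

In hardware terms, the stored exponent drives a barrel shifter and the sign selects add or subtract in the accumulator, which is the source of the area and memory savings the abstract describes; the accuracy loss comes from snapping 8-bit magnitudes to the nearest power of two.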
Pages: 169082 - 169091
Related Papers (50 in total)
  • [31] In-Memory Computing Architecture for Efficient Hardware Security
    Ajmi, Hala
    Zayer, Fakhreddine
    Belgacem, Hamdi
    2024 IEEE 7TH INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES, SIGNAL AND IMAGE PROCESSING, ATSIP 2024, 2024, : 71 - 76
  • [32] Efficient Hardware Implementation of Cellular Neural Networks with Incremental Quantization and Early Exit
    Xu, Xiaowei
    Lu, Qing
    Wang, Tianchen
    Hu, Yu
    Zhuo, Chen
    Liu, Jinglan
    Shi, Yiyu
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2018, 14 (04)
  • [33] Boosting Mobile CNN Inference through Semantic Memory
    Li, Yun
    Zhang, Chen
    Han, Shihao
    Zhang, Li Lyna
    Yin, Baoqun
    Liu, Yunxin
    Xu, Mengwei
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 2362 - 2371
  • [34] Hybrid Multilevel STT/DSHE Memory for Efficient CNN Training
    Nisar, Arshid
    Nehete, Hemkant
    Verma, Gaurav
    Kaushik, Brajesh Kumar
    IEEE TRANSACTIONS ON ELECTRON DEVICES, 2023, 70 (03) : 1006 - 1013
  • [35] An efficient Image Retrieval through DCT Histogram Quantization
    Mohamed, Aamer
    Khellfi, F.
    Weng, Ying
    Jiang, Jianmin
    Ipson, Stan
    2009 INTERNATIONAL CONFERENCE ON CYBERWORLDS, 2009, : 237 - 240
  • [36] Efficient Dynamic Fixed-Point Quantization of CNN Inference Accelerators for Edge Devices
    Wu, Yueh-Chi
    Huang, Chih-Tsun
    2019 INTERNATIONAL SYMPOSIUM ON VLSI DESIGN, AUTOMATION AND TEST (VLSI-DAT), 2019,
  • [37] GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training
    Yu, Mingchao
    Lin, Zhifeng
    Narra, Krishna
    Li, Songze
    Li, Youjie
    Kim, Nam Sung
    Schwing, Alexander
    Annavaram, Murali
    Avestimehr, Salman
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [38] Mini Pool : Pooling hardware architecture using minimized local memory for CNN accelerators
    Lee, Eunchong
    Lee, Sang-Seol
    Sung, Minyong
    Jang, Sung-Joon
    Choi, Byoung-Ho
    2021 36TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC), 2021,
  • [39] Runtime Support for Accelerating CNN Models on Digital DRAM Processing-in-Memory Hardware
    Shin, Yongwon
    Park, Juseong
    Hong, Jeongmin
    Sung, Hyojin
    IEEE COMPUTER ARCHITECTURE LETTERS, 2022, 21 (02) : 33 - 36
  • [40] Modelling and hardware implementation of quantization levels of digital cameras in DCT based image compression
    Dixit, Mahendra M.
    Vijaya, C.
    ENGINEERING SCIENCE AND TECHNOLOGY-AN INTERNATIONAL JOURNAL-JESTECH, 2019, 22 (03): : 840 - 853