Long-range zero-shot generative deep network quantization

Cited by: 2
Authors
Luo, Yan [1 ]
Gao, Yangcheng [1 ]
Zhang, Zhao [1 ,3 ]
Fan, Jicong [2 ,3 ]
Zhang, Haijun [4 ]
Xu, Mingliang [5 ]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Peoples R China
[2] Chinese Univ Hong Kong, Sch Data Sci, Shenzhen, Peoples R China
[3] Shenzhen Res Inst Big Data, Shenzhen, Peoples R China
[4] Harbin Inst Technol, Sch Comp Sci, Shenzhen, Peoples R China
[5] Zhengzhou Univ, Sch Informat Engn, Zhengzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep network quantization; Long-range generator; Adversarial margin add; Synthetic data generation;
DOI
10.1016/j.neunet.2023.07.042
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Quantization approximates a floating-point deep network model with a low-bit-width model, thereby accelerating inference and reducing computation. Zero-shot quantization, which aims to quantize a model without access to the original data, can be achieved by fitting the real data distribution through data synthesis. However, zero-shot quantization has been observed to perform worse than post-training quantization with real data, for two primary reasons: 1) an ordinary generator struggles to produce diverse synthetic data because it lacks the long-range information needed to attend to global features, and 2) synthetic images are optimized to match the statistics of real data, which leads to weak intra-class heterogeneity and limited feature richness. To overcome these problems, we propose a novel deep network quantizer called long-range zero-shot generative deep network quantization (LRQ). Technically, we propose a long-range generator (LRG) that learns long-range information instead of only simple local features. To incorporate more global features into the synthetic data, we use long-range attention with large-kernel convolutions in the generator. In addition, we present an adversarial margin add (AMA) module that enlarges the intra-class angle between each feature vector and its class center. The AMA module forms an adversarial process that increases the convergence difficulty of the loss, opposing the training objective of the original loss function. Furthermore, to transfer knowledge from the full-precision network, we utilize decoupled knowledge distillation. Extensive experiments demonstrate that LRQ outperforms competing methods. © 2023 Elsevier Ltd. All rights reserved.
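To make the opening idea concrete, here is a minimal NumPy sketch of the uniform quantize/dequantize round trip that quantization methods (including the post-training setting discussed above) simulate; the function name and the asymmetric min/max scaling scheme are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def quantize_tensor(w, num_bits=8):
    """Uniformly map a float tensor onto num_bits-wide integers, then
    map back, simulating low-bit-width storage and inference."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (w.max() - w.min()) / (qmax - qmin)   # step size per integer level
    zero_point = qmin - w.min() / scale           # float offset aligning w.min() to qmin
    q = np.clip(np.round(w / scale + zero_point), qmin, qmax)
    return (q - zero_point) * scale               # dequantized approximation of w

w = np.random.randn(64, 64).astype(np.float32)
w4 = quantize_tensor(w, num_bits=4)
# rounding error is bounded by half a quantization step
max_err = np.abs(w - w4).max()
```

Lower bit widths shrink the integer grid, so `scale` grows and the reconstruction error bound (`scale / 2`) grows with it, which is the accuracy/efficiency trade-off zero-shot quantization tries to recover without real calibration data.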
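The AMA idea of enlarging the intra-class angle can be sketched as a loss term that adds a margin to the angle between a (normalized) feature and its class center before scoring similarity; `ama_loss` and its `margin` parameter are hypothetical names for illustration, since the abstract does not give the exact formulation.

```python
import numpy as np

def ama_loss(feature, class_center, margin=0.2):
    """Illustrative angular-margin term: penalize similarity computed at
    theta + margin rather than theta, so matching the class center is
    made deliberately harder, opposing the loss that pulls features in."""
    f = feature / np.linalg.norm(feature)
    c = class_center / np.linalg.norm(class_center)
    cos_theta = float(np.clip(f @ c, -1.0, 1.0))
    theta = np.arccos(cos_theta)
    # the added margin lowers the achievable similarity, creating the
    # adversarial pressure toward intra-class angular enlargement
    return 1.0 - np.cos(theta + margin)
```

With `margin = 0` the loss vanishes when a feature sits exactly on its class center; a positive margin keeps the loss strictly positive even then, which is one simple way to realize the "adversarial process that increases convergence difficulty" described above.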
Pages: 683-691 (9 pages)