Efficient Crowd Counting via Dual Knowledge Distillation

Cited by: 2
Authors
Wang, Rui [1 ]
Hao, Yixue [1 ]
Hu, Long [1 ]
Li, Xianzhi [1 ]
Chen, Min [2 ,3 ]
Miao, Yiming [4 ]
Humar, Iztok [5 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan 430074, Peoples R China
[2] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510640, Peoples R China
[3] Pazhou Lab, Guangzhou 510330, Peoples R China
[4] Chinese Univ Hong Kong, Shenzhen Inst Artificial Intelligence & Robot Soc, Sch Data Sci, CUHK Shenzhen, Shenzhen 518172, Guangdong, Peoples R China
[5] Univ Ljubljana, Fac Elect Engn, Ljubljana 1000, Slovenia
Keywords
Computational modeling; Adaptation models; Feature extraction; Task analysis; Knowledge transfer; Loss measurement; Estimation; Crowd counting; knowledge transfer; self-knowledge distillation; optimal transport distance; NEURAL-NETWORK;
DOI
10.1109/TIP.2023.3343609
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most researchers focus on designing accurate crowd counting models with heavy parameters and computations but ignore the resource burden during model deployment. Real-world scenarios demand an efficient counting model with low latency and high performance. Knowledge distillation provides an elegant way to transfer knowledge from a complicated teacher model to a compact student model while maintaining accuracy. However, in some cases the teacher's own understanding is inaccurate, so its supervision gives the student model the wrong guidance. In this paper, we propose a dual-knowledge distillation (DKD) framework, which aims to reduce the side effects of the teacher model and transfer hierarchical knowledge to obtain a more efficient counting model. First, the student model is initialized with global information transferred by the teacher model via adaptive perspectives. Then, self-knowledge distillation forces the student model to learn the knowledge by itself, based on intermediate feature maps and the target map. Specifically, the optimal transport distance is used to measure the difference between the teacher's and the student's feature maps in order to align the distributions of the counting area. Extensive experiments on four challenging datasets demonstrate the superiority of DKD. With only approximately 6% of the parameters and computations of the original models, the student model counts faster and achieves accuracy comparable to, or even surpassing, that of the teacher model.
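The abstract states that an optimal transport distance measures the difference between teacher and student feature maps. The sketch below is only an illustration of that general idea, not the authors' implementation: it assumes each feature map is flattened and normalized into a probability distribution, uses a squared Euclidean pixel distance as the ground cost, and approximates the optimal transport cost with Sinkhorn iterations (an entropy-regularized stand-in chosen here for brevity). The names `normalize`, `grid_cost`, and `sinkhorn_distance`, and the values of `reg` and `n_iters`, are hypothetical.

```python
import math


def normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("map must contain positive mass")
    return [v / total for v in values]


def grid_cost(height, width):
    """Squared Euclidean distance between pixel positions, scaled to [0, 1]."""
    pts = [(r, c) for r in range(height) for c in range(width)]
    raw = [[(r1 - r2) ** 2 + (c1 - c2) ** 2 for (r2, c2) in pts]
           for (r1, c1) in pts]
    max_cost = max(max(row) for row in raw) or 1  # avoid dividing by 0 on a 1x1 map
    return [[x / max_cost for x in row] for row in raw]


def sinkhorn_distance(a, b, cost, reg=0.05, n_iters=200):
    """Approximate OT cost between distributions a and b (assumed settings)."""
    n, m = len(a), len(b)
    # Gibbs kernel derived from the ground cost.
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport cost of the resulting plan P[i][j] = u[i] * kernel[i][j] * v[j].
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))


# Toy 3x3 maps, flattened row by row (made-up values for illustration only).
cost = grid_cost(3, 3)
teacher = normalize([8, 1, 0, 1, 0, 0, 0, 0, 0])       # mass near the top-left
student_near = normalize([7, 2, 0, 1, 0, 0, 0, 0, 0])  # similar layout
student_far = normalize([0, 0, 0, 0, 0, 1, 0, 1, 8])   # mass near the bottom-right

print(sinkhorn_distance(teacher, student_near, cost))
print(sinkhorn_distance(teacher, student_far, cost))
```

In this toy setting the student whose mass sits in the opposite corner yields a much larger value than the one with a similar layout, which is the property a distribution-alignment loss relies on. Because of the entropy regularization, the value for two identical maps is small but not exactly zero.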
Pages: 569-583
Page count: 15