Efficient Crowd Counting via Dual Knowledge Distillation

Cited by: 2
Authors
Wang, Rui [1 ]
Hao, Yixue [1 ]
Hu, Long [1 ]
Li, Xianzhi [1 ]
Chen, Min [2 ,3 ]
Miao, Yiming [4 ]
Humar, Iztok [5 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan 430074, Peoples R China
[2] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510640, Peoples R China
[3] Pazhou Lab, Guangzhou 510330, Peoples R China
[4] Chinese Univ Hong Kong, Shenzhen Inst Artificial Intelligence & Robot Soc, Sch Data Sci, CUHK Shenzhen, Shenzhen 518172, Guangdong, Peoples R China
[5] Univ Ljubljana, Fac Elect Engn, Ljubljana 1000, Slovenia
Keywords
Computational modeling; Adaptation models; Feature extraction; Task analysis; Knowledge transfer; Loss measurement; Estimation; Crowd counting; knowledge transfer; self-knowledge distillation; optimal transport distance; NEURAL-NETWORK;
DOI
10.1109/TIP.2023.3343609
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Most researchers focus on designing accurate crowd counting models with heavy parameters and computations but ignore the resource burden during model deployment. Real-world scenarios demand an efficient counting model with low latency and high performance. Knowledge distillation provides an elegant way to transfer knowledge from a complicated teacher model to a compact student model while maintaining accuracy. However, when the teacher misinterprets the scene, its supervision gives the student model incorrect guidance. In this paper, we propose a dual knowledge distillation (DKD) framework that reduces the side effects of the teacher model and transfers hierarchical knowledge to obtain a more efficient counting model. First, the student model is initialized with global information transferred by the teacher model via adaptive perspectives. Then, self-knowledge distillation forces the student model to learn the knowledge by itself, based on intermediate feature maps and the target map. Specifically, the optimal transport distance is used to measure the difference between the teacher's and student's feature maps and to align the distributions of the counting area. Extensive experiments on four challenging datasets demonstrate the superiority of DKD. With only approximately 6% of the original model's parameters and computations, the student model counts faster than the teacher model and matches or even surpasses its accuracy.
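The record does not include the authors' implementation; the following is a minimal, self-contained sketch of how the optimal-transport alignment between teacher and student feature maps described above could be computed with an entropy-regularized (Sinkhorn) distance in PyTorch. The function name sinkhorn_ot_distance, the magnitude-based location weights, the cosine cost matrix, and all hyperparameter values are illustrative assumptions, not the paper's released code.

```python
# Illustrative sketch: entropy-regularized optimal transport (Sinkhorn) distance
# between a student and a teacher feature map, usable as a distillation loss term.
import torch
import torch.nn.functional as F


def sinkhorn_ot_distance(student_feat, teacher_feat, eps=0.05, iters=50):
    """OT distance between two feature maps of shape (C, H, W).

    Spatial locations are treated as points of two discrete distributions whose
    weights come from the per-location feature magnitude (a stand-in for the
    counting-area mass).
    """
    def flatten(feat):
        c, h, w = feat.shape
        x = feat.reshape(c, h * w).t()          # (N, C) points, one per location
        mass = x.norm(dim=1)                    # per-location magnitude
        mass = mass / (mass.sum() + 1e-8)       # normalize to a distribution
        return F.normalize(x, dim=1), mass

    xs, a = flatten(student_feat)
    xt, b = flatten(teacher_feat)

    # Cost matrix: cosine distance between student and teacher locations.
    cost = 1.0 - xs @ xt.t()                    # (Ns, Nt)

    # Sinkhorn iterations in log space for numerical stability.
    log_a, log_b = a.clamp_min(1e-8).log(), b.clamp_min(1e-8).log()
    f = torch.zeros_like(log_a)
    g = torch.zeros_like(log_b)
    for _ in range(iters):
        f = eps * (log_a - torch.logsumexp((g.unsqueeze(0) - cost) / eps, dim=1))
        g = eps * (log_b - torch.logsumexp((f.unsqueeze(1) - cost) / eps, dim=0))
    plan = torch.exp((f.unsqueeze(1) + g.unsqueeze(0) - cost) / eps)
    return (plan * cost).sum()                  # transport cost = OT distance


if __name__ == "__main__":
    torch.manual_seed(0)
    s = torch.rand(64, 16, 16)   # hypothetical student feature map
    t = torch.rand(64, 16, 16)   # hypothetical teacher feature map
    print(float(sinkhorn_ot_distance(s, t)))
```

In a training loop, such a term would typically be added to the counting (density-map) loss with a weighting coefficient, so that the student's intermediate features are pulled toward the teacher's distribution over the counting area.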
Pages: 569 - 583
Number of pages: 15
Related Papers
50 records
  • [1] Efficient Crowd Counting via Structured Knowledge Transfer
    Liu, Lingbo
    Chen, Jiaqi
    Wu, Hefeng
    Chen, Tianshui
    Li, Guanbin
    Lin, Liang
    [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 2645 - 2654
  • [2] Improved Knowledge Distillation for Crowd Counting on IoT Devices
    Huang, Zuo
    Sinnott, Richard O.
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON EDGE COMPUTING AND COMMUNICATIONS, EDGE, 2023, : 207 - 214
  • [3] SHUFFLECOUNT: TASK-SPECIFIC KNOWLEDGE DISTILLATION FOR CROWD COUNTING
    Jiang, Minyang
    Lin, Jianzhe
    Wang, Z. Jane
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 999 - 1003
  • [4] Repdistiller: Knowledge Distillation Scaled by Re-parameterization for Crowd Counting
    Ni, Tian
    Cao, Yuchen
    Liang, Xiaoyu
    Hu, Haoji
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT X, 2024, 14434 : 383 - 394
  • [5] FUSIONCOUNT: EFFICIENT CROWD COUNTING VIA MULTISCALE FEATURE FUSION
    Ma, Yiming
    Sanchez, Victor
    Guha, Tanaya
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 3256 - 3260
  • [6] Crowd Counting Network with Self-attention Distillation
    Wang, Li
    Zhao, Huailin
    Nie, Zhen
    Li, Yaoyao
    [J]. PROCEEDINGS OF THE 2020 INTERNATIONAL CONFERENCE ON ARTIFICIAL LIFE AND ROBOTICS (ICAROB2020), 2020, : 587 - 591
  • [7] Crowd Counting Network with Self-attention Distillation
    Li, Yaoyao
    Wang, Li
    Zhao, Huailin
    Nie, Zhen
    [J]. JOURNAL OF ROBOTICS NETWORKING AND ARTIFICIAL LIFE, 2020, 7 (02): 116 - 120
  • [8] Dual-Level Knowledge Distillation via Knowledge Alignment and Correlation
    Ding, Fei
    Yang, Yin
    Hu, Hongxin
    Krovi, Venkat
    Luo, Feng
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (02) : 2425 - 2435
  • [9] Efficient Biomedical Instance Segmentation via Knowledge Distillation
    Liu, Xiaoyu
    Hu, Bo
    Huang, Wei
    Zhang, Yueyi
    Xiong, Zhiwei
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT IV, 2022, 13434 : 14 - 24
  • [10] Dual convolutional neural network for crowd counting
    Guo, Huaping
    Wang, Rui
    Zhang, Li
    Sun, Yange
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (09) : 26687 - 26709