Multi-scale Supervised Attentive Encoder-Decoder Network for Crowd Counting

Cited by: 8
Authors
Zhang, Anran [1]
Jiang, Xiaolong [1]
Zhang, Baochang [2]
Cao, Xianbin [1, 3, 4]
Affiliations
[1] Beihang Univ, Sch Elect & Informat Engn, XueYuan Rd 37, Beijing, Peoples R China
[2] Beihang Univ, Sch Automat Sci & Elect Engn, XueYuan Rd 37, Beijing, Peoples R China
[3] Beihang Univ, Key Lab Adv Technol Near Space Informat Syst, Minist Ind & Informat Technol China, XueYuan Rd 37, Beijing, Peoples R China
[4] Beijing Adv Innovat Ctr Big Data Based Precis Med, XueYuan Rd 37, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Self-attention; representation fusion; supervised method; HUMANS; SEGMENTATION; TRACKING; MULTIPLE; IMAGE;
DOI
10.1145/3356019
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Crowd counting is a popular topic with widespread applications. Currently, the biggest challenge in crowd counting is the large variation in object scale. In this article, we address this challenge by proposing a novel Attentive Encoder-Decoder Network (AEDN), which is supervised at multiple feature scales and performs crowd counting via density estimation. This work makes three main contributions. First, we augment the traditional encoder-decoder architecture with our proposed residual attention blocks, which, beyond skip-connected encoded features, further extend the decoded features with attentive features. As a result, AEDN is better at establishing long-range dependencies between the encoder and decoder, thereby promoting more effective fusion of multi-scale features for handling scale variations. Second, we design a new KL-divergence-based distribution loss to supervise the scale-aware structural differences between two density maps, which complements the pixel-isolated MSE loss and better optimizes AEDN to generate high-quality density maps. Third, we adopt a multi-scale supervision scheme in which KL-divergence and MSE losses are deployed at all decoding stages, providing more thorough supervision for different feature scales. Extensive experiments on four public datasets (ShanghaiTech Part A, ShanghaiTech Part B, UCF-CC-50, and UCF-QNRF) demonstrate the efficacy of the proposed method, which outperforms most state-of-the-art competitors.
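The abstract does not give the loss formulas, so the following is only a rough sketch of how a KL-divergence term could complement a pixel-wise MSE term across several decoding scales. It assumes non-negative density maps stored as lists of rows. The normalization of each map to a probability distribution, the smoothing constant `EPS`, the weight `kl_weight`, and all function names are illustrative assumptions, not the authors' formulation.

```python
import math

EPS = 1e-8  # hypothetical smoothing constant, not taken from the paper


def flatten(density_map):
    """Flatten a 2-D density map (list of rows) into a single list."""
    return [v for row in density_map for v in row]


def mse_loss(pred, target):
    """Pixel-wise mean squared error between two density maps."""
    p, t = flatten(pred), flatten(target)
    return sum((a - b) ** 2 for a, b in zip(p, t)) / len(p)


def kl_distribution_loss(pred, target):
    """KL(target || pred) after normalizing each map to sum to 1.

    Assumed formulation: the maps are treated as spatial distributions,
    so the term compares structure rather than absolute counts.
    """
    p, t = flatten(pred), flatten(target)
    p_sum = sum(p) + EPS * len(p)
    t_sum = sum(t) + EPS * len(t)
    kl = 0.0
    for a, b in zip(p, t):
        q = (a + EPS) / p_sum  # predicted distribution
        r = (b + EPS) / t_sum  # ground-truth distribution
        kl += r * math.log(r / q)
    return kl


def multi_scale_loss(preds, targets, kl_weight=1.0):
    """Sum MSE + kl_weight * KL over paired (prediction, target) scales."""
    return sum(
        mse_loss(p, t) + kl_weight * kl_distribution_loss(p, t)
        for p, t in zip(preds, targets)
    )
```

In this sketch, a prediction that is a uniformly scaled copy of the ground truth has a near-zero KL term but a non-zero MSE term, which gives one intuition for why the two losses might be combined: one is sensitive to spatial structure, the other to per-pixel magnitude.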
Pages: 20
Related papers
50 in total
  • [21] An Effective Lightweight Crowd Counting Method Based on an Encoder-Decoder Network for Internet of Video Things
    Yi, Jun
    Chen, Fan
    Shen, Zhilong
    Xiang, Yi
    Xiao, Shan
    Zhou, Wei
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (02) : 3082 - 3094
  • [22] Multi-scale features fused network with multi-level supervised path for crowd counting
    Wang, Yongjie
    Zhang, Wei
    Huang, Dongxiao
    Liu, Yanyan
    Zhu, Jianghua
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 200
  • [23] Attentive U-recurrent encoder-decoder network for image dehazing
    Yin, Shibai
    Wang, Yibin
    Yang, Yee-Hong
    [J]. NEUROCOMPUTING, 2021, 437 : 143 - 156
  • [24] DHA-Net: An encoder-decoder network fusing multi-scale features for optic disc segmentation
    Zheng, Xuan
    He, Yi
    Yuan, Huaqing
    Jiang, Yanglin
    Xu, Yanbin
    [J]. 2023 IEEE INTERNATIONAL INSTRUMENTATION AND MEASUREMENT TECHNOLOGY CONFERENCE, I2MTC, 2023,
  • [25] Attentive Generative Adversarial Network with Dual Encoder-Decoder for Shadow Removal
    Wang, He
    Zou, Hua
    Zhang, Dengyi
    [J]. INFORMATION, 2022, 13 (08)
  • [26] PMED-Net: Pyramid Based Multi-Scale Encoder-Decoder Network for Medical Image Segmentation
    Khan, Abbas
    Kim, Hyongsuk
    Chua, Leon
    [J]. IEEE ACCESS, 2021, 9 : 55988 - 55998
  • [27] SENetCount: An Optimized Encoder-Decoder Architecture with Squeeze-and-Excitation for Crowd Counting
    Meng, Xiaolong
    [J]. WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [28] Redesigning Multi-Scale Neural Network for Crowd Counting
    Du, Zhipeng
    Shi, Miaojing
    Deng, Jiankang
    Zafeiriou, Stefanos
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 3664 - 3678
  • [29] MobileCount: An efficient encoder-decoder framework for real-time crowd counting
    Wang, Peng
    Gao, Chenyu
    Wang, Yang
    Li, Hui
    Gao, Ye
    [J]. NEUROCOMPUTING, 2020, 407 : 292 - 299
  • [30] Multi-Scale Guided Attention Network for Crowd Counting
    Li, Pengfei
    Zhang, Min
    Wan, Jian
    Jiang, Ming
    [J]. SCIENTIFIC PROGRAMMING, 2021, 2021