Multi-scale Supervised Attentive Encoder-Decoder Network for Crowd Counting

Cited by: 8
Authors
Zhang, Anran [1]
Jiang, Xiaolong [1]
Zhang, Baochang [2]
Cao, Xianbin [1, 3, 4]
Affiliations
[1] Beihang Univ, Sch Elect & Informat Engn, XueYuan Rd 37, Beijing, Peoples R China
[2] Beihang Univ, Sch Automat Sci & Elect Engn, XueYuan Rd 37, Beijing, Peoples R China
[3] Beihang Univ, Key Lab Adv Technol Near Space Informat Syst, Minist Ind & Informat Technol China, XueYuan Rd 37, Beijing, Peoples R China
[4] Beijing Adv Innovat Ctr Big Data Based Precis Med, XueYuan Rd 37, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Self-attention; representation fusion; supervised method; HUMANS; SEGMENTATION; TRACKING; MULTIPLE; IMAGE;
DOI
10.1145/3356019
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Crowd counting is a popular topic with widespread applications. Currently, the biggest challenge in crowd counting is the large variation in object scale. In this article, we focus on overcoming this challenge by proposing a novel Attentive Encoder-Decoder Network (AEDN), which is supervised on multiple feature scales to conduct crowd counting via density estimation. This work has three main contributions. First, we augment the traditional encoder-decoder architecture with our proposed residual attention blocks, which, beyond skip-connected encoded features, further extend the decoded features with attentive features. AEDN is better at establishing long-range dependencies between the encoder and decoder, thereby promoting more effective fusion of multi-scale features for handling scale variations. Second, we design a new KL-divergence-based distribution loss to supervise the scale-aware structural differences between two density maps, which complements the pixel-isolated MSE loss and better optimizes AEDN to generate high-quality density maps. Third, we adopt a multi-scale supervision scheme, such that multiple KL-divergence and MSE losses are deployed at all decoding stages, providing more thorough supervision for different feature scales. Extensive experimental results on four public datasets, including ShanghaiTech Part A, ShanghaiTech Part B, UCF-CC-50, and UCF-QNRF, demonstrate the efficacy of the proposed method, which outperforms most state-of-the-art competitors.
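The combination of a pixel-wise MSE loss with a KL-divergence-based distribution loss, applied at several scales, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the record above does not give the exact loss formulation, so the normalization of each density map to a probability distribution, the KL direction, the `kl_weight` parameter, and the helper names (`mse_loss`, `kl_distribution_loss`, `downsample_sum`, `multi_scale_loss`) are all assumptions made for illustration. Density maps are assumed to be non-negative.

```python
import numpy as np


def mse_loss(pred, target):
    """Pixel-wise mean squared error between two density maps."""
    return float(np.mean((pred - target) ** 2))


def kl_distribution_loss(pred, target, eps=1e-8):
    """KL(target || pred) after normalizing each map to sum to 1.

    Assumption: treating the maps as spatial probability distributions
    is an illustrative choice, not the formulation from the paper.
    """
    p = target.ravel() + eps
    q = pred.ravel() + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))


def downsample_sum(density, factor):
    """Sum-pool a density map so that the total count is preserved."""
    h, w = density.shape
    blocks = density.reshape(h // factor, factor, w // factor, factor)
    return blocks.sum(axis=(1, 3))


def multi_scale_loss(preds, targets, kl_weight=1.0):
    """Sum MSE + weighted KL over one (prediction, target) pair per scale."""
    return sum(
        mse_loss(p, t) + kl_weight * kl_distribution_loss(p, t)
        for p, t in zip(preds, targets)
    )


# Toy ground truth: two people marked on an 8x8 grid.
gt = np.zeros((8, 8))
gt[2, 3] = 1.0
gt[5, 5] = 1.0

# A prediction with the right count but shifted by one pixel.
pred = np.roll(gt, 1, axis=1)

# Supervise at full resolution and at half resolution.
targets = [gt, downsample_sum(gt, 2)]
preds = [pred, downsample_sum(pred, 2)]
loss = multi_scale_loss(preds, targets)
```

In this toy case the shifted prediction has the correct total count, yet both the MSE and KL terms are positive at full resolution, which is the kind of spatial mismatch a density-map loss is meant to penalize.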
Pages: 20
Related papers
50 in total
  • [31] Counting in congested crowd scenes with hierarchical scale-aware encoder–decoder network
    Han, Run
    Qi, Ran
    Lu, Xuequan
    Huang, Lei
    Lyu, Lei
    [J]. Expert Systems with Applications, 2024, 238
  • [32] Multi-Supervised Encoder-Decoder for Image Forgery Localization
    Yu, Chunfang
    Zhou, Jizhe
    Li, Qin
    [J]. ELECTRONICS, 2021, 10 (18)
  • [33] Multi-scale fusion residual encoder-decoder approach for low illumination image enhancement
    Pan Xiaoying
    Wei Miao
    Wang Hao
    Jia Fengzhu
    [J]. The Journal of China Universities of Posts and Telecommunications, 2022, (02) : 63 - 72
  • [34] Encoder-Decoder Networks for Retinal Vessel Segmentation Using Large Multi-scale Patches
    Browatzki, Bjoern
    Lies, Joern-Philipp
    Wallraven, Christian
    [J]. OPHTHALMIC MEDICAL IMAGE ANALYSIS, OMIA 2020, 2020, 12069 : 42 - 52
  • [35] A Traffic Surveillance Multi-Scale Vehicle Detection Object Method Base on Encoder-Decoder
    Hong, Feng
    Lu, Chang-Hua
    Liu, Chun
    Liu, Ru-Ru
    Wei, Ju
    [J]. IEEE ACCESS, 2020, 8 : 47664 - 47674
  • [36] A Multi-Scale Fusion Residual Encoder-Decoder Approach for Low Illumination Image Enhancement
    Pan, Xiaoying
    Wei, Miao
    Wang, Hao
    Jia, Fengzhu
    [J]. Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2022, 34 (01) : 104 - 112
  • [37] Multi-scale fusion residual encoder-decoder approach for low illumination image enhancement
    Xiaoying, Pan
    Miao, Wei
    Hao, Wang
    Fengzhü, Jia
    [J]. Journal of China Universities of Posts and Telecommunications, 2022, 29 (02) : 63 - 72
  • [38] MULTI-STEP QUANTIZATION OF A MULTI-SCALE NETWORK FOR CROWD COUNTING
    Shim, Kyujin
    Byun, Junyoung
    Kim, Changick
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020 : 683 - 687
  • [39] Crowd Counting Method Based on Multi-Scale Enhanced Network
    Xu Tao
    Duan Yinong
    Du Jiahao
    Liu Caihua
    [J]. JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (06) : 1764 - 1771
  • [40] MSIANet: Multi-scale Interactive Attention Crowd Counting Network
    Zhang, Shihui
    Zhao, Weibo
    Wang, Lei
    Wang, Wei
    Li, Qunpeng
    [J]. JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2023, 45 (06) : 2236 - 2245