A Global-Local Dual Branch Network for Congested Crowd Counting

Authors
Di H. [1]
Song L. [1]
Yu X. [1]
Wang W. [1]
Affiliation
[1] School of Computer Science and Technology, Beijing Institute of Technology, Beijing
Keywords
crowd density estimation; deep learning; multi-scale learning; visual attention
DOI
10.15918/j.tbit1001-0645.2021.311
Abstract
Crowd counting methods based on convolutional neural networks have significantly improved counting accuracy. However, in congested crowds, large variations in head scale and complex scenes still limit accuracy. To address this problem, a global-local dual-branch network is proposed. The local branch uses the proposed scale-aware feature extraction modules to model the scale changes of heads in congested crowds. The global branch uses a localization-aware attention module to strengthen the network's ability to discriminate between the crowd and background objects. The extracted local and global features are then sent to a feature fusion branch to produce a crowd density map. The proposed method was evaluated on three commonly used crowd counting datasets and one remote sensing object counting dataset. The quantitative and qualitative results demonstrate the effectiveness of the proposed method. © 2022 Beijing Institute of Technology. All rights reserved.
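The abstract's end product is a crowd density map. As a minimal sketch of what such a map encodes (not the paper's network or its training procedure), the pure-Python example below uses the common convention of placing one unit-mass Gaussian per annotated head, so that the map sums to the head count. The function names, kernel size, and sigma are hypothetical choices made for illustration only.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    kernel = [
        [
            math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def density_map(height, width, heads, size=5, sigma=1.0):
    """Build a density map with one unit-mass Gaussian per head (row, col).

    Assumption: a fixed kernel is used for every head. Mass that falls
    outside the image is dropped, so heads near the border contribute
    slightly less than 1.
    """
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in heads:
        for dy in range(size):
            for dx in range(size):
                y, x = r + dy - half, c + dx - half
                if 0 <= y < height and 0 <= x < width:
                    dmap[y][x] += kernel[dy][dx]
    return dmap


def crowd_count(dmap):
    """Estimate the crowd count as the sum over the density map."""
    return sum(sum(row) for row in dmap)


if __name__ == "__main__":
    # Three hypothetical head annotations, all away from the image border.
    heads = [(5, 5), (10, 12), (10, 14)]
    dmap = density_map(20, 20, heads)
    print(crowd_count(dmap))
```

A fixed kernel ignores the head-scale variation that the abstract identifies as the core difficulty in congested scenes; the sketch only illustrates the relation between a density map and a count.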
Pages: 1175-1183
Page count: 8