Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting

Cited by: 36
Authors
Thanasutives, Pongpisit [1]
Fukui, Ken-ichi [2]
Numao, Masayuki [2]
Kijsirikul, Boonserm [3]
Affiliations
[1] Osaka Univ, Grad Sch Informat Sci & Technol, Suita, Osaka, Japan
[2] Osaka Univ, Inst Sci & Ind Res, Suita, Osaka, Japan
[3] Chulalongkorn Univ, Fac Engn, Dept Comp Engn, Bangkok, Thailand
DOI
10.1109/ICPR48806.2021.9413286
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose two modified neural networks based on dual path multi-scale fusion networks (SFANet) and SegNet for accurate and efficient crowd counting. Inspired by SFANet, the first model, named M-SFANet, is augmented with atrous spatial pyramid pooling (ASPP) and a context-aware module (CAN). The encoder of M-SFANet is enhanced with ASPP, which contains parallel atrous convolutional layers with different sampling rates and is hence able to extract multi-scale features of the target object and incorporate larger context. To further deal with scale variation throughout an input image, we leverage the CAN module, which adaptively encodes the scales of the contextual information. The combination yields an effective model for counting in both dense and sparse crowd scenes. Following the SFANet decoder structure, M-SFANet's decoder has dual paths, one for density map generation and one for attention map generation. The second model, called M-SegNet, is produced by replacing the bilinear upsampling in SFANet with the max unpooling used in SegNet. This change yields a faster model while maintaining competitive counting performance. Designed for high-speed surveillance applications, M-SegNet has no additional multi-scale-aware module, so as not to increase its complexity. Both models are encoder-decoder based architectures and are end-to-end trainable. We conduct extensive experiments on five crowd counting datasets and one vehicle counting dataset to show that these modifications yield algorithms that could improve on state-of-the-art crowd counting methods. Code is available at https://github.com/Pongpisit-Thanasutives/Variations-of-SFANet-for-Crowd-Counting.
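The max unpooling that the abstract attributes to SegNet can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `max_pool_2x2` and `max_unpool_2x2` and the toy 4x4 feature map are hypothetical, and the sketch assumes even input dimensions and a single channel. The idea shown is that the encoder records where each maximum came from, and the decoder writes each pooled value back to that position instead of interpolating.

```python
from typing import List, Tuple

Grid = List[List[float]]
Index = Tuple[int, int]


def max_pool_2x2(x: Grid) -> Tuple[Grid, List[List[Index]]]:
    """2x2 max pooling with stride 2 (assumes even height and width).

    Returns the pooled grid and, for each pooled cell, the (row, col)
    position of the maximum in the input.
    """
    pooled: Grid = []
    indices: List[List[Index]] = []
    for r in range(0, len(x), 2):
        pooled_row, index_row = [], []
        for c in range(0, len(x[0]), 2):
            window = [(x[r + dr][c + dc], (r + dr, c + dc))
                      for dr in (0, 1) for dc in (0, 1)]
            value, pos = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(pos)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool_2x2(pooled: Grid, indices: List[List[Index]],
                   height: int, width: int) -> Grid:
    """Place each pooled value at its recorded position; all else is zero."""
    out = [[0.0] * width for _ in range(height)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


# Hypothetical toy feature map, for illustration only.
feature = [
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 2.0, 1.0, 5.0],
    [0.0, 1.0, 7.0, 2.0],
    [6.0, 2.0, 3.0, 1.0],
]
pooled, indices = max_pool_2x2(feature)
restored = max_unpool_2x2(pooled, indices, 4, 4)
```

Unpooling involves no interpolation arithmetic, only indexed writes, and the output is sparse. In SegNet-style decoders the unpooled map is then passed through trainable convolutions to densify it; that step is omitted here.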
Pages: 2382-2389
Page count: 8