Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting

Cited by: 36

Authors:

Thanasutives, Pongpisit [1]

Fukui, Ken-ichi [2]

Numao, Masayuki [2]

Kijsirikul, Boonserm [3]

Affiliations:

[1] Osaka Univ, Grad Sch Informat Sci & Technol, Suita, Osaka, Japan

[2] Osaka Univ, Inst Sci & Ind Res, Suita, Osaka, Japan

[3] Chulalongkorn Univ, Fac Engn, Dept Comp Engn, Bangkok, Thailand

Source:

2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) | 2021

Keywords:

DOI:

10.1109/ICPR48806.2021.9413286

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose two modified neural networks based on the dual path multi-scale fusion network (SFANet) and SegNet for accurate and efficient crowd counting. Inspired by SFANet, the first model, named M-SFANet, is equipped with atrous spatial pyramid pooling (ASPP) and a context-aware module (CAN). The encoder of M-SFANet is enhanced with ASPP, which contains parallel atrous convolutional layers with different sampling rates and can therefore extract multi-scale features of the target object and incorporate larger context. To further handle scale variation across an input image, we leverage the CAN module, which adaptively encodes the scales of the contextual information. The combination yields an effective model for counting in both dense and sparse crowd scenes. Following the SFANet decoder structure, M-SFANet's decoder has dual paths, one for density map generation and one for attention map generation. The second model, M-SegNet, is produced by replacing the bilinear upsampling in SFANet with the max unpooling used in SegNet. This change yields a faster model while retaining competitive counting performance. Designed for high-speed surveillance applications, M-SegNet has no additional multi-scale-aware module, so its complexity does not increase. Both models are encoder-decoder architectures and are end-to-end trainable. We conduct extensive experiments on five crowd counting datasets and one vehicle counting dataset to show that these modifications yield algorithms that could improve on state-of-the-art crowd counting methods. Code is available at https://github.com/Pongpisit-Thanasutives/Variations-of-SFANet-for-Crowd-Counting.
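The two operations the abstract relies on, atrous convolution (the building block of ASPP) and max unpooling (the SegNet-style upsampling used in M-SegNet), can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the 1-D and 2x2 simplifications, and the sample values are all assumptions made for illustration, and a real model would use a deep learning framework.

```python
# Illustrative sketch only; pure Python, not the authors' implementation.
# All function names and sample shapes here are hypothetical.

def effective_kernel_size(kernel_size, rate):
    """Span covered by an atrous (dilated) kernel: k + (k - 1) * (rate - 1)."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def atrous_conv1d(signal, kernel, rate):
    """Valid 1-D atrous convolution (cross-correlation form, no padding)."""
    span = effective_kernel_size(len(kernel), rate)
    return [
        sum(w * signal[start + j * rate] for j, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


def max_pool_2x2(grid):
    """2x2 max pooling (stride 2) that also records where each maximum was."""
    pooled, indices = [], []
    for r in range(0, len(grid), 2):
        pooled_row, index_row = [], []
        for c in range(0, len(grid[0]), 2):
            window = [(grid[r + dr][c + dc], (r + dr, c + dc))
                      for dr in (0, 1) for dc in (0, 1)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool_2x2(pooled, indices, height, width):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [[0] * width for _ in range(height)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


if __name__ == "__main__":
    # A 3-tap kernel at rate 2 spans 5 input samples.
    print(atrous_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 0, -1], rate=2))

    grid = [[1, 3, 2, 0],
            [4, 2, 1, 5],
            [0, 1, 7, 2],
            [2, 6, 3, 4]]
    pooled, indices = max_pool_2x2(grid)
    print(max_unpool_2x2(pooled, indices, 4, 4))
```

Running several `atrous_conv1d` calls in parallel with different `rate` values over the same input is the idea behind ASPP: each branch sees a different context size at the same parameter count. Max unpooling, unlike bilinear upsampling, performs no interpolation arithmetic and simply reuses the stored indices, which is consistent with the speed advantage the abstract attributes to M-SegNet.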

Pages: 2382-2389

Page count: 8

50 records in total

[31] Retrieval Augmented Convolutional Encoder-decoder Networks for Video Captioning
Chen, Jingwen
Pan, Yingwei
Li, Yehao
Yao, Ting
Chao, Hongyang
Mei, Tao
[J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (01)
[32] Detection of black box signal based on encoder-decoder fully convolutional networks
Ji, Huazhong
Zhou, Jie
Pan, Xiang
[J]. GLOBAL OCEANS 2020: SINGAPORE - U.S. GULF COAST, 2020
[33] Semantic Translation with Convolutional Encoder-decoder Networks for Viewpoint Estimation
Zhang, Liangjun
Gu, Changjian
Gu, Chaochen
Wu, Kaijie
Guan, Xinping
[J]. 2017 11TH ASIAN CONTROL CONFERENCE (ASCC), 2017 : 1660 - 1665
[34] Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning
Chen, Jingwen
Pan, Yingwei
Li, Yehao
Yao, Ting
Chao, Hongyang
Mei, Tao
[J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019 : 8167 - 8174
[35] Encoder-decoder with densely convolutional networks for monocular depth estimation
Chen, Songnan
Tang, Mengxia
Kan, Jiangming
[J]. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 2019, 36 (10) : 1709 - 1718
[36] Fetal electrocardiography extraction with residual convolutional encoder-decoder networks
Zhong, Wei
Liao, Lijuan
Guo, Xuemei
Wang, Guoli
[J]. AUSTRALASIAN PHYSICAL & ENGINEERING SCIENCES IN MEDICINE, 2019, 42 (04) : 1081 - 1089
[37] A Multi-scale Edge Detection Method Based on Encoder-Decoder
Tian, An-Lin
Lei, Wei-Min
Zhang, Peng
Zhang, Wei
[J]. Dongbei Daxue Xuebao/Journal of Northeastern University, 2024, 45 (07) : 936 - 943
[38] Deep encoder-decoder hierarchical convolutional neural networks for conjugate heat transfer surrogate modeling
Ebbs-Picken, Takiah
Romero, David A.
Da Silva, Carlos M.
Amon, Cristina H.
[J]. APPLIED ENERGY, 2024, 372
[39] Cloud and Snow Segmentation in Satellite Images Using an Encoder-Decoder Deep Convolutional Neural Networks
Zheng, Kai
Li, Jiansheng
Ding, Lei
Yang, Jianfeng
Zhang, Xucheng
Zhang, Xun
[J]. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2021, 10 (07)
[40] MULTI-STEP CHORD SEQUENCE PREDICTION BASED ON AGGREGATED MULTI-SCALE ENCODER-DECODER NETWORKS
Carsault, Tristan
McLeod, Andrew
Esling, Philippe
Nika, Jerome
Nakamura, Eita
Yoshii, Kazuyoshi
[J]. 2019 IEEE 29TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2019
