An interactive network based on transformer for multimodal crowd counting

Cited by: 0
Authors
Ying Yu
Zhen Cai
Duoqian Miao
Jin Qian
Hong Tang
Affiliations
[1] College of Software Engineering,
[2] Department of Computer Science and Technology,
Source
Applied Intelligence | 2023 / Vol. 53
Keywords
Crowd counting; Transformer; Multimodal data; Feature fusion
DOI
Not available
Abstract
Crowd counting is the task of estimating the total number of pedestrians in an image. Most existing research addresses scenes with good visibility, such as parks, squares, and bright shopping malls during the day. However, there is little research on complex scenes in darkness. To study this problem, we propose an interactive network based on Transformer for multimodal crowd counting. First, sliding convolutional encoding is applied to the image to obtain better encoding features. The features are extracted through the designed primary interaction network, and then channel token attention is used to modulate the features. Then, the FGAF-MLP is used for high- and low-level semantic fusion to enhance the feature expression and fully fuse the data in different modalities, improving the accuracy of the method. To verify the effectiveness of our method, we conducted extensive ablation experiments on the latest multimodal benchmark RGBT-CC, verifying the complementarity between multiple modal data and the effectiveness of the model components. We also verified the effectiveness of our method on the ShanghaiTechRGBD benchmark. The experimental results show that our proposed method achieves an improvement of more than 10% in terms of the mean average error and mean squared error on the RGBT-CC benchmark.
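The error metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes the errors are computed over per-image head counts, and it reads "mean average error" as the mean absolute error, which is an assumption. The function names `count_errors` and `relative_improvement` are hypothetical.

```python
import math


def count_errors(predicted, actual):
    """Return (MAE, MSE, RMSE) between predicted and true per-image counts.

    Assumption: "mean average error" is read as the mean absolute error.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - a for p, a in zip(predicted, actual)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    return mae, mse, math.sqrt(mse)


def relative_improvement(baseline_error, new_error):
    """Fractional reduction of an error metric relative to a baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    # Made-up counts for illustration only, not data from the paper.
    mae, mse, rmse = count_errors([10, 20, 30], [12, 18, 33])
    print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f}")
    print(f"improvement={relative_improvement(20.0, 17.0):.0%}")
```

Under this reading, "an improvement of more than 10%" would mean `relative_improvement` exceeds 0.10 for both metrics against the compared baseline. Note that crowd counting papers often report the root of the mean squared error under the name MSE, so the sketch returns both.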
Pages: 22602 - 22614
Number of pages: 12