Spatial exchanging fusion network for RGB-T crowd counting

Cited by: 0
Authors
Rao, Chaoqun [1]
Wan, Lin [1, 2]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430078, Peoples R China
[2] China Univ Geosci, Hubei Key Lab Intelligent Geoinformat Proc, Wuhan 430078, Peoples R China
Keywords
Crowd counting; Spatial exchanging fusion; Dense prediction; Cross-modal learning; Multi-task learning; Benchmark
DOI
10.1016/j.neucom.2024.128433
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
RGB-T crowd counting (RGB-T CC) aims to estimate the crowd population size utilizing the complementary information from visible and thermal images. Current deep models for RGB-T CC typically adopt a three-tier architecture, featuring a middle fusion layer that aggregates both RGB and thermal streams. However, we find that this dedicated fusion layer dominates the training process, causing under-optimization of both modal branches, which becomes the performance bottleneck in mainstream multi-modal counting models. To address this challenge, we propose a simple-yet-effective counting architecture, the Spatial Exchanging Fusion Network (SEFNet). It is built on a Dual Attention Guided Spatial Exchanging (DASE) mechanism, enabling direct extraction and exchange of modality-complementary features between modalities without the extra fusion branch employed in most existing works. This design ensures a more balanced gradient back-propagation over networks, attaining optimized representations in multi-modality fusion over prior models. Besides, the Modality Gradient Enhancement Module (MGEM) in SEFNet can effectively learn modality-specific crowd representations with two counting sub-tasks, dynamically achieving better gradient distribution and further enhancing optimization in both modalities. Extensive experiments demonstrate that SEFNet significantly outperforms state-of-the-art methods on mainstream benchmark datasets, and also exhibits promising generalization ability across various counting backbones and losses.
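The abstract does not spell out how DASE chooses which features to exchange, so the following is only a minimal sketch of the general idea of spatial exchanging between two modality streams, not the authors' method. It assumes single-channel feature maps stored as nested lists, one attention map per modality with values in [0, 1], and a hypothetical fixed `threshold`: wherever a stream's own attention is low, its feature is replaced by the other stream's feature at the same position. The function name `spatial_exchange` and all of its parameters are illustrative assumptions.

```python
def spatial_exchange(feat_a, feat_b, attn_a, attn_b, threshold=0.5):
    """Exchange features between two streams at low-attention positions.

    Assumption (not from the paper): a position is exchanged when the
    stream's own attention value falls below a fixed threshold.
    All four inputs are 2D lists of the same shape.
    """
    # Copy the inputs so the originals are left unmodified.
    out_a = [row[:] for row in feat_a]
    out_b = [row[:] for row in feat_b]
    for i, (row_a, row_b) in enumerate(zip(attn_a, attn_b)):
        for j, (a, b) in enumerate(zip(row_a, row_b)):
            if a < threshold:
                # Stream A is unreliable here: borrow from stream B.
                out_a[i][j] = feat_b[i][j]
            if b < threshold:
                # Stream B is unreliable here: borrow from stream A.
                out_b[i][j] = feat_a[i][j]
    return out_a, out_b


# Toy 2x2 example with made-up values.
rgb = [[1.0, 2.0], [3.0, 4.0]]
thermal = [[10.0, 20.0], [30.0, 40.0]]
attn_rgb = [[0.9, 0.1], [0.8, 0.2]]
attn_thermal = [[0.3, 0.7], [0.6, 0.9]]

new_rgb, new_thermal = spatial_exchange(rgb, thermal, attn_rgb, attn_thermal)
print(new_rgb)      # [[1.0, 20.0], [3.0, 40.0]]
print(new_thermal)  # [[1.0, 20.0], [30.0, 40.0]]
```

Because the exchange writes directly into each stream, no third fusion branch is involved in this sketch, which mirrors the motivation stated in the abstract. A learned, differentiable selection rule would be needed in a real network.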
Pages: 12
Related papers
50 in total
  • [1] CONDITIONAL RGB-T FUSION FOR EFFECTIVE CROWD COUNTING
    Pahwa, Esha
    Kapadia, Sanjeet
    Luthra, Achleshwar
    Sheeranali, Shreyas
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022: 376 - 380
  • [2] Light-sensitive and adaptive fusion network for RGB-T crowd counting
    Huang, Liangjun
    Kang, Wencan
    Chen, Guangkai
    Zhang, Qing
    Zhang, Jianwei
    [J]. VISUAL COMPUTER, 2024, 40 (10): 7279 - 7292
  • [3] TAFNet: A Three-Stream Adaptive Fusion Network for RGB-T Crowd Counting
    Tang, Haihan
    Wang, Yi
    Chau, Lap-Pui
    [J]. 2022 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS 22), 2022: 3299 - 3303
  • [4] CrowdFusion: Refined Cross-Modal Fusion Network for RGB-T Crowd Counting
    Cai, Jialu
    Wang, Qing
    Jiang, Shengqin
    [J]. BIOMETRIC RECOGNITION, CCBR 2023, 2023, 14463 : 427 - 436
  • [5] DAACFNet: Discriminative Activation and Adjacent Context Fusion Network for RGB-T Crowd Counting
    Xie, Zhengxuan
    Shao, Feng
    Mu, Baoyang
    Chen, Hangwei
    [J]. SSRN, 2024
  • [6] DEFNet: Dual-Branch Enhanced Feature Fusion Network for RGB-T Crowd Counting
    Zhou, Wujie
    Pan, Yi
    Lei, Jingsheng
    Ye, Lv
    Yu, Lu
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (12) : 24540 - 24549
  • [7] CAGNet: Coordinated attention guidance network for RGB-T crowd counting
    Yang, Xun
    Zhou, Wujie
    Yan, Weiqing
    Qian, Xiaohong
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 243
  • [8] BGDFNet: Bidirectional Gated and Dynamic Fusion Network for RGB-T Crowd Counting in Smart City System
    Xie, Zhengxuan
    Shao, Feng
    Mu, Baoyang
    Chen, Hangwei
    Jiang, Qiuping
    Lu, Chenyang
    Ho, Yo-Sung
    [J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73
  • [9] A unified RGB-T crowd counting learning framework
    Gu, Siqi
    Lian, Zhichao
    [J]. IMAGE AND VISION COMPUTING, 2023, 131
  • [10] CGINet: Cross-modality grade interaction network for RGB-T crowd counting
    Pan, Yi
    Zhou, Wujie
    Qian, Xiaohong
    Mao, Shanshan
    Yang, Rongwang
    Yu, Lu
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126