Cross-modality complementary information fusion for multispectral pedestrian detection

Cited by: 12
Authors
Yan, Chaoqi [1]
Zhang, Hong [1]
Li, Xuliang [1]
Yang, Yifan [2]
Yuan, Ding [1]
Affiliations
[1] Beihang Univ, Image Proc Ctr, 37 Xueyuan Rd, Beijing 100191, Peoples R China
[2] Beihang Univ, Inst Artificial Intelligence, 37 Xueyuan Rd, Beijing 100191, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2023, Vol. 35, Issue 14
Funding
National Natural Science Foundation of China
Keywords
Multispectral pedestrian detection; Cross-modality; Information fusion; Illumination-aware; Feature alignment; Deep neural networks
DOI
10.1007/s00521-023-08239-z
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multispectral pedestrian detection has received increasing attention in recent years because color and thermal modalities provide complementary visual information, especially under insufficient illumination. However, a crucial problem remains open: how to design a cross-modality fusion mechanism that fully exploits the complementary characteristics of the different modalities. In this paper, we propose a novel cross-modality complementary information fusion network (denoted CCIFNet) that captures long-range interactions with precise positional information while preserving the inter-spatial relationship between modalities in the feature extraction stage. Further, we design an adaptive illumination-aware weight generation module that weights the final detection confidence of the color and thermal modalities according to the illumination conditions. Specifically, we compare three fusion strategies for this module to find the best way of generating the final illumination-aware fusion weights. Finally, we present a simple but effective feature alignment module to alleviate the position shift caused by weakly aligned color-thermal image pairs. Extensive experiments and ablation studies on the KAIST, CVC-14, FLIR and LLVIP multispectral object detection datasets show that the proposed CCIFNet achieves state-of-the-art performance under different illumination evaluation settings while keeping a competitive speed-accuracy trade-off for real-time applications.
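The idea of weighting per-modality detection confidence by illumination can be illustrated with a small sketch. This is not the CCIFNet module: the abstract says the weights come from a learned module and that three fusion strategies were compared, none of which is specified here. The sketch below assumes a fixed sigmoid over mean image brightness and a simple convex combination of the two scores; the function names `illumination_weight` and `fuse_confidence` and the parameters `midpoint` and `steepness` are hypothetical.

```python
import math


def illumination_weight(mean_brightness, midpoint=0.5, steepness=10.0):
    """Map mean brightness in [0, 1] to a color-stream weight in (0, 1).

    Assumption: a hand-set sigmoid stands in for a learned
    illumination-aware weight generator.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (mean_brightness - midpoint)))


def fuse_confidence(color_score, thermal_score, w_color):
    """Convex combination of the color and thermal detection confidences."""
    return w_color * color_score + (1.0 - w_color) * thermal_score


# Bright scene: the color stream dominates the fused score.
w_day = illumination_weight(0.9)
# Dark scene: the thermal stream dominates the fused score.
w_night = illumination_weight(0.1)

fused_day = fuse_confidence(0.8, 0.4, w_day)
fused_night = fuse_confidence(0.8, 0.4, w_night)
```

With these assumed settings, a bright frame keeps the fused score close to the color detector's confidence, and a dark frame pulls it toward the thermal detector's confidence.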
Pages: 10361-10386
Page count: 26
Related papers
50 in total
  • [21] Incremental Cross-Modality Deep Learning for Pedestrian Recognition
    Pop, Danut Ovidiu
    Rogozan, Alexandrina
    Nashashibi, Fawzi
    Bensrhair, Abdelaziz
    2017 28TH IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV 2017), 2017, : 523 - 528
  • [22] Cascaded information enhancement and cross-modal attention feature fusion for multispectral pedestrian detection
    Yang, Yang
    Xu, Kaixiong
    Wang, Kaizheng
    FRONTIERS IN PHYSICS, 2023, 11
  • [23] CROSS-MODALITY TRANSFER OF SPATIAL INFORMATION
    FISHBEIN, HD
    DECKER, J
    WILCOX, P
    BRITISH JOURNAL OF PSYCHOLOGY, 1977, 68 (NOV) : 503 - 508
  • [24] Cross-modality interaction for few-shot multispectral object detection with semantic knowledge
    Huang, Lian
    Peng, Zongju
    Chen, Fen
    Dai, Shaosheng
    He, Ziqiang
    Liu, Kesheng
    NEURAL NETWORKS, 2024, 173
  • [26] Self-attention Cross-modality Fusion Network for Cross-modality Person Re-identification
    Du P.
    Song Y.-H.
    Zhang X.-Y.
    Zidonghua Xuebao/Acta Automatica Sinica, 2022, 48 (06): : 1457 - 1468
  • [27] Cascaded Cross-Modality Fusion Network for 3D Object Detection
    Chen, Zhiyu
    Lin, Qiong
    Sun, Jing
    Feng, Yujian
    Liu, Shangdong
    Liu, Qiang
    Ji, Yimu
    Xu, He
    SENSORS, 2020, 20 (24) : 1 - 14
  • [28] Cross-modality fusion with EEG and text for enhanced emotion detection in English writing
    Wang, Jing
    Zhang, Ci
    FRONTIERS IN NEUROROBOTICS, 2025, 18
  • [29] GLSFF: Global-local specific feature fusion for cross-modality pedestrian re-identification
    Xue, Chen
    Deng, Zhongliang
    Wang, Shuo
    Hu, Enwen
    Zhang, Yao
    Yang, Wangwang
    Wang, Yiming
    COMPUTER COMMUNICATIONS, 2024, 215 : 157 - 168
  • [30] ADVERSARIAL LEARNING FRAMEWORK FOR CEREBROVASCULAR LANDMARK DETECTION USING CROSS-MODALITY INFORMATION
    Tan, Zimeng
    Feng, Jianjiang
    Lu, Wangsheng
    Yin, Yin
    Yang, Guangming
    Zhou, Jie
    2023 IEEE 20TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI, 2023,