Cross-modality interaction for few-shot multispectral object detection with semantic knowledge

Cited by: 1
Authors
Huang, Lian [1 ]
Peng, Zongju [1 ]
Chen, Fen [1 ]
Dai, Shaosheng [2 ]
He, Ziqiang [2 ]
Liu, Kesheng [2 ]
Affiliations
[1] Chongqing Univ Technol, Sch Elect & Elect Engn, Chongqing 400054, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Sch Commun & Informat Engn, Chongqing 400065, Peoples R China
Keywords
Few-shot learning; Object detection; Metric learning; Semantic knowledge; NETWORK;
DOI
10.1016/j.neunet.2024.106156
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Multispectral object detection (MOD), which incorporates additional information from thermal images into object detection (OD) to cope robustly with complex illumination conditions, has garnered significant attention. However, existing MOD methods typically demand a considerable amount of annotated data for training. Inspired by few-shot learning, we propose a novel task, few-shot multispectral object detection (FSMOD), which aims to accomplish MOD using only a few annotated samples per category. Specifically, we first design a cross-modality interaction (CMI) module that leverages different attention mechanisms to exchange information between the visible and thermal modalities during backbone feature extraction. Guided by this interaction process, the detector extracts modality-specific backbone features with better discrimination. To improve the detector's few-shot learning ability, we also design a semantic prototype metric (SPM) loss that integrates semantic knowledge, i.e., word embeddings, into the optimization of the embedding space. Semantic knowledge provides a stable category representation when visual information is insufficient. Extensive experiments on the customized FSMOD dataset demonstrate that the proposed method achieves state-of-the-art performance.
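The abstract names two components: attention-based interaction between visible and thermal backbone features (CMI) and a metric loss built on word-embedding class prototypes (SPM). The sketch below is a minimal, hypothetical PyTorch rendering of these ideas under assumed design choices (mutual channel attention and a cosine-similarity loss); the class and function names are invented for illustration and are not the authors' implementation.

```python
# Minimal, hypothetical sketch of the two ideas named in the abstract.
# All names, the mutual channel-attention design, and the cosine-similarity
# loss formulation are assumptions, not the authors' released code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModalityInteraction(nn.Module):
    """Exchange information between visible (RGB) and thermal (IR) feature maps.

    Modeled here as mutual channel attention: each modality re-weights its
    channels using a global descriptor computed from the other modality.
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp_rgb = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))
        self.mlp_ir = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))

    def forward(self, f_rgb: torch.Tensor, f_ir: torch.Tensor):
        d_rgb = f_rgb.mean(dim=(2, 3))              # (B, C) global descriptor
        d_ir = f_ir.mean(dim=(2, 3))
        # Cross-wise gating: RGB features are modulated by the IR descriptor
        # and vice versa, keeping each branch modality-specific but informed.
        w_rgb = torch.sigmoid(self.mlp_ir(d_ir))[:, :, None, None]
        w_ir = torch.sigmoid(self.mlp_rgb(d_rgb))[:, :, None, None]
        return f_rgb * w_rgb, f_ir * w_ir


def semantic_prototype_metric_loss(roi_feats, labels, word_embeds, proj, tau=0.1):
    """Metric loss pulling RoI features toward word-embedding class prototypes.

    word_embeds: (num_classes, d_text) fixed vectors, e.g. GloVe embeddings.
    proj: learnable nn.Linear mapping the text space into the visual space.
    """
    prototypes = F.normalize(proj(word_embeds), dim=-1)   # (K, d_vis)
    feats = F.normalize(roi_feats, dim=-1)                # (N, d_vis)
    logits = feats @ prototypes.t() / tau                 # scaled cosine similarity
    return F.cross_entropy(logits, labels)


# Usage sketch (shapes are illustrative):
# cmi = CrossModalityInteraction(channels=256)
# f_rgb, f_ir = cmi(f_rgb, f_ir)                          # per backbone stage
# proj = nn.Linear(300, roi_feats.size(-1))               # 300-d word vectors
# loss = semantic_prototype_metric_loss(roi_feats, labels, word_embeds, proj)
```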
Pages: 13
Related Papers
50 records in total
  • [1] Cross-modality interaction for few-shot multispectral object detection with semantic knowledge
    Huang, Lian
    Peng, Zongju
    Chen, Fen
    Dai, Shaosheng
    He, Ziqiang
    Liu, Kesheng
    Neural Networks, 2024, 173
  • [2] Multiple knowledge embedding for few-shot object detection
    Gong, Xiaolin
    Cai, Youpeng
    Wang, Jian
    SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (05) : 2231 - 2240
  • [3] Few-Shot Object Detection via Knowledge Transfer
    Kim, Geonuk
    Jung, Hong-Gyu
    Lee, Seong-Whan
    2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 3564 - 3569
  • [4] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
    Zhu, Chenchen
    Chen, Fangyi
    Ahmed, Uzair
    Shen, Zhiqiang
    Savvides, Marios
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 8778 - 8787
  • [5] Learning transferable cross-modality representations for few-shot hyperspectral and LiDAR collaborative classification
    Dai, Mofan
    Xing, Shuai
    Xu, Qing
    Wang, Hanyun
    Li, Pengcheng
    Sun, Yifan
    Pan, Jiechen
    Li, Yuqiong
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2024, 126
  • [6] Meta-hallucinator: Towards Few-Shot Cross-Modality Cardiac Image Segmentation
    Zhao, Ziyuan
    Zhou, Fangcheng
    Zeng, Zeng
    Guan, Cuntai
    Zhou, S. Kevin
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT V, 2022, 13435 : 128 - 139
  • [7] Few-shot object detection with semantic enhancement and semantic prototype contrastive learning
    Huang, Lian
    Dai, Shaosheng
    He, Ziqiang
    KNOWLEDGE-BASED SYSTEMS, 2022, 252
  • [8] Incremental few-shot object detection via knowledge transfer
    Feng, Hangtao
    Zhang, Lu
    Yang, Xu
    Liu, Zhiyong
    PATTERN RECOGNITION LETTERS, 2022, 156 : 67 - 73
  • [9] Exploring Effective Knowledge Transfer for Few-shot Object Detection
    Zhao, Zhiyuan
    Liu, Qingjie
    Wang, Yunhong
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 6831 - 6839