Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification

Cited by: 4
Authors
Pan, Wenjie [1 ]
Huang, Linhan [1 ]
Liang, Jianbao [1 ]
Hong, Lan [1 ]
Zhu, Jianqing [1 ,2 ]
Affiliations
[1] Huaqiao Univ, Coll Engn, Quanzhou 362021, Peoples R China
[2] Xiamen Yealink Network Technol Co Ltd, 666 Huan Rd, High Tech Pk, Xiamen 361015, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
multi-modal image; transformer; vehicle re-identification;
DOI
10.3390/s23094206
Chinese Library Classification (CLC)
O65 [Analytical Chemistry];
Discipline codes
070302 ; 081704 ;
Abstract
Multi-modal (i.e., visible, near-infrared, and thermal-infrared) vehicle re-identification has good potential for searching vehicles of interest under low illumination. However, because different modalities have different imaging characteristics, proper fusion of multi-modal complementary information is crucial to multi-modal vehicle re-identification. To that end, this paper proposes a progressively hybrid transformer (PHT). The PHT method consists of two components: random hybrid augmentation (RHA) and a feature hybrid mechanism (FHM). For RHA, an image random cropper and a local region hybrider are designed. The image random cropper simultaneously crops multi-modal images at random positions, with random numbers, sizes, and aspect ratios, to generate local regions. The local region hybrider fuses the cropped regions so that the regions of each modality carry local structural characteristics of all modalities, mitigating modal differences at the beginning of feature learning. For the FHM, a modal-specific controller and a modal information embedding are designed to effectively fuse multi-modal information at the feature level. Experimental results show that the proposed method outperforms the state-of-the-art method by 2.7% mAP on RGBNT100 and by 6.6% mAP on RGBN300, demonstrating that it learns multi-modal complementary information effectively.
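The RHA procedure described in the abstract (crop regions at shared random positions, numbers, sizes, and aspect ratios, then mix the cropped regions across modalities) can be illustrated with a minimal sketch. This is an assumption-laden reconstruction, not the authors' implementation: the function name, the cyclic region-swapping rule, and the parameter ranges are all hypothetical, chosen only to show the shape of the idea on NumPy image arrays.

```python
import numpy as np

def random_hybrid_augmentation(modal_images, num_regions_range=(1, 4),
                               size_range=(0.1, 0.3), rng=None):
    """Hypothetical sketch of RHA: crop regions at shared random positions
    across all modalities, then swap each region's content among modalities
    so every modal image carries local structure from the others.

    modal_images: list of HxWxC arrays, one per modality (e.g. visible,
                  near-infrared, thermal-infrared), all the same size.
    """
    rng = rng or np.random.default_rng()
    imgs = [img.copy() for img in modal_images]  # leave inputs untouched
    h, w = imgs[0].shape[:2]
    # random number of regions (the paper's "random numbers" of crops)
    n = rng.integers(num_regions_range[0], num_regions_range[1] + 1)
    for _ in range(n):
        # random size and aspect ratio for this region
        rh = max(1, int(h * rng.uniform(*size_range)))
        rw = max(1, int(w * rng.uniform(*size_range)))
        # random position, shared by all modalities
        y = rng.integers(0, h - rh + 1)
        x = rng.integers(0, w - rw + 1)
        # cyclically shift the cropped regions among modalities
        patches = [img[y:y + rh, x:x + rw].copy() for img in imgs]
        for i, img in enumerate(imgs):
            img[y:y + rh, x:x + rw] = patches[(i + 1) % len(imgs)]
    return imgs
```

The cyclic shift is one plausible hybridization rule; the paper's local region hybrider may mix regions differently, but any choice that places patches from other modalities into each image achieves the stated goal of reducing modal differences before feature learning.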
Pages: 16