MCFT: Multimodal Contrastive Fusion Transformer for Classification of Hyperspectral Image and LiDAR Data

Cited by: 0
Authors
Feng, Yining [1]
Jin, Jiarui [2]
Yin, Yin [2]
Song, Chuanming [3]
Wang, Xianghai [1, 2]
Affiliations
[1] Liaoning Normal Univ, Sch Geog, Dalian 116029, Peoples R China
[2] Liaoning Normal Univ, Sch Comp Sci & Artificial Intelligence, Dalian 116029, Peoples R China
[3] Dalian Univ, Sch Informat Engn, Dalian 116622, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Transformers; Laser radar; Data mining; Convolutional neural networks; Computer vision; Accuracy; Head; Electronic mail; Data models; Contrastive learning; deep learning (DL); feature alignment; feature matching; HS-LiDAR fusion and classification; vision transformer (ViT);
DOI
10.1109/TGRS.2024.3490752
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification code
0708; 070902
Abstract
Multisource remote sensing (RS) image fusion leverages data from various sensors to enhance the accuracy and comprehensiveness of Earth observation. Notably, the fusion of hyperspectral (HS) images and light detection and ranging (LiDAR) data has garnered significant attention due to their complementary features. However, current methods predominantly rely on simple techniques such as weight sharing, feature superposition, or feature products, which accumulate features rather than integrate them and therefore often fall short of true feature fusion. The transformer framework, with its self-attention mechanism, offers potential for effective multimodal data fusion, but the simple linear transformations it uses for feature extraction may not adequately capture all relevant information. To address these challenges, we propose a novel multimodal contrastive fusion transformer (MCFT). Our approach employs convolutional neural networks (CNNs) to extract features from each modality and leverages transformer networks for advanced fusion. We modify the basic transformer architecture and propose a double position embedding mode to make it more suitable for RS image processing tasks. We also introduce two novel modules, a feature alignment module and a feature matching module, designed to exploit both paired and unpaired samples. These modules facilitate more effective cross-modal learning by emphasizing the commonalities within the same features and the differences between features from distinct modalities. Experimental evaluations on several publicly available HS-LiDAR datasets demonstrate that the proposed method consistently outperforms existing advanced methods. The source code for our approach is available at: https://github.com/SYFYN0317/MCFT.
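The abstract does not give the loss used by the feature alignment module, so the following is only an illustrative sketch of one common way to align paired cross-modal features: a symmetric InfoNCE-style contrastive loss. It is not taken from the MCFT paper or its repository. The function names (`symmetric_info_nce`, `l2_normalize`, `log_softmax`), the `temperature` value, and the random stand-in features are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def symmetric_info_nce(hs_feats, lidar_feats, temperature=0.1):
    """Hypothetical alignment loss (assumption, not the authors' formulation).

    Row i of hs_feats and row i of lidar_feats are treated as a paired
    (positive) sample; every other row in the batch acts as an unpaired
    (negative) sample.
    """
    hs = l2_normalize(hs_feats)
    lidar = l2_normalize(lidar_feats)
    logits = hs @ lidar.T / temperature  # (batch, batch) similarity matrix
    idx = np.arange(len(hs))
    # Cross-entropy with the diagonal as the target, in both directions.
    loss_hs_to_lidar = -log_softmax(logits)[idx, idx].mean()
    loss_lidar_to_hs = -log_softmax(logits.T)[idx, idx].mean()
    return 0.5 * (loss_hs_to_lidar + loss_lidar_to_hs)


# Random stand-ins for per-modality CNN embeddings (illustrative only).
rng = np.random.default_rng(0)
hs_feats = rng.normal(size=(8, 16))
aligned_lidar = hs_feats + 0.01 * rng.normal(size=(8, 16))  # nearly matching pairs
random_lidar = rng.normal(size=(8, 16))                     # unrelated features

loss_aligned = symmetric_info_nce(hs_feats, aligned_lidar)
loss_random = symmetric_info_nce(hs_feats, random_lidar)
print(f"aligned pairs: {loss_aligned:.4f}, random pairs: {loss_random:.4f}")
```

In this sketch the loss is lower when paired HS and LiDAR embeddings agree than when they are unrelated, which is the general behavior a contrastive alignment objective is meant to encourage. The linked repository is the authoritative source for the actual modules.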
Pages: 17