Multi-label remote sensing classification with self-supervised gated multi-modal transformers

Cited by: 1
Authors
Liu, Na [1]
Yuan, Ye [1]
Wu, Guodong [2]
Zhang, Sai [2]
Leng, Jie [2]
Wan, Lihong [2]
Affiliations
[1] Univ Shanghai Sci & Technol, Inst Machine Intelligence, Shanghai, Peoples R China
[2] Origin Dynam Intelligent Robot Co Ltd, Zhengzhou, Peoples R China
Keywords
self-supervised learning; pre-training; vision transformer; multi-modal; gated units; BENCHMARK-ARCHIVE; LARGE-SCALE; BIGEARTHNET
DOI
10.3389/fncom.2024.1404623
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Introduction: Following the success of Transformers in machine learning, they are attracting growing interest in remote sensing (RS). However, RS research has been hampered by the lack of large labeled datasets and by inconsistent data modalities arising from the diversity of RS platforms. With the rise of self-supervised learning (SSL) algorithms in recent years, RS researchers have begun to apply the "pre-training and fine-tuning" paradigm to RS. Even so, there has been little research on multi-modal data fusion in RS: most studies use only one modality or simply concatenate data from multiple modalities.

Method: To study a more efficient multi-modal data fusion scheme, we propose a multi-modal fusion mechanism based on gated unit control (MGSViT). We pre-train a ViT model on the BigEarthNet dataset by combining two commonly used SSL algorithms, and propose an intra-modal and inter-modal gated fusion unit for feature learning that combines multispectral (MS) and synthetic aperture radar (SAR) data. Our method can effectively combine data from different modalities to extract key feature information.

Results and discussion: In fine-tuning and comparison experiments, our method outperforms state-of-the-art algorithms on all downstream classification tasks, which verifies the effectiveness of the proposed method.
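The abstract does not give the form of the gated fusion unit, so the following is only a minimal sketch of one common way a gate can blend two modality features, not the MGSViT implementation. It assumes a sigmoid gate computed from the concatenated MS and SAR feature vectors; the function `gated_fusion`, the parameters `w_gate` and `b_gate`, and the feature size are hypothetical names and values chosen for illustration. It shows inter-modal blending only.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(ms_feat, sar_feat, w_gate, b_gate):
    """Blend two equal-length feature vectors with an element-wise gate.

    Assumed (hypothetical) form, not taken from the paper:
        g = sigmoid(W [ms; sar] + b)
        fused = g * ms + (1 - g) * sar
    """
    concat = ms_feat + sar_feat  # list concatenation, length 2 * d
    fused = []
    for j in range(len(ms_feat)):
        # Pre-activation of gate j from both modalities.
        pre = b_gate[j] + sum(w * x for w, x in zip(w_gate[j], concat))
        g = sigmoid(pre)
        # Convex combination: g near 1 favours MS, g near 0 favours SAR.
        fused.append(g * ms_feat[j] + (1.0 - g) * sar_feat[j])
    return fused


# Toy usage with random stand-ins for encoder outputs.
random.seed(0)
d = 4
ms = [random.gauss(0.0, 1.0) for _ in range(d)]
sar = [random.gauss(0.0, 1.0) for _ in range(d)]
w = [[random.gauss(0.0, 0.1) for _ in range(2 * d)] for _ in range(d)]
b = [0.0] * d
fused = gated_fusion(ms, sar, w, b)
print(fused)
```

Because the gate lies strictly between 0 and 1, each fused element stays between the corresponding MS and SAR values, in contrast to plain concatenation, which passes both modalities through unweighted.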
Pages: 15