A Crossmodal Multiscale Fusion Network for Semantic Segmentation of Remote Sensing Data

Cited by: 32
Authors
Ma, Xianping [1]
Zhang, Xiaokang [1, 2]
Pun, Man-On [1]
Affiliations
[1] Chinese Univ Hong Kong, Sch Sci & Engn, Shenzhen 518172, Peoples R China
[2] Univ Sci & Technol China, Sch Math Sci, Hefei 230026, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Transformers; Remote sensing; Semantics; Image segmentation; Feature extraction; Fuses; Decoding; Combined squeeze-and-excitation (CSE); cross attention; crossmodal multiscale fusion; transformer;
DOI
10.1109/JSTARS.2022.3165005
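Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract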
Driven by the rapid development of Earth observation sensors, semantic segmentation using multimodal fusion of remote sensing data has drawn substantial research attention in recent years. However, existing multimodal fusion methods based on convolutional neural networks cannot capture long-range dependencies across multiscale feature maps of remote sensing data in different modalities. To circumvent this problem, this work proposes a crossmodal multiscale fusion network (CMFNet) by exploiting the transformer architecture. In contrast to the conventional early, late, or hybrid fusion networks, the proposed CMFNet fuses information of different modalities at multiple scales using the cross-attention mechanism. More specifically, the CMFNet utilizes a novel cross-modal attention architecture to fuse multiscale convolutional feature maps of optical remote sensing images and digital surface model data through a crossmodal multiscale transformer (CMTrans) and a multiscale context augmented transformer (MCATrans). The CMTrans can effectively model long-range dependencies across multiscale feature maps derived from multimodal data, while the MCATrans can learn discriminative integrated representations for semantic segmentation. Extensive experiments on two large-scale fine-resolution remote sensing datasets, namely ISPRS Vaihingen and Potsdam, confirm the excellent performance of the proposed CMFNet as compared to other multimodal fusion methods.
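The cross-attention fusion idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the CMFNet, CMTrans, or MCATrans implementation: it assumes a single attention head, random projection weights, and toy feature sizes, and every name in it (`cross_attention`, `optical`, `dsm`, `w_q`, `w_k`, `w_v`) is hypothetical. It only shows how tokens from one modality can query tokens from another modality taken at a different scale.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product cross attention (illustrative only).

    query_tokens:   (n_q, d) tokens from one modality
    context_tokens: (n_c, d) tokens from the other modality
    """
    q = query_tokens @ w_q
    k = context_tokens @ w_k
    v = context_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n_q, n_c)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
d = 8  # assumed channel dimension

# Assumed toy inputs: a 4x4 optical feature map and a coarser 2x2 DSM
# feature map, each flattened into a sequence of tokens.
optical = rng.normal(size=(16, d))
dsm = rng.normal(size=(4, d))

# Random projection matrices stand in for learned weights.
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

attended, weights = cross_attention(optical, dsm, w_q, w_k, w_v)

# Residual connection: optical tokens enriched with DSM context.
fused = optical + attended
print(fused.shape, weights.shape)
```

Because every optical token attends to every DSM token, the fused representation can draw on distant locations and on a different scale in a single step, which is the kind of long-range, cross-scale dependency the abstract says convolution-based fusion cannot capture.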
Pages: 3463-3474
Page count: 12
Related papers
50 in total
  • [1] MFAFNet: A Multiscale Fully Attention Fusion Network for Remote Sensing Image Semantic Segmentation
    Dang, Yuanyuan
    Gao, Yu
    Liu, Bing
    [J]. IEEE ACCESS, 2024, 12 : 123388 - 123400
  • [2] MCAFNet: A Multiscale Channel Attention Fusion Network for Semantic Segmentation of Remote Sensing Images
    Yuan, Min
    Ren, Dingbang
    Feng, Qisheng
    Wang, Zhaobin
    Dong, Yongkang
    Lu, Fuxiang
    Wu, Xiaolin
    [J]. REMOTE SENSING, 2023, 15 (02)
  • [3] CMTFNet: CNN and Multiscale Transformer Fusion Network for Remote-Sensing Image Semantic Segmentation
    Wu, Honglin
    Huang, Peng
    Zhang, Min
    Tang, Wenlong
    Yu, Xinyu
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [4] MSFANet: Multiscale Fusion Attention Network for Road Segmentation of Multispectral Remote Sensing Data
    Tong, Zhonggui
    Li, Yuxia
    Zhang, Jinglin
    He, Lei
    Gong, Yushu
    [J]. REMOTE SENSING, 2023, 15 (08)
  • [5] Semantic Segmentation of Remote Sensing Images Using Multiscale Decoding Network
    Zhang, Xiaoqin
    Xiao, Zhiheng
    Li, Dongyang
    Fan, Mingyu
    Zhao, Li
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2019, 16 (09) : 1492 - 1496
  • [6] STAIR FUSION NETWORK FOR REMOTE SENSING IMAGE SEMANTIC SEGMENTATION
    Hua, Wenyi
    Liu, Jia
    Liu, Fang
    Zhang, Wenhua
    An, Jiaqi
    [J]. IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 5499 - 5502
  • [7] TCNet: Multiscale Fusion of Transformer and CNN for Semantic Segmentation of Remote Sensing Images
    Xiang, Xuyang
    Gong, Wenping
    Li, Shuailong
    Chen, Jun
    Ren, Tianhe
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17 : 3123 - 3136
  • [8] Remote Sensing Image Semantic Segmentation Method Based on a Deep Convolutional Neural Network and Multiscale Feature Fusion
    Zhang, Guangzhen
    Jiang, Wangyang
    [J]. INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2023, 19 (01)
  • [9] Semisupervised Multiscale Generative Adversarial Network for Semantic Segmentation of Remote Sensing Image
    Wang, Jiaqi
    Liu, Bing
    Zhou, Yong
    Zhao, Jiaqi
    Xia, Shixiong
    Yang, Yuancan
    Zhang, Man
    Ming, Liu Ming
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [10] MDSNet: a multiscale decoupled supervision network for semantic segmentation of remote sensing images
    Feng, Jiangfan
    Chen, Panyu
    Gu, Zhujun
    Zeng, Maimai
    Zheng, Wei
    [J]. INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2023, 16 (01) : 2844 - 2861