Deep Multimodal Fusion Network for Semantic Segmentation Using Remote Sensing Image and LiDAR Data

Cited by: 39
Authors
Sun, Yangjie [1]
Fu, Zhongliang [1]
Sun, Chuanxia [2]
Hu, Yinglei [2]
Zhang, Shengyuan [1]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
[2] Highway Adm Bur, Henan Transportat Dept, Zhengzhou 450046, Peoples R China
Keywords
Semantics; Image segmentation; Laser radar; Sensors; Task analysis; Sun; Feature extraction; Aerial images; attention mechanism; convolutional neural network (CNN); multimodal fusion; semantic labeling; CLASSIFICATION; RGB;
DOI
10.1109/TGRS.2021.3108352
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Extracting semantic information from very-high-resolution (VHR) aerial images is a prominent topic in Earth observation research. A growing number of sensor platforms are appearing in remote sensing, each of which can provide multimodal supplemental or enhanced information, such as optical images, light detection and ranging (LiDAR) point clouds, infrared images, or inertial measurement unit (IMU) data. However, current deep networks for LiDAR and VHR images have not fully exploited the potential of multimodal data. The stacked multimodal fusion network (MFNet) ignores the structural differences between the modalities and the manual statistical characteristics within the modalities. For multimodal remote sensing data and its corresponding carefully designed handcrafted features, we designed a novel deep MFNet that can use multimodal VHR aerial images and LiDAR data and the corresponding intramodal features, such as LiDAR-derived features [slope and normalized digital surface model (NDSM)] and imagery-derived features [infrared-red-green (IRRG), normalized difference vegetation index (NDVI), and difference of Gaussian (DoG)]. Technically, we introduce the attention mechanism and multimodal learning to adaptively fuse intermodal and intramodal features. Specifically, we designed a multimodal fusion mechanism, pyramid dilation blocks, and a multilevel feature fusion module. Through these modules, our network achieves adaptive fusion of multimodal features, enlarges the receptive field, and enhances global-to-local contextual fusion. Moreover, we used a multiscale supervision training scheme to optimize the network. Extensive experimental results and ablation studies on the ISPRS semantic dataset and the IEEE GRSS DFC Zeebrugge dataset show the effectiveness of our proposed MFNet.
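The handcrafted intramodal features named in the abstract (NDVI and DoG from the imagery, slope from the NDSM) can be illustrated with a minimal NumPy sketch. This is not the authors' preprocessing code: the function names, the IRRG channel order (infrared, red, green), the Gaussian sigmas, and the raster cell size are all assumptions chosen for illustration.

```python
import numpy as np


def ndvi(nir, red, eps=1e-8):
    """Normalized difference vegetation index: (NIR - R) / (NIR + R)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + eps)  # eps avoids division by zero


def gaussian_blur(img, sigma):
    """Separable Gaussian blur with reflect padding (NumPy only)."""
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img.astype(np.float64), radius, mode="reflect")
    # Convolve along rows, then along columns; "valid" undoes the padding.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def difference_of_gaussians(img, sigma_small=1.0, sigma_large=1.6):
    """DoG: fine-scale blur minus coarse-scale blur (sigmas are assumed)."""
    return gaussian_blur(img, sigma_small) - gaussian_blur(img, sigma_large)


def slope_degrees(height, cell_size=1.0):
    """Terrain slope in degrees from a height raster such as an NDSM."""
    dz_dy, dz_dx = np.gradient(height.astype(np.float64), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


def stack_features(irrg, ndsm, cell_size=1.0):
    """Build per-modality feature stacks (assumed channel order: IR, R, G)."""
    nir, red = irrg[..., 0], irrg[..., 1]
    gray = irrg.mean(axis=-1)
    image_feats = np.dstack([irrg, ndvi(nir, red), difference_of_gaussians(gray)])
    lidar_feats = np.dstack([ndsm, slope_degrees(ndsm, cell_size)])
    return image_feats, lidar_feats
```

With an IRRG tile of shape (H, W, 3) and an NDSM of shape (H, W), `stack_features` returns an (H, W, 5) imagery stack and an (H, W, 2) LiDAR stack. Keeping the two stacks separate mirrors the abstract's distinction between intramodal and intermodal features; how the network actually fuses them is not specified in this record.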
Pages: 18
Related papers
50 in total
  • [1] STAIR FUSION NETWORK FOR REMOTE SENSING IMAGE SEMANTIC SEGMENTATION
    Hua, Wenyi
    Liu, Jia
    Liu, Fang
    Zhang, Wenhua
    An, Jiaqi
    [J]. IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 5499 - 5502
  • [2] AFNet: Adaptive Fusion Network for Remote Sensing Image Semantic Segmentation
    Liu, Rui
    Mi, Li
    Chen, Zhenzhong
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2021, 59 (09): : 7871 - 7886
  • [3] A Crossmodal Multiscale Fusion Network for Semantic Segmentation of Remote Sensing Data
    Ma, Xianping
    Zhang, Xiaokang
    Pun, Man-On
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2022, 15 : 3463 - 3474
  • [4] Deep multimodal fusion for semantic image segmentation: A survey
    Zhang, Yifei
    Sidibe, Desire
    Morel, Olivier
    Meriaudeau, Fabrice
    [J]. IMAGE AND VISION COMPUTING, 2021, 105
  • [5] A Multilevel Multimodal Fusion Transformer for Remote Sensing Semantic Segmentation
    Ma, Xianping
    Zhang, Xiaokang
    Pun, Man-On
    Liu, Ming
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 15
  • [6] Semantic Segmentation of Remote Sensing Images Using Multiway Fusion Network
    Wu, Xiaosuo
    Wang, Liling
    Wu, Chaoyang
    Guo, Cunge
    Yan, Haowen
    Qiao, Ze
    [J]. SIGNAL PROCESSING, 2024, 215
  • [7] Multi-feature Map Pyramid Fusion Deep Network for Semantic Segmentation on Remote Sensing Data
    Zhao Fei
    Zhang Wenkai
    Yan Zhiyuan
    Yu Hongfeng
    Diao Wenhui
    [J]. JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2019, 41 (10) : 2525 - 2531
  • [8] Category-Wise Fusion and Enhancement Learning for Multimodal Remote Sensing Image Semantic Segmentation
    Zheng, Aihua
    He, Jinbo
    Wang, Ming
    Li, Chenglong
    Luo, Bin
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [9] Remote Sensing Image Semantic Segmentation Method Based on a Deep Convolutional Neural Network and Multiscale Feature Fusion
    Zhang, Guangzhen
    Jiang, Wangyang
    [J]. INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2023, 19 (01)
  • [10] MFVNet: a deep adaptive fusion network with multiple field-of-views for remote sensing image semantic segmentation
    Yansheng LI
    Wei CHEN
    Xin HUANG
    Zhi GAO
    Siwei LI
    Tao HE
    Yongjun ZHANG
    [J]. Science China(Information Sciences), 2023, 66 (04) : 93 - 106