Building Precision: Efficient Encoder-Decoder Networks for Remote Sensing Based on Aerial RGB and LiDAR Data

Cited by: 0
Authors
Sulaiman, Muhammad [1]
Finnesand, Erik [1]
Farmanbar, Mina [1]
Belbachir, Ahmed Nabil [2]
Rong, Chunming [1,2]
Affiliations
[1] Univ Stavanger, Dept Elect Engn & Comp Sci, N-4021 Stavanger, Norway
[2] NORCE Norwegian Res Ctr, N-5008 Bergen, Norway
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Building precision; deep learning; LiDAR; remote sensing; semantic segmentation; U-Net; context-transfer U-Net; CONVOLUTIONAL NEURAL-NETWORK; EXTRACTION; IMAGES
DOI
10.1109/ACCESS.2024.3391416
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Precise building delineation plays a pivotal role in population data analysis, city management, policy making, and disaster management. Computer vision, particularly deep learning models for semantic segmentation, has proven instrumental in achieving accurate automatic building segmentation in remote sensing applications. However, current state-of-the-art (SOTA) techniques are not optimized for precisely extracting building footprints and, specifically, building boundaries. This deficiency highlights the need to combine Light Detection and Ranging (LiDAR) data with aerial RGB images and streamlined deep learning models for improved precision. This work uses the MapAI dataset, which includes a variety of objects beyond buildings, such as trees, electricity lines, solar panels, vehicles, and roads. These objects show diverse colors and structures, mirroring the rooftops in Denmark and Norway. To address these problems, this study modifies UNet and CT-UNet to segment buildings from LiDAR data and RGB images. Intersection over Union (IoU) is used to evaluate building overlap, and Boundary Intersection over Union (BIoU) is used to evaluate the precision of building boundaries and shapes. The proposed work changes the configuration of these networks to accommodate LiDAR data for efficient segmentation. Training batches are augmented to improve model generalization and reduce overfitting; including batch normalization also reduces overfitting. Four backbones with transfer learning are employed to enhance convergence and parameter efficiency of segmentation: ResNet50V2, DenseNet201, EfficientNetB4, and EfficientNetV2S. Test-Time Augmentation (TTA) is employed to improve the predicted mask. Experiments are performed using single and ensemble models, with and without augmentation. The ensemble model outperforms the single model, and TTA further improves the results. Adding LiDAR data to RGB improves the combined score (the average of IoU and BIoU) by 13.33% compared to using RGB images alone.
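The abstract scores predictions by IoU, BIoU, and their average. The following is a minimal sketch of how such a combined score could be computed for binary masks. It is not the authors' evaluation code: the function names, the `width` parameter, and the choice of a 3x3 erosion to extract the boundary band are assumptions made for illustration.

```python
import numpy as np

def iou(pred, target):
    """Intersection over Union of two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match (assumption)
    return np.logical_and(pred, target).sum() / union

def erode(mask, iterations=1):
    """Binary erosion with a 3x3 square; pixels outside the image count as background."""
    out = mask.astype(bool)
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        result = np.ones_like(out)
        for dy in range(3):
            for dx in range(3):
                result &= padded[dy:dy + h, dx:dx + w]
        out = result
    return out

def boundary(mask, width=1):
    """Band of mask pixels lying within `width` erosion steps of the mask edge."""
    mask = mask.astype(bool)
    return mask & ~erode(mask, width)

def boundary_iou(pred, target, width=1):
    """IoU restricted to the boundary bands of both masks."""
    return iou(boundary(pred, width), boundary(target, width))

def combined_score(pred, target, width=1):
    """Average of IoU and BIoU, as described in the abstract."""
    return 0.5 * (iou(pred, target) + boundary_iou(pred, target, width))

# Usage: a 6x6 building and a prediction shifted one pixel to the right.
target = np.zeros((10, 10), dtype=bool)
target[2:8, 2:8] = True
pred = np.zeros((10, 10), dtype=bool)
pred[2:8, 3:9] = True
print(iou(pred, target), boundary_iou(pred, target), combined_score(pred, target))
```

In this toy case the one-pixel shift lowers the boundary score more than the overlap score, which illustrates why a boundary-focused metric is paired with plain IoU when boundary precision matters.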
Pages: 60329-60346
Page count: 18