Building Precision: Efficient Encoder-Decoder Networks for Remote Sensing Based on Aerial RGB and LiDAR Data

Cited by: 0

Authors:

Sulaiman, Muhammad [1]

Finnesand, Erik [1]

Farmanbar, Mina [1]

Belbachir, Ahmed Nabil [2]

Rong, Chunming [1, 2]

Affiliations:

[1] Univ Stavanger, Dept Elect Engn & Comp Sci, N-4021 Stavanger, Norway

[2] NORCE Norwegian Res Ctr, N-5008 Bergen, Norway

Source:

IEEE ACCESS | 2024 / Volume 12

Keywords:

Building precision; deep learning; LiDAR; remote sensing; semantic segmentation; U-Net; context-transfer U-Net; CONVOLUTIONAL NEURAL-NETWORK; EXTRACTION; IMAGES;

DOI:

10.1109/ACCESS.2024.3391416

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Precise building delineation plays a pivotal role in population data analysis, city management, policy making, and disaster management. Computer vision technologies, particularly deep learning models for semantic segmentation, have proven instrumental in achieving accurate automatic building segmentation in remote sensing applications. However, current state-of-the-art (SOTA) techniques are not optimized for precisely extracting building footprints and, specifically, building boundaries. This deficiency highlights the need to leverage Light Detection and Ranging (LiDAR) data in conjunction with aerial RGB images and streamlined deep learning for improved precision. This work utilizes the MapAI dataset, which includes a variety of objects beyond buildings, such as trees, electricity lines, solar panels, vehicles, and roads. These objects showcase diverse colors and structures, mirroring the rooftops in Denmark and Norway. To address these problems, this study modifies UNet and CT-UNet to segment buildings from LiDAR data and RGB images, using Intersection over Union (IoU) to evaluate building overlap and Boundary Intersection over Union (BIoU) to evaluate the precision of building boundaries and shapes. The proposed work changes the configuration of these networks to accommodate LiDAR data for efficient segmentation. The training batches are augmented to improve model generalization and reduce overfitting. Including batch normalization also reduces overfitting. Four backbones with transfer learning are employed to enhance the convergence and parameter efficiency of segmentation: ResNet50V2, DenseNet201, EfficientNetB4, and EfficientNetV2S. Test-Time Augmentation (TTA) is employed to improve the predicted mask. Experiments are performed using single and ensemble models, with and without augmentation. The ensemble model outperforms the single model, and TTA also improves the results. Combining LiDAR data with RGB improves the combined score (the average of IoU and BIoU) by 13.33% compared with using RGB images alone.
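The two metrics named in the abstract, and the combined score defined there as their average, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the `Mask` type, and the toy masks are hypothetical, and the boundary is simplified to a one-pixel band (foreground pixels that touch background or the image edge), whereas a real evaluation would likely use a wider band scaled to the image size.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = building, 0 = background


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over Union of two equally sized binary masks."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly by convention (an assumption).
    return inter / union if union else 1.0


def boundary(mask: Mask) -> Mask:
    """Keep foreground pixels that touch background or the image edge.

    Assumption: a one-pixel band with 4-connectivity stands in for the
    distance-based band used by full Boundary IoU implementations.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < h and 0 <= nx < w)
                if outside or not mask[ny][nx]:
                    out[y][x] = 1
                    break
    return out


def boundary_iou(pred: Mask, truth: Mask) -> float:
    """IoU restricted to the boundary bands of both masks."""
    return iou(boundary(pred), boundary(truth))


def combined_score(pred: Mask, truth: Mask) -> float:
    """Average of IoU and BIoU, as the abstract defines the combined score."""
    return (iou(pred, truth) + boundary_iou(pred, truth)) / 2


# Toy example: a 3x3 building and a prediction shifted one pixel to the right.
truth = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
]

print(iou(pred, truth), boundary_iou(pred, truth), combined_score(pred, truth))
```

In this toy case the boundary score is lower than the plain IoU, which shows why a boundary-focused metric is more sensitive to misplaced building outlines than region overlap alone.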


Pages: 60329-60346

Page count: 18

