Research on Building Extraction based on Neural Network with Feature Enhancement and ELU Activation Function

Cited by: 0

Authors:

Tang Y. ^{[1,2,3,4]}

Liu Z. ^{[1]}

Yang Y. ^{[1]}

Gu H. ^{[1]}

Yang S. ^{[2,3,4]}

Affiliations:

[1] Institute of Photogrammetry and Remote Sensing, Chinese Academy of Surveying and Mapping, Beijing

[2] Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou

[3] National-Local Joint Engineering Research Center of Technologies and Applications for National Geographic State Monitoring, Lanzhou

[4] Gansu Provincial Engineering Laboratory for National Geographic State Monitoring, Lanzhou

Source:

Journal of Geo-Information Science | 2021 / Vol. 23 / No. 04

Funding:

National Natural Science Foundation of China

Keywords:

Building extraction; Convolutional neural network; Deep learning; End-to-end; Exponential Linear Units (ELU); Feature enhancement; Feature Enhancement Network (FE-Net); High-resolution remote sensing image;

DOI:

10.12082/dqxxkx.2021.200130

Abstract:

In recent years, rapid urban development has drawn large numbers of people to work and live in cities, and the number of urban buildings has grown accordingly. Land resources and the urban ecological environment (such as green space) are threatened to some extent. It is therefore urgent to plan urban land resources and space reasonably, prevent illegal construction, improve the urban living environment, and make cities sustainable, orderly, healthy, and green. As high-resolution remote sensing image data become increasingly abundant, accurate building extraction from such images plays an important role in urban planning, urban management, and change detection of urban buildings. Based on the U-Net network model and using the Massachusetts building dataset, this paper explored the network model structure and proposed FE-Net, a network model with an "encoder-feature enhancement-decoder" structure and the ELU activation function. First, U-Net6 was identified as the best basic network model by comparing the building extraction results of U-Net5, U-Net6, and U-Net7, which differ in the number of network layers. Based on U-Net6, the "U-Net6+ReLU+feature enhancement" network model was established by adding the feature enhancement structure. To optimize the activation function, the ReLU activation function was then replaced by the ELU activation function, yielding the FE-Net network model (U-Net6+ELU+feature enhancement). The building extraction results of FE-Net were compared with those of the other two network models (U-Net6+ReLU and U-Net6+ReLU+feature enhancement). The results show that FE-Net had the best building extraction performance. Its relaxed F1-measure reached 97.23%, which was 0.36% and 0.12% higher than those of the other two network models. FE-Net also had the highest extraction accuracy compared with other studies using the same Massachusetts dataset.
The FE-Net network model extracts multi-scale buildings well: it not only extracts small-scale buildings accurately, but also extracts irregularly shaped buildings roughly yet completely, with relatively few missed and false detections. Thus, the FE-Net network model can be used to achieve end-to-end building extraction with high accuracy. © 2021, Science Press. All rights reserved.
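The abstract's change from ReLU to ELU can be illustrated with a minimal sketch of the two activation functions in their standard textbook form. This is not the authors' implementation: the function names are illustrative, and alpha = 1.0 is an assumption, since the abstract does not state the value used in FE-Net.

```python
import math


def relu(x: float) -> float:
    """ReLU: identity for positive inputs, zero otherwise."""
    return x if x > 0.0 else 0.0


def relu_grad(x: float) -> float:
    """Derivative of ReLU: exactly zero for every non-positive input."""
    return 1.0 if x > 0.0 else 0.0


def elu(x: float, alpha: float = 1.0) -> float:
    """ELU: identity for positive inputs, alpha * (exp(x) - 1) otherwise.

    alpha = 1.0 is an assumed default, not a value taken from the abstract.
    """
    return x if x > 0.0 else alpha * math.expm1(x)


def elu_grad(x: float, alpha: float = 1.0) -> float:
    """Derivative of ELU: stays positive for negative inputs."""
    return 1.0 if x > 0.0 else alpha * math.exp(x)


if __name__ == "__main__":
    # Compare both activations and their gradients on a few sample inputs.
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
        print(
            f"x={x:+.1f}  relu={relu(x):.4f}  relu'={relu_grad(x):.4f}  "
            f"elu={elu(x):.4f}  elu'={elu_grad(x):.4f}"
        )
```

For negative inputs ReLU outputs zero with zero gradient, whereas ELU saturates smoothly toward -alpha and keeps a small non-zero gradient. That is the usual motivation for such a swap, although the abstract itself does not say why ELU performed better.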

Citation

Pages: 692 / 709

Page count: 17
