Dense-scale dynamic network with filter-varying atrous convolution for semantic segmentation

Cited by: 3

Authors:

Li, Zhiqiang [1,2,3,4]

Jiang, Jie [1,2]

Chen, Xi [1,2,3,4]

Laganiere, Robert [5]

Li, Qingli [4]

Liu, Min [1,2]

Qi, Honggang [6]

Wang, Yong [7]

Zhang, Min [8]

Affiliations:

[1] East China Normal Univ, Sch Geog Sci, Shanghai 200241, Peoples R China

[2] East China Normal Univ, Minist Educ China, Key Lab Geog Informat Sci, Shanghai 200241, Peoples R China

[3] East China Normal Univ, Key Lab Spatial Temporal Big Data Anal & Applicat, Minist Nat Resources, Shanghai 200241, Peoples R China

[4] East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, Shanghai 200241, Peoples R China

[5] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada

[6] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China

[7] Sun Yat Sen Univ, Sch Aeronaut & Astronaut, Guangzhou 518107, Peoples R China

[8] Engn Univ PAP, Xian 710086, Peoples R China

Source:

APPLIED INTELLIGENCE | 2023 / Vol. 53 / Issue 22

Funding:

National Natural Science Foundation of China;

Keywords:

Semantic segmentation; Deep learning; Deep convolution neural networks (DCNNs); Dynamic convolution;

DOI:

10.1007/s10489-023-04935-4

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep convolution neural networks (DCNNs) have been widely used in semantic segmentation. However, the filters of most regular convolutions in DCNNs are spatially invariant to local transformations, which reduces localization accuracy and hinders further improvement of semantic segmentation. Dynamic convolution with pixel-level filters can enhance localization accuracy through its region-awareness, but it is sensitive to objects with large variations in scale. To address both the low localization accuracy and the large scale variations of objects, we propose a filter-varying atrous convolution (FAC) that efficiently enlarges the per-pixel receptive fields pertaining to various objects. FAC mainly consists of a conditional-filter-generating network (CFGN) and a dynamic local filtering operation (DLFO). In the CFGN, a class probability map is used to generate the corresponding filters, making the FAC genuinely dynamic. In the DLFO, the position-by-position sliding convolution is replaced with a single dot-product operation, which greatly improves the efficiency of the algorithm. In addition, a dense scale module (DSM) is constructed to generate denser scales and larger receptive fields for exploring long-range contextual information. Finally, a dense-scale dynamic network (DsDNet) simultaneously enhances localization accuracy and reduces the effect of large scale variations of objects by assigning FAC to different spatial locations at dense scales. To accelerate network convergence and improve segmentation accuracy, the network employs two pixel-wise cross-entropy loss functions: one between the backbone and the DSM, and the other at the end of the network. Extensive experiments on the Cityscapes, PASCAL VOC 2012, and ADE20K datasets verify that DsDNet outperforms non-dynamic and multi-scale convolution neural networks.
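The abstract describes FAC only at a high level, so the following is a minimal NumPy sketch of the general idea rather than the authors' implementation. It assumes a single k x k filter per pixel shared across channels, zero padding, and a hypothetical linear projection (`proj`) standing in for the CFGN; the function names `unfold_dilated`, `generate_filters`, and `dynamic_local_filtering` are illustrative and do not come from the paper. It shows how gathering the dilated neighbours of every pixel once lets the per-pixel filtering be computed as a single dot product instead of a sliding window.

```python
import numpy as np

def unfold_dilated(x, k=3, dilation=2):
    """Gather the k*k dilated (atrous) neighbours of every pixel.

    x: (C, H, W) feature map. Returns (C, k*k, H, W), zero padded at borders.
    """
    c, h, w = x.shape
    pad = dilation * (k // 2)
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    taps = []
    for i in range(k):
        for j in range(k):
            dy, dx = i * dilation, j * dilation
            taps.append(xp[:, dy:dy + h, dx:dx + w])
    return np.stack(taps, axis=1)

def generate_filters(class_probs, proj):
    """Hypothetical stand-in for the CFGN (an assumption, not the paper's design).

    class_probs: (K, H, W) class probability map.
    proj: (k*k, K) projection matrix.
    Returns per-pixel filters of shape (k*k, H, W).
    """
    return np.einsum("nk,khw->nhw", proj, class_probs)

def dynamic_local_filtering(x, filters, k=3, dilation=2):
    """Apply a different k*k filter at every pixel with one dot product.

    x: (C, H, W); filters: (k*k, H, W), shared across channels (assumption).
    """
    patches = unfold_dilated(x, k, dilation)            # (C, k*k, H, W)
    return np.einsum("cnhw,nhw->chw", patches, filters)  # sum over the k*k taps

# Usage: 4 classes, 2 feature channels, 5x5 map, 3x3 filters with dilation 2.
rng = np.random.default_rng(0)
features = rng.normal(size=(2, 5, 5))
logits = rng.normal(size=(4, 5, 5))
probs = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
filters = generate_filters(probs, rng.normal(size=(9, 4)))
out = dynamic_local_filtering(features, filters, k=3, dilation=2)
```

Because the filters depend on the class probability map, two pixels with different predicted classes are filtered differently, while the dilation enlarges the receptive field without adding filter weights.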


Pages: 26810 - 26826

Number of pages: 17
