Scale-pyramid dynamic atrous convolution for pixel-level labeling

Cited by: 1

Authors:

Li, Zhiqiang [1,2,3]

Jiang, Jie [1,3]

Chen, Xi [1,2,3]

Zhang, Min [7]

Wang, Yong [4]

Li, Qingli [2]

Qi, Honggang [5]

Liu, Min [1,3]

Laganiere, Robert [6]

Affiliations:

[1] East China Normal Univ, Key Lab Geog Informat Sci, Minist Educ, Shanghai 200241, Peoples R China

[2] East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, Shanghai 200241, Peoples R China

[3] East China Normal Univ, Sch Geog Sci, Shanghai 200241, Peoples R China

[4] Sun Yat Sen Univ, Sch Aeronaut & Astronaut, Shenzhen 518107, Peoples R China

[5] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China

[6] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada

[7] Engn Univ PAP, Xian 710086, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2024 / Vol. 241

Funding:

National Natural Science Foundation of China;

Keywords:

Pixel-level labeling; Deep learning; DCNN; Dynamic convolution; Kernel engineering; NETWORK;

DOI:

10.1016/j.eswa.2023.122695

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

To achieve better performance, most deep convolutional neural networks increase model capacity by adding more convolutional layers or enlarging the filters. Consequently, the computational cost increases proportionally with the model capacity. Dynamic convolution can alleviate this problem. For pixel-level labeling, however, existing pixel-level dynamic convolution methods have a smaller scanning area than ordinary convolution or image-level dynamic convolution and are thus unable to exploit fine contextual information. As a consequence, pixel-level dynamic convolution is more sensitive to objects with large scale variation and to easily confused categories. In this paper, we propose scale-pyramid dynamic atrous convolution (SDAConv), which exploits multi-scale pixel-level features at a finer granularity in order to efficiently increase model capacity, explore contextual information, capture detail information, and alleviate the large-scale variation problem at the same time. Through kernel engineering (instead of network engineering), SDAConv dynamically arranges atrous filters in the individual convolutional kernels over different semantic areas at dense scales in the spatial dimension. Extensive experiments on three public benchmarks (Cityscapes, PASCAL VOC 2012, and ADE20K), in which the regular convolution in SOTA architectures is simply replaced with SDAConv, demonstrate the superior performance of SDAConv on pixel-level labeling tasks.
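As a rough illustration of the two ingredients the abstract names, the sketch below applies one kernel at several atrous (dilation) rates and then blends the per-rate responses with per-pixel weights. It is not the SDAConv implementation: the function names `atrous_conv2d` and `pixelwise_mix`, the box kernel, the two rates, and the hand-set weights are all assumptions made for this example, and in a real network the weights would be predicted from the features rather than fixed.

```python
# Illustrative sketch only; NOT the SDAConv method from the paper.
# All names, rates, and weights here are hypothetical.

def atrous_conv2d(image, kernel, rate):
    """Atrous (dilated) convolution with zero padding; output has the input size.

    `kernel` is assumed square with odd side length. Taps are spaced `rate`
    pixels apart, so a larger rate scans a wider area with the same kernel.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + (i - half) * rate
                    xx = x + (j - half) * rate
                    if 0 <= yy < h and 0 <= xx < w:  # zero padding outside
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def pixelwise_mix(responses, weights):
    """Blend per-rate responses using per-pixel weights, weights[r][y][x]."""
    h, w = len(responses[0]), len(responses[0][0])
    n = len(responses)
    return [
        [sum(weights[r][y][x] * responses[r][y][x] for r in range(n))
         for x in range(w)]
        for y in range(h)
    ]


# Toy 5x5 input and a 3x3 box kernel.
image = [[float(5 * y + x) for x in range(5)] for y in range(5)]
box = [[1.0] * 3 for _ in range(3)]
rates = [1, 2]
responses = [atrous_conv2d(image, box, r) for r in rates]

# Hand-set weights: use rate 1 in the two left columns, rate 2 elsewhere.
weights = [
    [[1.0 if (r == 0) == (x < 2) else 0.0 for x in range(5)] for _ in range(5)]
    for r in range(len(rates))
]
mixed = pixelwise_mix(responses, weights)
```

With these weights each output pixel takes its value from exactly one dilation rate, which is the simplest case of choosing a scale per location; soft weights that sum to one per pixel would blend the scales instead.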


Pages: 13

Related records: 50 in total

[1] Scale-pyramid dynamic atrous convolution for pixel-level labeling
Li, Zhiqiang
Jiang, Jie
Chen, Xi
Zhang, Min
Wang, Yong
Li, Qingli
Qi, Honggang
Liu, Min
Laganière, Robert
Expert Systems with Applications, 2024, 241
[2] An Effective Hybrid Atrous Convolutional Network for Pixel-Level Crack Detection
Chen, Hanshen
Lin, Huiping
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2021, 70
[3] Pixel-Level Hardware Strategy for Large-Scale Convolution Calculation in Neuromorphic Devices
Zhang, Xianghong
Liu, Di
Wu, Jianxin
Cheng, Enping
Qin, Congyao
Gao, Changsong
Shan, Liuting
Zou, Yi
Hu, Yuanyuan
Guo, Tailiang
Chen, Huipeng
ADVANCED FUNCTIONAL MATERIALS, 2025, 35 (17)
[4] PIXEL-LEVEL GUIDED FACE EDITING WITH FULLY CONVOLUTION NETWORKS
Li, Zhenxi
Zhang, Juyong
2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017, : 307 - 312
[5] From Image-level to Pixel-level Labeling with Convolutional Networks
Pinheiro, Pedro O.
Collobert, Ronan
2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015, : 1713 - 1721
[6] Automated grasp labeling and detection framework with pixel-level precision
Yang, Rui
Li, Xixing
Zhang, Yang
Chen, Jiahao
KNOWLEDGE-BASED SYSTEMS, 2024, 304
[7] Fast and Accurate Visual Tracking with Group Convolution and Pixel-Level Correlation
Liu, Liduo
Long, Yongji
Li, Guoning
Nie, Ting
Zhang, Chengcheng
He, Bin
APPLIED SCIENCES-BASEL, 2023, 13 (17)
[8] Pixel-Level Encoding and Depth Layering for Instance-Level Semantic Labeling
Uhrig, Jonas
Cordts, Marius
Franke, Uwe
Brox, Thomas
PATTERN RECOGNITION, GCPR 2016, 2016, 9796 : 14 - 25
[9] Pixel-Level Concrete Crack Segmentation Using Pyramidal Residual Network with Omni-Dimensional Dynamic Convolution
Tan, Hao
Dong, Shaojiang
PROCESSES, 2023, 11 (02)
[10] Enhancing transferability of adversarial examples with pixel-level scale variation
Mao, Zhongshu
Lu, Yiqin
Cheng, Zhe
Shen, Xiong
SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 118
