Scale-pyramid dynamic atrous convolution for pixel-level labeling

Cited by: 1

Authors:

Li, Zhiqiang [1,2,3]

Jiang, Jie [1,3]

Chen, Xi [1,2,3]

Zhang, Min [7]

Wang, Yong [4]

Li, Qingli [2]

Qi, Honggang [5]

Liu, Min [1,3]

Laganiere, Robert [6]

Affiliations:

[1] East China Normal Univ, Key Lab Geog Informat Sci, Minist Educ, Shanghai 200241, Peoples R China

[2] East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, Shanghai 200241, Peoples R China

[3] East China Normal Univ, Sch Geog Sci, Shanghai 200241, Peoples R China

[4] Sun Yat Sen Univ, Sch Aeronaut & Astronaut, Shenzhen 518107, Peoples R China

[5] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China

[6] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada

[7] Engn Univ PAP, Xian 710086, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2024 / Vol. 241

Funding:

National Natural Science Foundation of China;

Keywords:

Pixel-level labeling; Deep learning; DCNN; Dynamic convolution; Kernel engineering; NETWORK;

DOI:

10.1016/j.eswa.2023.122695

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

To achieve better performance, most deep convolutional neural networks increase model capacity by adding more convolutional layers or enlarging the filters. Consequently, the computational cost grows in proportion to the model capacity. Dynamic convolution can alleviate this problem. In pixel-level labeling, however, existing pixel-level dynamic convolution methods have a smaller scanning area than ordinary convolution or image-level dynamic convolution and are thus unable to exploit fine contextual information. As a consequence, pixel-level dynamic convolution is more sensitive to objects with large scale variation and to easily confused categories. In this paper, we propose a scale-pyramid dynamic atrous convolution (SDAConv) that exploits multi-scale pixel-level features at a finer granularity, in order to efficiently increase model capacity, explore contextual information, capture detail information, and alleviate the large-scale variation problem at the same time. Through kernel engineering (instead of network engineering), SDAConv dynamically arranges atrous filters in the individual convolutional kernels over different semantic areas at dense scales in the spatial dimension. With the regular convolution in SOTA architectures simply replaced by SDAConv, extensive experiments on three public benchmarks (Cityscapes, PASCAL VOC 2012, and ADE20K) demonstrate the superior performance of SDAConv on pixel-level labeling tasks.
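The two ingredients named in the abstract, atrous (dilated) filters and per-pixel dynamic weighting, can be illustrated with a toy sketch. The pure-Python code below is not the authors' SDAConv. It assumes a single-channel image, one shared square kernel, zero padding, and a per-pixel softmax over a hypothetical `logits` array that mixes the responses of several dilation rates. All function and parameter names are illustrative.

```python
import math


def effective_kernel_size(k, rate):
    """Span (in pixels) covered by a k-tap atrous kernel at a dilation rate."""
    return k + (k - 1) * (rate - 1)


def atrous_conv2d(image, kernel, rate):
    """Same-size 2D atrous cross-correlation with zero padding."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    c = k // 2  # index of the kernel center
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Taps are spaced `rate` pixels apart.
                    yy = y + (i - c) * rate
                    xx = x + (j - c) * rate
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def softmax(values):
    """Numerically stable softmax over a list of scores."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pixelwise_dynamic_atrous(image, kernel, rates, logits):
    """Mix atrous responses per pixel.

    logits[y][x] holds one score per dilation rate (an assumption of this
    sketch; in a real network a small branch would predict these scores).
    """
    responses = [atrous_conv2d(image, kernel, r) for r in rates]
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            weights = softmax(logits[y][x])
            out[y][x] = sum(
                wt * resp[y][x] for wt, resp in zip(weights, responses)
            )
    return out
```

A 3-tap kernel at dilation rates 1, 2, and 4 spans 3, 5, and 9 pixels respectively with the same number of weights, which is why varying the rate per pixel changes the scanning area without adding parameters to the kernel itself.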


Pages: 13
