Scale-pyramid dynamic atrous convolution for pixel-level labeling

Cited by: 1
Authors
Li, Zhiqiang [1, 2, 3]
Jiang, Jie [1, 3]
Chen, Xi [1, 2, 3]
Zhang, Min [7]
Wang, Yong [4]
Li, Qingli [2]
Qi, Honggang [5]
Liu, Min [1, 3]
Laganière, Robert [6]
Affiliations
[1] East China Normal Univ, Key Lab Geog Informat Sci, Minist Educ, Shanghai 200241, Peoples R China
[2] East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, Shanghai 200241, Peoples R China
[3] East China Normal Univ, Sch Geog Sci, Shanghai 200241, Peoples R China
[4] Sun Yat Sen Univ, Sch Aeronaut & Astronaut, Shenzhen 518107, Peoples R China
[5] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China
[6] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada
[7] Engn Univ PAP, Xian 710086, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Pixel-level labeling; Deep learning; DCNN; Dynamic convolution; Kernel engineering; NETWORK;
DOI
10.1016/j.eswa.2023.122695
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To achieve better performance, most deep convolutional neural networks increase model capacity by adding more convolutional layers or enlarging the filters. Consequently, the computational cost grows in proportion to the model capacity. Dynamic convolution can alleviate this problem. For pixel-level labeling, however, existing pixel-level dynamic convolution methods have a smaller scanning area than ordinary convolution or image-level dynamic convolution and are thus unable to exploit fine contextual information. As a consequence, pixel-level dynamic convolution is more sensitive to objects with large scale variation and to easily confused categories. In this paper, we propose scale-pyramid dynamic atrous convolution (SDAConv), which exploits multi-scale pixel-level features at a finer granularity in order to efficiently increase model capacity, explore contextual information, capture detail information, and alleviate the large-scale variation problem at the same time. Through kernel engineering (instead of network engineering), SDAConv dynamically arranges atrous filters in the individual convolutional kernels over different semantic areas at dense scales in the spatial dimension. With the regular convolution in SOTA architectures simply replaced by SDAConv, extensive experiments on three public benchmarks (Cityscapes, PASCAL VOC 2012, and ADE20K) demonstrate the superior performance of SDAConv on pixel-level labeling tasks.
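The abstract does not give the SDAConv formulation, so the following is only an illustrative sketch of the two ingredients it names: atrous (dilated) filtering at several rates, and a per-pixel dynamic mixing of those responses. All function names, the softmax gating, and the `gate_logits` input are assumptions made for illustration; they are not taken from the paper, and a real network would predict the gates from features rather than receive them as an argument.

```python
import math


def effective_kernel_size(k, rate):
    """Spatial extent covered by a k x k filter with the given atrous rate."""
    return k + (k - 1) * (rate - 1)


def atrous_conv2d(image, kernel, rate):
    """Zero-padded, same-size 2D atrous cross-correlation on nested lists.

    Assumes a square kernel with odd side length.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    c = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Taps are spaced `rate` pixels apart around (y, x).
                    yy = y + (i - c) * rate
                    xx = x + (j - c) * rate
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pixelwise_dynamic_atrous(image, kernel, rates, gate_logits):
    """Blend atrous responses at several rates with per-pixel weights.

    gate_logits[y][x] holds one logit per rate (hypothetical input).
    """
    responses = [atrous_conv2d(image, kernel, r) for r in rates]
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            weights = softmax(gate_logits[y][x])
            out[y][x] = sum(
                wt * resp[y][x] for wt, resp in zip(weights, responses)
            )
    return out
```

Under these assumptions, a 3 x 3 filter with rate 2 spans a 5 x 5 window while still using nine weights, which is how atrous filtering enlarges the scanning area without adding parameters. The per-pixel gate then lets each location favor the scale that suits its local content.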
Pages: 13
Related papers
50 in total
  • [1] Scale-pyramid dynamic atrous convolution for pixel-level labeling
    Li, Zhiqiang
    Jiang, Jie
    Chen, Xi
    Zhang, Min
    Wang, Yong
    Li, Qingli
    Qi, Honggang
    Liu, Min
    Laganière, Robert
    Expert Systems with Applications, 2024, 241
  • [2] An Effective Hybrid Atrous Convolutional Network for Pixel-Level Crack Detection
    Chen, Hanshen
    Lin, Huiping
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2021, 70
  • [3] Pixel-Level Hardware Strategy for Large-Scale Convolution Calculation in Neuromorphic Devices
    Zhang, Xianghong
    Liu, Di
    Wu, Jianxin
    Cheng, Enping
    Qin, Congyao
    Gao, Changsong
    Shan, Liuting
    Zou, Yi
    Hu, Yuanyuan
    Guo, Tailiang
    Chen, Huipeng
    ADVANCED FUNCTIONAL MATERIALS, 2025, 35 (17)
  • [4] PIXEL-LEVEL GUIDED FACE EDITING WITH FULLY CONVOLUTION NETWORKS
    Li, Zhenxi
    Zhang, Juyong
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017: 307-312
  • [5] From Image-level to Pixel-level Labeling with Convolutional Networks
    Pinheiro, Pedro O.
    Collobert, Ronan
    2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015: 1713-1721
  • [6] Automated grasp labeling and detection framework with pixel-level precision
    Yang, Rui
    Li, Xixing
    Zhang, Yang
    Chen, Jiahao
    KNOWLEDGE-BASED SYSTEMS, 2024, 304
  • [7] Fast and Accurate Visual Tracking with Group Convolution and Pixel-Level Correlation
    Liu, Liduo
    Long, Yongji
    Li, Guoning
    Nie, Ting
    Zhang, Chengcheng
    He, Bin
    APPLIED SCIENCES-BASEL, 2023, 13 (17)
  • [8] Pixel-Level Encoding and Depth Layering for Instance-Level Semantic Labeling
    Uhrig, Jonas
    Cordts, Marius
    Franke, Uwe
    Brox, Thomas
    PATTERN RECOGNITION, GCPR 2016, 2016, 9796: 14-25
  • [9] Pixel-Level Concrete Crack Segmentation Using Pyramidal Residual Network with Omni-Dimensional Dynamic Convolution
    Tan, Hao
    Dong, Shaojiang
    PROCESSES, 2023, 11 (02)
  • [10] Enhancing transferability of adversarial examples with pixel-level scale variation
    Mao, Zhongshu
    Lu, Yiqin
    Cheng, Zhe
    Shen, Xiong
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 118