Scale-pyramid dynamic atrous convolution for pixel-level labeling

Cited by: 1
Authors
Li, Zhiqiang [1, 2, 3]
Jiang, Jie [1, 3]
Chen, Xi [1, 2, 3]
Zhang, Min [7]
Wang, Yong [4]
Li, Qingli [2]
Qi, Honggang [5]
Liu, Min [1, 3]
Laganiere, Robert [6]
Affiliations
[1] East China Normal Univ, Key Lab Geog Informat Sci, Minist Educ, Shanghai 200241, Peoples R China
[2] East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, Shanghai 200241, Peoples R China
[3] East China Normal Univ, Sch Geog Sci, Shanghai 200241, Peoples R China
[4] Sun Yat Sen Univ, Sch Aeronaut & Astronaut, Shenzhen 518107, Peoples R China
[5] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China
[6] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada
[7] Engn Univ PAP, Xian 710086, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Pixel-level labeling; Deep learning; DCNN; Dynamic convolution; Kernel engineering; NETWORK
DOI
10.1016/j.eswa.2023.122695
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To achieve better performance, most deep convolutional neural networks increase model capacity by adding more convolutional layers or enlarging the filters. Consequently, the computational cost grows proportionally with model capacity. Dynamic convolution can alleviate this problem. For pixel-level labeling, however, existing pixel-level dynamic convolution methods have a smaller scanning area than ordinary convolution or image-level dynamic convolution and are thus unable to exploit fine contextual information. As a consequence, pixel-level dynamic convolution is more sensitive to objects with large scale variation and to easily confused categories. In this paper, we propose scale-pyramid dynamic atrous convolution (SDAConv), which exploits multi-scale pixel-level features at a finer granularity in order to efficiently increase model capacity, explore contextual information, capture detail information, and alleviate the large-scale variation problem at the same time. Through kernel engineering (instead of network engineering), SDAConv dynamically arranges atrous filters in the individual convolutional kernels over different semantic areas at dense scales in the spatial dimension. Extensive experiments on three public benchmarks (Cityscapes, PASCAL VOC 2012, and ADE20K), in which the regular convolution in state-of-the-art (SOTA) architectures is simply replaced with SDAConv, demonstrate the superior performance of SDAConv on pixel-level labeling tasks.
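To make the general idea concrete, the following is a minimal sketch of pixel-level mixing of atrous (dilated) filters. It is not the SDAConv formulation, which the abstract does not specify; it only assumes a single-channel image, 3x3 kernels, zero padding, and a per-pixel softmax gate over a small set of dilation rates. The names `atrous_conv3x3`, `pixel_dynamic_atrous_conv`, and `gate_logits` are hypothetical and chosen for illustration.

```python
import math


def atrous_conv3x3(image, kernel, rate):
    """3x3 atrous (dilated) convolution with zero padding; output keeps the input size."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    # Taps are spaced `rate` pixels apart around the center.
                    sy = y + (ky - 1) * rate
                    sx = x + (kx - 1) * rate
                    if 0 <= sy < h and 0 <= sx < w:
                        acc += kernel[ky][kx] * image[sy][sx]
            out[y][x] = acc
    return out


def softmax(values):
    """Numerically stable softmax over a short list of logits."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_dynamic_atrous_conv(image, kernels, rates, gate_logits):
    """Mix atrous responses per pixel (illustrative assumption, not the paper's method).

    gate_logits[y][x] holds one logit per dilation rate; in a real network these
    would be predicted from the features rather than supplied by hand.
    """
    responses = [atrous_conv3x3(image, k, r) for k, r in zip(kernels, rates)]
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            weights = softmax(gate_logits[y][x])
            out[y][x] = sum(wt * resp[y][x] for wt, resp in zip(weights, responses))
    return out


if __name__ == "__main__":
    img = [[1.0] * 5 for _ in range(5)]
    box = [[1.0] * 3 for _ in range(3)]
    # Equal logits give every dilation rate the same weight at every pixel.
    gates = [[[0.0, 0.0] for _ in range(5)] for _ in range(5)]
    result = pixel_dynamic_atrous_conv(img, [box, box], [1, 2], gates)
    print(result[1][1], result[2][2])
```

Because the gate is evaluated per pixel, each location can favor a different dilation rate, which is one simple way to let the effective receptive field vary across semantic areas.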
Pages: 13