CARD: Semantic Segmentation With Efficient Class-Aware Regularized Decoder

Citations: 0
Authors
Huang, Ye [1]
Kang, Di [2]
Chen, Liang [3]
Jia, Wenjing [4]
He, Xiangjian [5]
Duan, Lixin [1]
Zhe, Xuefei [2]
Bao, Linchao [6]
Affiliations
[1] Univ Elect Sci & Technol China, Shenzhen Inst Adv Study, Shenzhen 518000, Peoples R China
[2] Tencent AI Lab, Shenzhen 518000, Peoples R China
[3] Fujian Normal Univ, Coll Photon & Elect Engn, Fuzhou 350007, Peoples R China
[4] Univ Technol Sydney, Global Big Data & Technol Ctr, Sydney, NSW 2007, Australia
[5] Univ Nottingham Ningbo China, Sch Comp Sci, Ningbo 315000, Peoples R China
[6] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, England
Keywords
Automobiles; Feature extraction; Semantic segmentation; Task analysis; Decoding; Training; Cows; Representation learning; Cityscapes; Pascal Context; COCOStuff
DOI
10.1109/TCSVT.2024.3395132
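Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract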
Semantic segmentation has recently achieved notable advances by exploiting "class-level" contextual information during learning, e.g., the Object Contextual Representation (OCR) and Context Prior (CPNet) approaches. However, these approaches simply concatenate class-level information to pixel features to boost pixel representation learning, which cannot fully utilize intra-class and inter-class contextual information. Moreover, these approaches learn soft class centers based on coarse mask prediction, which is prone to error accumulation. To better exploit class-level information, we propose a universal Class-Aware Regularization (CAR) approach to optimize the intra-class variance and inter-class distance during feature learning, motivated by the fact that humans can recognize an object by itself no matter which other objects it appears with. Moreover, we design a dedicated decoder for CAR (named CARD), which consists of a novel spatial token mixer and an upsampling module, to maximize its gain for existing baselines while being highly efficient in terms of computational cost. Specifically, CAR consists of three novel loss functions. The first loss function encourages more compact class representations within each class, the second directly maximizes the distance between different class centers, and the third further pushes the distance between inter-class centers and pixels. Furthermore, the class center in our approach is directly generated from ground truth instead of from the error-prone coarse prediction. CAR can be directly applied to most existing segmentation models during training, including OCR and CPNet, and can largely improve their accuracy at no additional inference overhead. Extensive experiments and ablation studies conducted on multiple benchmark datasets demonstrate that the proposed CAR can boost the accuracy of all baseline models by up to 2.23% mIoU with superior generalization ability. CARD outperforms state-of-the-art approaches on multiple benchmarks with a highly efficient architecture. The code will be available at https://github.com/edwardyehuang/CAR.
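Illustrative sketch (not part of the indexed record): the abstract names three regularization terms computed against class centers taken from the ground truth. The pure-Python toy below shows one way such terms could be written. The function names, the Euclidean distance, and the hinge `margin` are assumptions made for illustration only; the abstract does not specify the exact loss formulations, and this is not the authors' implementation.

```python
from collections import defaultdict
from math import sqrt


def class_centers(features, labels):
    """Average the features of all pixels that share a ground-truth label."""
    sums, counts = {}, defaultdict(int)
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
        sums[label] = [s + v for s, v in zip(sums[label], feat)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def car_style_losses(features, labels, margin=1.0):
    """Return (intra, center_to_center, center_to_pixel) toy penalties.

    ASSUMPTION: the two inter-class terms use a hinge with `margin`, so the
    penalty is zero once the distance exceeds the margin.
    """
    centers = class_centers(features, labels)
    classes = sorted(centers)

    # (1) Intra-class: pull every pixel toward its own class center.
    intra = sum(dist(f, centers[y]) for f, y in zip(features, labels)) / len(features)

    # (2) Inter-class, center to center: penalize centers closer than margin.
    pairs = [(a, b) for i, a in enumerate(classes) for b in classes[i + 1:]]
    c2c = (sum(max(0.0, margin - dist(centers[a], centers[b])) for a, b in pairs)
           / len(pairs)) if pairs else 0.0

    # (3) Inter-class, center to pixel: penalize pixels that lie closer than
    # margin to the center of a class they do not belong to.
    terms = [max(0.0, margin - dist(f, centers[c]))
             for f, y in zip(features, labels) for c in classes if c != y]
    c2p = sum(terms) / len(terms) if terms else 0.0

    return intra, c2c, c2p


if __name__ == "__main__":
    feats = [[0, 0], [0, 2], [3, 0], [3, 2]]  # four "pixels", 2-D features
    labels = [0, 0, 1, 1]                     # ground-truth class per pixel
    print(car_style_losses(feats, labels, margin=4.0))
```

Because the centers are computed from the labels rather than from a predicted mask, a sketch like this only runs at training time, which is consistent with the abstract's statement that CAR adds no inference overhead.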
Pages: 9024-9038
Page count: 15
Related papers
50 in total
  • [21] DEEP CLASS-AWARE IMAGE DENOISING
    Remez, Tal
    Litany, Or
    Giryes, Raja
    Bronstein, Alex M.
    2017 INTERNATIONAL CONFERENCE ON SAMPLING THEORY AND APPLICATIONS (SAMPTA), 2017, : 138 - 142
  • [22] Efficient cross-information fusion decoder for semantic segmentation
    Zhang, Songyang
    Ren, Ge
    Zeng, Xiaoxi
    Zhang, Liang
    Du, Kailun
    Liu, Gege
    Lin, Hong
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 240
  • [23] Efficient Decoder and Intermediate Domain for Semantic Segmentation in Adverse Conditions
    Chen, Xiaodong
    Jiang, Nan
    Li, Yifeng
    Cheng, Guangliang
    Liang, Zheng
    Ying, Zuobin
    Zhang, Qi
    Zhao, Runsheng
    SMART CITIES, 2024, 7 (01): : 254 - 276
  • [24] DFNet: efficient decoder-free semantic segmentation networks
    Liu, Lamei
    Du, Baochang
    Huang, Huiling
    Zhang, Yongjian
    Han, Jun
    CHINESE JOURNAL OF LIQUID CRYSTALS AND DISPLAYS, 2024, 39 (02) : 121 - 130
  • [25] Efficient Semantic Segmentation Using Multi-Path Decoder
    Bai, Xing
    Zhou, Jun
    APPLIED SCIENCES-BASEL, 2020, 10 (18):
  • [26] DEEP CLASS-AWARE IMAGE DENOISING
    Remez, Tal
    Litany, Or
    Giryes, Raja
    Bronstein, Alex M.
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 1895 - 1899
  • [27] Using Channel-Wise Attention for Deep CNN Based Real-Time Semantic Segmentation With Class-Aware Edge Information
    Han, Hsiang-Yu
    Chen, Yu-Chi
    Hsiao, Pei-Yung
    Fu, Li-Chen
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2021, 22 (02) : 1041 - 1051
  • [28] A Joint Framework Towards Class-aware and Class-agnostic Alignment for Few-shot Segmentation
    Huang, Kai
    Cheng, Mingfei
    Wang, Yang
    Wang, Bochen
    Xi, Ye
    Wang, Feigege
    Chen, Peng
    COMPUTER VISION - ACCV 2022, PT VII, 2023, 13847 : 431 - 447
  • [29] Unsupervised domain adaptation for the semantic segmentation of remote sensing images via a class-aware Fourier transform and a fine-grained discriminator
    Ismael, Sarmad F.
    Kayabol, Koray
    Aptoula, Erchan
    DIGITAL SIGNAL PROCESSING, 2024, 151
  • [30] CaCL: Class-Aware Codebook Learning for Weakly Supervised Segmentation on Diffuse Image Patterns
    Deng, Ruining
    Liu, Quan
    Bao, Shunxing
    Jha, Aadarsh
    Chang, Catie
    Millis, Bryan A.
    Tyska, Matthew J.
    Huo, Yuankai
    DEEP GENERATIVE MODELS, AND DATA AUGMENTATION, LABELLING, AND IMPERFECTIONS, 2021, 13003 : 93 - 102