CoT: Contourlet Transformer for Hierarchical Semantic Segmentation

Cited by: 0
Authors
Shao, Yilin [1 ]
Sun, Long [1 ]
Jiao, Licheng [1 ]
Liu, Xu [1 ]
Liu, Fang [1 ]
Li, Lingling [1 ]
Yang, Shuyuan [1 ]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Int Res Ctr Intelligent Percept & Computat, Minist Educ China,Key Lab Intelligent Percept & I, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Transformers; Semantics; Semantic segmentation; Task analysis; Computed tomography; Convolutional neural networks; Contourlet transform (CT); semantic segmentation; sparse convolution; Transformer-convolutional neural network (CNN) hybrid model;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The Transformer-convolutional neural network (CNN) hybrid learning approach is gaining traction for balancing deep and shallow image features in hierarchical semantic segmentation. However, such approaches are still confronted with a contradiction between comprehensive semantic understanding and meticulous detail extraction. To solve this problem, this article proposes a novel Transformer-CNN hybrid hierarchical network, dubbed contourlet transformer (CoT). In the CoT framework, the semantic representation process of the Transformer is unavoidably peppered with sparsely distributed points that, while not desired, demand finer detail. Therefore, we design a deep detail representation (DDR) structure to investigate their fine-grained features. First, through the contourlet transform (CT), we distill the high-frequency directional components from the raw image, yielding localized features that accommodate the inductive bias of CNNs. Second, a CNN deep sparse learning (DSL) module takes them as input to represent the underlying detailed features. This memory- and energy-efficient learning method keeps the same sparse pattern between input and output. Finally, the decoder hierarchically fuses the detailed features with the semantic features in an image-reconstruction-like fashion. Experiments demonstrate that CoT achieves competitive performance on three benchmark datasets: PASCAL Context [57.21% mean intersection over union (mIoU)], ADE20K (54.16% mIoU), and Cityscapes (84.23% mIoU). Furthermore, robustness studies validate its resistance to various types of corruption. Our code is available at: https://github.com/yilinshao/CoT-Contourlet-Transformer.
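
To make the pipeline described in the abstract concrete, the sketch below is a minimal, illustrative PyTorch example, not the authors' implementation (that is in the linked repository). It assumes a Laplacian-pyramid-style high-pass decomposition as a rough stand-in for the contourlet transform's directional subbands, and a masked convolution that only mimics the sparsity-pattern-preserving behavior attributed to the DSL module; all function and variable names (highpass_components, sparse_detail_features, etc.) are hypothetical.

# Minimal sketch of the abstract's pipeline (assumed, not the authors' code).
import torch
import torch.nn.functional as F

def highpass_components(img: torch.Tensor, levels: int = 2):
    """Laplacian-pyramid-like stand-in for the contourlet high-frequency step:
    returns band-pass residuals of the raw image at successive scales."""
    bands, cur = [], img
    for _ in range(levels):
        low = F.avg_pool2d(cur, kernel_size=2)                       # coarse approximation
        up = F.interpolate(low, size=cur.shape[-2:], mode="bilinear",
                           align_corners=False)
        bands.append(cur - up)                                       # high-frequency residual
        cur = low
    return bands

def sparse_detail_features(band: torch.Tensor, weight: torch.Tensor,
                           thresh: float = 0.05):
    """Toy stand-in for the DSL module: convolve the high-frequency band but
    keep outputs only where the input is active, so the output reuses the
    input's sparsity pattern."""
    mask = (band.abs().amax(dim=1, keepdim=True) > thresh).float()   # active locations
    feat = F.conv2d(band * mask, weight, padding=weight.shape[-1] // 2)
    return feat * mask                                               # same sparse pattern

if __name__ == "__main__":
    x = torch.randn(1, 3, 64, 64)                 # dummy RGB image
    bands = highpass_components(x)
    w = torch.randn(8, 3, 3, 3) * 0.1             # toy conv weights (8 output channels)
    details = [sparse_detail_features(b, w) for b in bands]
    # Decoder-style fusion: upsample detail features and add them to a
    # (here random) coarse semantic feature map, mimicking hierarchical,
    # reconstruction-like fusion.
    semantic = torch.randn(1, 8, 16, 16)
    fused = F.interpolate(semantic, size=x.shape[-2:], mode="bilinear",
                          align_corners=False)
    for d in details:
        fused = fused + F.interpolate(d, size=x.shape[-2:], mode="bilinear",
                                      align_corners=False)
    print(fused.shape)                            # torch.Size([1, 8, 64, 64])

The fusion loop only illustrates how coarse semantic features and upsampled sparse detail features could be combined hierarchically; the paper's actual decoder and sparse convolution design should be taken from the repository above.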
Pages: 132-146 (15 pages)