CoT: Contourlet Transformer for Hierarchical Semantic Segmentation

Cited by: 0
Authors
Shao, Yilin [1]
Sun, Long [1]
Jiao, Licheng [1]
Liu, Xu [1]
Liu, Fang [1]
Li, Lingling [1]
Yang, Shuyuan [1]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Int Res Ctr Intelligent Percept & Computat, Minist Educ China, Key Lab Intelligent Percept & I, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Transformers; Semantics; Semantic segmentation; Task analysis; Computed tomography; Convolutional neural networks; Contourlet transform (CT); semantic segmentation; sparse convolution; Transformer-convolutional neural network (CNN) hybrid model;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Transformer-convolutional neural network (CNN) hybrid learning approach is gaining traction for balancing deep and shallow image features in hierarchical semantic segmentation. However, such hybrid models still face a tension between comprehensive semantic understanding and meticulous detail extraction. To address this problem, this article proposes a novel Transformer-CNN hybrid hierarchical network, dubbed contourlet transformer (CoT). In the CoT framework, the semantic representation produced by the Transformer unavoidably contains sparsely distributed points that, although undesired, demand finer detail. Therefore, we design a deep detail representation (DDR) structure to investigate their fine-grained features. First, through the contourlet transform (CT), we distill the high-frequency directional components from the raw image, yielding localized features that suit the inductive bias of CNNs. Second, a CNN deep sparse learning (DSL) module takes these components as input to represent the underlying detailed features. This memory- and energy-efficient learning method keeps the same sparse pattern between input and output. Finally, the decoder hierarchically fuses the detailed features with the semantic features in an image reconstruction-like fashion. Experiments demonstrate that CoT achieves competitive performance on three benchmark datasets: PASCAL Context [57.21% mean intersection over union (mIoU)], ADE20K (54.16% mIoU), and Cityscapes (84.23% mIoU). Furthermore, we conducted robustness studies to validate its resistance to various kinds of corruption. Our code is available at: https://github.com/yilinshao/CoT-Contourlet-Transformer.
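The two detail-branch ideas in the abstract (selecting sparse high-frequency sites, then convolving while keeping the same sparse pattern) can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the 3x3 box blur is only a crude stand-in for the low-pass stage of a contourlet transform (no directional filter bank is modeled), and the function names `box_blur`, `high_freq_sites`, and `submanifold_conv`, the threshold, and the kernel are all hypothetical choices made for illustration.

```python
def box_blur(img):
    """3x3 mean filter with edge replication (assumed crude low-pass)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    total += img[rr][cc]
            out[r][c] = total / 9.0
    return out


def high_freq_sites(img, threshold):
    """Return {(row, col): residual} for pixels with a large high-pass residual."""
    low = box_blur(img)
    sites = {}
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            residual = value - low[r][c]
            if abs(residual) > threshold:
                sites[(r, c)] = residual
    return sites


def submanifold_conv(sites, kernel):
    """Convolve only at active sites, so the output has the input's sparse pattern.

    Inactive neighbours contribute zero and never become active themselves.
    """
    out = {}
    for (r, c) in sites:
        acc = 0.0
        for (dr, dc), weight in kernel.items():
            acc += weight * sites.get((r + dr, c + dc), 0.0)
        out[(r, c)] = acc
    return out


# Toy 6x6 image with a vertical step edge between columns 2 and 3.
image = [[1.0 if c >= 3 else 0.0 for c in range(6)] for r in range(6)]
sites = high_freq_sites(image, threshold=0.1)

# Hypothetical 3x3 averaging kernel, indexed by (row offset, col offset).
kernel = {(dr, dc): 1.0 / 9.0 for dr in (-1, 0, 1) for dc in (-1, 0, 1)}
features = submanifold_conv(sites, kernel)
```

In this toy case only the pixels adjacent to the edge become active sites, and `features` is defined on exactly those sites, which is the "same sparse pattern between input and output" property the abstract attributes to the DSL module.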
Pages: 132-146
Page count: 15
Related papers
50 in total
  • [21] MMSFormer: Multimodal Transformer for Material and Semantic Segmentation
    Reza, Md Kaykobad
    Prater-Bennette, Ashley
    Asif, M. Salman
    IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2024, 5 : 599 - 610
  • [22] Multiview Fusion Driven 3-D Point Cloud Semantic Segmentation Based on Hierarchical Transformer
    Xu, Wang
    Li, Xu
    Ni, Peizhou
    Guang, Xingxing
    Luo, Hang
    Zhao, Xijun
    IEEE SENSORS JOURNAL, 2023, 23 (24) : 31461 - 31470
  • [23] Semantic segmentation using tag label and transformer
    Jeong S.-W.
    Kim E.-C.
    Yoo J.
    Journal of Institute of Control, Robotics and Systems, 2021, 27 (12) : 1029 - 1037
  • [24] FusionFormer: An Off-Road Sence Semantic Segmentation Network Based on Data Fusion and Hierarchical Transformer
    Duan, AnZhi
    Ma, Yue
    Wang, YunFeng
    PROCEEDINGS OF 2024 CHINESE INTELLIGENT SYSTEMS CONFERENCE, VOL 3, CISC 2024, 2024, 1285 : 75 - 83
  • [25] MarsFormer: Martian Rock Semantic Segmentation With Transformer
    Xiong, Yonggang
    Xiao, Xueming
    Yao, Meibao
    Liu, Haiqiang
    Yang, Hong
    Fu, Yuegang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [26] CoT-TransUNet: Lightweight Context Transformer Medical Image Segmentation Network
    Yang, He
    Bai, Zhengyao
    Computer Engineering and Applications, 2024, 59 (03) : 218 - 225
  • [27] Semantic Instance Labeling Leveraging Hierarchical Segmentation
    Hickson, Steven
    Essa, Irfan
    Christensen, Henrik
    2015 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2015, : 1068 - 1075
  • [28] Semantic Image Segmentation with Contextual Hierarchical Models
    Seyedhosseini, Mojtaba
    Tasdizen, Tolga
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (05) : 951 - 964
  • [29] HCNet: Hierarchical Context Network for Semantic Segmentation
    Chong, Yanwen
    Nie, Congchong
    Tao, Yulong
    Chen, Xiaoshu
    Pan, Shaoming
    IEEE ACCESS, 2020, 8 : 179213 - 179223
  • [30] Laformer: Vision Transformer for Panoramic Image Semantic Segmentation
    Yuan, Zheng
    Wang, Junhua
    Lv, Yuxin
    Wang, Ding
    Fang, Yi
    IEEE SIGNAL PROCESSING LETTERS, 2023, 30 : 1792 - 1796