TANet: Transformer-based asymmetric network for RGB-D salient object detection

Cited by: 6
Authors
Liu, Chang [1 ]
Yang, Gang [1 ,3 ]
Wang, Shuo [1 ]
Wang, Hangxu [1 ,2 ]
Zhang, Yunhua [1 ]
Wang, Yutao [1 ]
Affiliations
[1] Northeastern Univ, Shenyang, Liaoning, Peoples R China
[2] DUT Artificial Intelligence Inst, Dalian, Peoples R China
[3] Northeastern Univ, Wenhua Rd, Shenyang 110000, Liaoning, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
computer vision; image segmentation; object detection; REGION;
DOI
10.1049/cvi2.12177
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing RGB-D salient object detection methods mainly rely on a symmetric two-stream Convolutional Neural Network (CNN)-based network to extract RGB and depth channel features separately. However, there are two problems with the symmetric conventional network structure: first, the ability of CNN in learning global contexts is limited; second, the symmetric two-stream structure ignores the inherent differences between modalities. In this study, a Transformer-based asymmetric network is proposed to tackle the issues mentioned above. The authors employ the powerful feature extraction capability of Transformer to extract global semantic information from RGB data and design a lightweight CNN backbone to extract spatial structure information from depth data without pre-training. The asymmetric hybrid encoder effectively reduces the number of parameters in the model while increasing speed without sacrificing performance. Then, a cross-modal feature fusion module which enhances and fuses RGB and depth features with each other is designed. Finally, the authors add edge prediction as an auxiliary task and propose an edge enhancement module to generate sharper contours. Extensive experiments demonstrate that our method achieves superior performance over 14 state-of-the-art RGB-D methods on six public datasets. The code of the authors will be released at .
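The abstract's central idea is an asymmetric two-stream encoder: a heavy Transformer branch extracts global context from RGB, a lightweight CNN branch extracts local spatial structure from depth, and a cross-modal fusion module lets each modality enhance the other. The following is a minimal illustrative sketch of that data flow, not the authors' released code; all function names are hypothetical, and the "branches" are deliberately simplified stand-ins (global pooling for the Transformer, a 3-tap filter for the lightweight CNN).

```python
# Hypothetical sketch of the asymmetric two-stream idea from the abstract.
# None of these names come from the TANet codebase; they only illustrate
# the architecture pattern on 1-D toy "feature maps".

def rgb_branch(rgb):
    # Stand-in for the Transformer encoder: inject global context by
    # adding the mean of all positions to every position.
    g = sum(rgb) / len(rgb)
    return [x + g for x in rgb]

def depth_branch(depth):
    # Stand-in for the lightweight CNN: purely local 3-tap smoothing,
    # mirroring the abstract's "spatial structure without pre-training".
    n = len(depth)
    return [(depth[max(i - 1, 0)] + depth[i] + depth[min(i + 1, n - 1)]) / 3
            for i in range(n)]

def cross_modal_fusion(f_rgb, f_depth):
    # Mutual enhancement: each stream is gated by the other, then the
    # two enhanced streams are summed into a single fused feature.
    enhanced_rgb = [r * (1 + d) for r, d in zip(f_rgb, f_depth)]
    enhanced_depth = [d * (1 + r) for r, d in zip(f_rgb, f_depth)]
    return [r + d for r, d in zip(enhanced_rgb, enhanced_depth)]

rgb = [0.1, 0.5, 0.9]
depth = [0.2, 0.4, 0.6]
fused = cross_modal_fusion(rgb_branch(rgb), depth_branch(depth))
print(len(fused) == len(rgb))  # fusion preserves spatial resolution
```

The asymmetry is the point: only the RGB branch mixes information globally, so the depth branch can stay small, which is how the paper claims to cut parameters without losing performance.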
Pages: 415 / 430
Page count: 16
Related Papers
50 records in total
  • [31] Hierarchical Alternate Interaction Network for RGB-D Salient Object Detection
    Li, Gongyang
    Liu, Zhi
    Chen, Minyu
    Bai, Zhen
    Lin, Weisi
    Ling, Haibin
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 3528 - 3542
  • [32] Hybrid-Attention Network for RGB-D Salient Object Detection
    Chen, Yuzhen
    Zhou, Wujie
    APPLIED SCIENCES-BASEL, 2020, 10 (17):
  • [33] Feature Calibrating and Fusing Network for RGB-D Salient Object Detection
    Zhang, Qiang
    Qin, Qi
    Yang, Yang
    Jiao, Qiang
    Han, Jungong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (03) : 1493 - 1507
  • [34] Triple-Complementary Network for RGB-D Salient Object Detection
    Huang, Rui
    Xing, Yan
    Zou, Yaobin
    IEEE SIGNAL PROCESSING LETTERS, 2020, 27 (27) : 775 - 779
  • [35] DMNet: Dynamic Memory Network for RGB-D Salient Object Detection
    Du, Haishun
    Zhang, Zhen
    Zhang, Minghao
    Qiao, Kangyi
    DIGITAL SIGNAL PROCESSING, 2023, 142
  • [37] An adaptive guidance fusion network for RGB-D salient object detection
    Sun, Haodong
    Wang, Yu
    Ma, Xinpeng
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (02) : 1683 - 1693
  • [38] Context-aware network for RGB-D salient object detection
    Liang, Fangfang
    Duan, Lijuan
    Ma, Wei
    Qiao, Yuanhua
    Miao, Jun
    Ye, Qixiang
    PATTERN RECOGNITION, 2021, 111
  • [39] CDNet: Complementary Depth Network for RGB-D Salient Object Detection
    Jin, Wen-Da
    Xu, Jun
    Han, Qi
    Zhang, Yi
    Cheng, Ming-Ming
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 3376 - 3390
  • [40] Scale Adaptive Fusion Network for RGB-D Salient Object Detection
    Kong, Yuqiu
    Zheng, Yushuo
    Yao, Cuili
    Liu, Yang
    Wang, He
    COMPUTER VISION - ACCV 2022, PT III, 2023, 13843 : 608 - 625