STA-Net: spatial-temporal attention network for video salient object detection

Cited by: 23
Authors
Bi, Hong-Bo [1 ]
Lu, Di [1 ]
Zhu, Hui-Hui [1 ]
Yang, Li-Na [1 ]
Guan, Hua-Ping [2 ]
Affiliations
[1] NorthEast Petr Univ, Daqing, Peoples R China
[2] Fujian Normal Univ, Fuzhou, Peoples R China
Keywords
Multi-scale; Video salient object detection; Attention; Pyramid; SEGMENTATION; OPTIMIZATION;
DOI
10.1007/s10489-020-01961-4
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
This paper conducts a systematic study on the role of spatial and temporal attention mechanisms in the video salient object detection (VSOD) task. We present a two-stage spatial-temporal attention network, named STA-Net, which makes two major contributions. In the first stage, we devise a Multi-Scale-Spatial-Attention (MSSA) module that reduces the computational cost spent on non-salient regions while exploiting multi-scale saliency information. This sliced attention method offers an efficient way to exploit the high-level features of the network with an enlarged receptive field. In the second stage, we propose a Pyramid-Saliency-Shift-Aware (PSSA) module, which emphasizes dynamic object information: the shift of a salient object across frames provides a valid cue for confirming salient objects and capturing temporal information. Such a temporal detection module encourages precise salient region detection. Exhaustive experiments show that the proposed STA-Net is effective for the video salient object detection task and achieves compelling performance in comparison with state-of-the-art methods.
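The abstract describes the MSSA stage only at a high level. As an illustration of the general idea it names (attending over high-level features at several pyramid scales with an enlarged receptive field), the following is a minimal PyTorch sketch of a multi-scale spatial attention block. It is an assumption-based example, not the authors' implementation: the class name MultiScaleSpatialAttention, the channel width, the pyramid scales, and the fusion layer are all invented for this sketch.

```python
# Minimal sketch (NOT the authors' code): multi-scale spatial attention in the
# spirit of the MSSA module described in the abstract. All layer names, channel
# sizes, and pyramid scales are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleSpatialAttention(nn.Module):
    """Attend over high-level features at several pyramid scales, then fuse."""
    def __init__(self, channels: int = 256, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        # One 1x1 conv per scale predicts a single-channel spatial attention map.
        self.att_convs = nn.ModuleList(
            nn.Conv2d(channels, 1, kernel_size=1) for _ in scales
        )
        self.fuse = nn.Conv2d(channels * len(scales), channels, kernel_size=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        h, w = feat.shape[-2:]
        outs = []
        for scale, conv in zip(self.scales, self.att_convs):
            # Downsample to enlarge the effective receptive field, re-weight
            # with the attention map, then upsample back to full resolution.
            pooled = F.adaptive_avg_pool2d(feat, (max(h // scale, 1), max(w // scale, 1)))
            att = torch.sigmoid(conv(pooled))  # spatial attention map in [0, 1]
            outs.append(F.interpolate(pooled * att, size=(h, w),
                                      mode="bilinear", align_corners=False))
        return self.fuse(torch.cat(outs, dim=1))  # fused multi-scale response

# Usage: x = torch.randn(2, 256, 32, 32); y = MultiScaleSpatialAttention()(x)
```

The design choice here mirrors the abstract's motivation: computing attention on downsampled slices is cheaper on non-salient regions while the coarser scales enlarge the receptive field before fusion.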
Pages: 3450-3459
Number of pages: 10
Related papers (50 in total)
  • [1] Bi, Hong-Bo; Lu, Di; Zhu, Hui-Hui; Yang, Li-Na; Guan, Hua-Ping. STA-Net: spatial-temporal attention network for video salient object detection. Applied Intelligence, 2021, 51: 3450-3459.
  • [2] Su, Yuting; Wang, Weikang; Liu, Jing; Jing, Peiguang. Collaborative spatial-temporal video salient object detection with cross attention transformer. Signal Processing, 2024, 224.
  • [3] He, Bin; Yu, Ningmei; Wang, Zhiyong; Chen, Xudong. STA-Net: A Spatial-Temporal Joint Attention Network for Driver Maneuver Recognition, Based on In-Cabin and Driving Scene Monitoring. Applied Sciences-Basel, 2024, 14(6).
  • [4] Liu, Nian; Nan, Kepan; Zhao, Wangbo; Yao, Xiwen; Han, Junwei. Learning Complementary Spatial-Temporal Transformer for Video Salient Object Detection. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(8): 10663-10673.
  • [5] Yang, Guoan; Yang, Yong; Lu, Zhengzhi; Yang, Junjie; Liu, Deyang; Zhou, Chuanbo; Fan, Zien. STA-TSN: Spatial-Temporal Attention Temporal Segment Network for action recognition in video. PLOS ONE, 2022, 17(3).
  • [6] Chen, Zhu; Li, Weihai; Fei, Chi; Liu, Bin; Yu, Nenghai. Spatial-Temporal Feature Aggregation Network for Video Object Detection. 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2020: 1858-1862.
  • [7] Chen, Haoyang; Mei, Xue; Ma, Zhiyuan; Wu, Xinhong; Wei, Yachuan. Spatial-temporal graph attention network for video anomaly detection. Image and Vision Computing, 2023, 131.
  • [8] Huang, Lili; Yan, Pengxiang; Li, Guanbin; Wang, Qing; Lin, Liang. Attention Embedded Spatio-Temporal Network for Video Salient Object Detection. IEEE Access, 2019, 7: 166203-166213.
  • [9] Sigger, Neetu; Al-Jawed, Naseer; Nguyen, Tuan. Spatial-Temporal Autoencoder with Attention Network for Video Compression. Image Analysis and Processing, ICIAP 2022, Part III, 2022, 13233: 290-300.
  • [10] Zhou, Feng; Shuai, Hui; Liu, Qingshan; Guo, Guodong. Flow driven attention network for video salient object detection. IET Image Processing, 2020, 14(6): 997-1004.