STI-Net: Spatiotemporal integration network for video saliency detection

Cited by: 14
Authors
Zhou, Xiaofei [1]
Cao, Weipeng [2]
Gao, Hanxiao [1]
Ming, Zhong [2]
Zhang, Jiyong [1]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Automat, Hangzhou 310018, Peoples R China
[2] Guangdong Lab Artificial Intelligence & Digital Ec, Shenzhen 518107, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Spatiotemporal saliency; Feature aggregation; Saliency prediction; Saliency fusion; OBJECT DETECTION; FUSION; SEGMENTATION; ATTENTION; FEATURES;
DOI
10.1016/j.ins.2023.01.106
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Image saliency detection, to which much effort has been devoted in recent years, has advanced significantly. In contrast, the community has paid little attention to video saliency detection. In particular, existing video saliency models are very likely to fail on videos with difficult scenarios such as fast motion, dynamic backgrounds, and nonrigid deformation. Furthermore, performing video saliency detection directly with image saliency models that ignore temporal information is inappropriate. To alleviate these issues, this study proposes a novel end-to-end spatiotemporal integration network (STI-Net) for detecting salient objects in videos. Specifically, our method consists of three key steps: feature aggregation, saliency prediction, and saliency fusion, which are applied sequentially to generate spatiotemporal deep feature maps, coarse saliency predictions, and the final saliency map, respectively. The key advantage of our model lies in the comprehensive exploration of spatial and temporal information across the entire network: the two kinds of features interact with each other in the feature aggregation step, are used to construct boundary cues in the saliency prediction step, and serve as the original information in the saliency fusion step. As a result, the generated spatiotemporal deep feature maps can precisely and completely characterize the salient objects, and the coarse saliency predictions have well-defined boundaries, effectively improving the quality of the final saliency map. Furthermore, "shortcut connections" are introduced into our model to make the proposed network easy to train and to obtain accurate results when the network is deep. Extensive experimental results on two publicly available challenging video datasets demonstrate the effectiveness of the proposed model, which achieves performance comparable to state-of-the-art saliency models.
Pages: 134-147
Page count: 14
Related papers
50 in total
  • [21] STEG-Net: Spatiotemporal Edge Guidance Network for Video Salient Object Detection
    Bi, Hongbo
    Yang, Lina
    Zhu, Huihui
    Lu, Di
    Jiang, Jianguo
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 14 (03) : 902 - 915
  • [22] Superpixel-based video saliency detection via the fusion of spatiotemporal saliency and temporal coherency
    Li, Yandi
    Xu, Xiping
    Zhang, Ning
    Du, Enyu
    OPTICAL ENGINEERING, 2019, 58 (08)
  • [23] GRAPH-THEORETIC SPATIOTEMPORAL CONTEXT MODELING FOR VIDEO SALIENCY DETECTION
    Wei, Lina
    Wang, Fangfang
    Li, Xi
    Wu, Fei
    Xiao, Jun
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 4197 - 4201
  • [24] Unsupervised Uncertainty Estimation Using Spatiotemporal Cues in Video Saliency Detection
    Alshawi, Tariq
    Long, Zhiling
    AlRegib, Ghassan
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (06) : 2818 - 2827
  • [25] Video Saliency Detection Using Multi-level Spatiotemporal Orientation
    Liu, Zhao
    Wang, Zhenyang
    Song, Xinhui
    Chen, Chun
    2015 10TH INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATIONS AND SIGNAL PROCESSING (ICICS), 2015,
  • [26] A spatiotemporal weighted dissimilarity-based method for video saliency detection
    Duan, Lijuan
    Xi, Tao
    Cui, Song
    Qi, Honggang
    Bovik, Alan C.
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2015, 38 : 45 - 56
  • [27] Spatiotemporal Saliency Detection for Video Sequences Based on Random Walk With Restart
    Kim, Hansang
    Kim, Youngbae
    Sim, Jae-Young
    Kim, Chang-Su
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2015, 24 (08) : 2552 - 2564
  • [28] Improving Video Saliency Detection via Localized Estimation and Spatiotemporal Refinement
    Zhou, Xiaofei
    Liu, Zhi
    Gong, Chen
    Liu, Wei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (11) : 2993 - 3007
  • [29] A Spatiotemporal Saliency Model for Video Surveillance
    Tong Yubing
    Cheikh, Faouzi Alaya
    Guraya, Fahad Fazal Elahi
    Konik, Hubert
    Tremeau, Alain
    COGNITIVE COMPUTATION, 2011, 3 (01) : 241 - 263
  • [30] A Spatiotemporal Saliency Model for Video Surveillance
    Tong Yubing
    Faouzi Alaya Cheikh
    Fahad Fazal Elahi Guraya
    Hubert Konik
    Alain Trémeau
    Cognitive Computation, 2011, 3 : 241 - 263