Motion-Aware Memory Network for Fast Video Salient Object Detection

Cited by: 3
Authors
Zhao, Xing [1]
Liang, Haoran [1]
Li, Peipei [2]
Sun, Guodao [1]
Zhao, Dongdong [1]
Liang, Ronghua [1]
He, Xiaofei [1]
Affiliations
[1] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ Technol, Coll Mech Engn, Hangzhou 310023, Peoples R China
Keywords
Video salient object detection; salient object detection; memory network; feature fusion; optimization
DOI
10.1109/TIP.2023.3348659
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Previous methods based on 3DCNN, convLSTM, or optical flow have achieved great success in video salient object detection (VSOD). However, these methods still suffer from high computational costs or poor quality of the generated saliency maps. To address this, we design a space-time memory (STM)-based network that employs a standard encoder-decoder architecture. During the encoding stage, we extract high-level temporal features from the current frame and its adjacent frames, which is more efficient and practical than methods reliant on optical flow. During the decoding stage, we introduce an effective fusion strategy for both spatial and temporal branches. The semantic information of the high-level features is used to improve the object details in the low-level features. Subsequently, spatiotemporal features are methodically derived step by step to reconstruct the saliency maps. Moreover, inspired by the boundary supervision prevalent in image salient object detection (ISOD), we design a motion-aware loss that predicts object boundary motion, and simultaneously perform multitask learning for VSOD and object motion prediction. This can further enhance the model's capability to accurately extract spatiotemporal features while maintaining object integrity. Extensive experiments on several datasets demonstrate the effectiveness of our method, which achieves state-of-the-art metrics on some of them. Our proposed model does not require optical flow or additional preprocessing and reaches an inference speed of nearly 100 FPS.
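The abstract names a space-time memory (STM) design but gives no architectural details. As a rough illustration only, the sketch below shows the generic memory-read idea commonly associated with STM-style networks: each location in the current frame compares its key against keys stored from adjacent frames and reads back a softmax-weighted mix of the stored values. The function names (`softmax`, `memory_read`), the dot-product similarity, and the toy feature sizes are all assumptions made for this example; this is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_read(query_keys, memory_keys, memory_values):
    """Hypothetical STM-style read (an assumption, not the paper's code).

    query_keys:    one key vector per location in the current frame
    memory_keys:   one key vector per location in the adjacent (memory) frames
    memory_values: one value vector per memory location
    Returns one read-out vector per query location.
    """
    value_dim = len(memory_values[0])
    readouts = []
    for q in query_keys:
        # Dot-product similarity between this query and every memory key.
        scores = [sum(a * b for a, b in zip(q, k)) for k in memory_keys]
        weights = softmax(scores)
        # Attention-weighted sum of the memory values.
        readouts.append(
            [sum(w * v[d] for w, v in zip(weights, memory_values))
             for d in range(value_dim)]
        )
    return readouts


# Toy example: two memory locations, one query location.
mem_keys = [[10.0, 0.0], [0.0, 10.0]]
mem_vals = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(memory_read([[10.0, 0.0]], mem_keys, mem_vals))
```

In the toy example the query matches the first memory key, so the read-out is dominated by the first stored value. A real model would run this over dense feature maps with batched tensor operations rather than Python lists.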
Pages: 709-721
Page count: 13
Related papers
50 in total
  • [41] SiamMAST: Siamese motion-aware spatio-temporal network for video action recognition
    Lu, Xuemin
    Quan, Wei
    Marek, Reformat
    Zhao, Haiquan
    Chen, Jim X.
    VISUAL COMPUTER, 2024, 40 (05) : 3163 - 3181
  • [42] SiamMAST: Siamese motion-aware spatio-temporal network for video action recognition
    Xuemin Lu
    Wei Quan
    Reformat Marek
    Haiquan Zhao
    Jim X. Chen
    The Visual Computer, 2024, 40 : 3163 - 3181
  • [43] Cross Complementary Fusion Network for Video Salient Object Detection
    Wang, Ziyang
    Li, Junxia
    Pan, Zefeng
    IEEE ACCESS, 2020, 8 : 201259 - 201270
  • [44] Flow driven attention network for video salient object detection
    Zhou, Feng
    Shuai, Hui
    Liu, Qingshan
    Guo, Guodong
    IET IMAGE PROCESSING, 2020, 14 (06) : 997 - 1004
  • [45] PSNet: Parallel Symmetric Network for Video Salient Object Detection
    Cong, Runmin
    Song, Weiyu
    Lei, Jianjun
    Yue, Guanghui
    Zhao, Yao
    Kwong, Sam
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (02) : 402 - 414
  • [46] IENet: inheritance enhancement network for video salient object detection
    Jiang, Tao
    Wang, Yi
    Hou, Feng
    Wang, Ruili
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (28) : 72007 - 72026
  • [47] Motion-aware future frame prediction for video anomaly detection based on saliency perception
    Haitao Xu
    Weibin Liu
    Weiwei Xing
    Xiang Wei
    Signal, Image and Video Processing, 2022, 16 : 2121 - 2129
  • [48] VIDEO FRAME INTERPOLATION VIA EXCEPTIONAL MOTION-AWARE SYNTHESIS
    Park, Minho
    Lee, Sangmin
    Ro, Yong Man
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020 : 1958 - 1962
  • [49] Towards Motion-Aware Light Field Video for Dynamic Scenes
    Tambe, Salil
    Veeraraghavan, Ashok
    Agrawal, Amit
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013 : 1009 - 1016
  • [50] Attentive Feedback Network for Boundary-Aware Salient Object Detection
    Feng, Mengyang
    Lu, Huchuan
    Ding, Errui
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019 : 1623 - 1632