Motion-Aware Memory Network for Fast Video Salient Object Detection

Cited by: 3
Authors
Zhao, Xing [1]
Liang, Haoran [1]
Li, Peipei [2]
Sun, Guodao [1]
Zhao, Dongdong [1]
Liang, Ronghua [1]
He, Xiaofei [1]
Affiliations
[1] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ Technol, Coll Mech Engn, Hangzhou 310023, Peoples R China
Keywords
Video salient object detection; salient object detection; memory network; feature fusion; optimization
DOI
10.1109/TIP.2023.3348659
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Previous methods based on 3DCNN, convLSTM, or optical flow have achieved great success in video salient object detection (VSOD). However, these methods still suffer from high computational costs or poor-quality saliency maps. To address this, we design a space-time memory (STM)-based network that employs a standard encoder-decoder architecture. During the encoding stage, we extract high-level temporal features from the current frame and its adjacent frames, which is more efficient and practical than relying on optical flow. During the decoding stage, we introduce an effective fusion strategy for both the spatial and temporal branches. The semantic information of the high-level features is used to improve the object details in the low-level features. Spatiotemporal features are then derived step by step to reconstruct the saliency maps. Moreover, inspired by the boundary supervision prevalent in image salient object detection (ISOD), we design a motion-aware loss that predicts object boundary motion, and we perform multitask learning for VSOD and object motion prediction simultaneously. This further enhances the model's ability to extract spatiotemporal features accurately while maintaining object integrity. Extensive experiments on several datasets demonstrate the effectiveness of our method, which achieves state-of-the-art metrics on some of them. Our proposed model requires neither optical flow nor additional preprocessing, and reaches an inference speed of nearly 100 FPS.
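The abstract names a space-time memory (STM) design but does not spell out the read operation. As a rough illustration only, the sketch below shows a generic attention-style memory read of the kind commonly associated with STM networks: each location in the current frame compares its key against keys stored from adjacent frames and retrieves a softmax-weighted sum of their values. It is not the authors' implementation; the function names (`softmax`, `memory_read`), the dot-product similarity, and the flat list-of-vectors layout are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_read(query_keys, memory_keys, memory_values):
    """Generic attention-style memory read (hypothetical sketch).

    query_keys:    one key vector per location in the current frame
    memory_keys:   one key vector per location in the adjacent (memory) frames
    memory_values: one value vector per memory location
    Returns one retrieved value vector per query location.
    """
    value_dim = len(memory_values[0])
    retrieved = []
    for q in query_keys:
        # Dot-product similarity between this query and every memory key.
        scores = [sum(a * b for a, b in zip(q, k)) for k in memory_keys]
        weights = softmax(scores)
        # Weighted sum of memory values, one output channel at a time.
        retrieved.append([
            sum(w * v[d] for w, v in zip(weights, memory_values))
            for d in range(value_dim)
        ])
    return retrieved


# Two memory locations with distinct keys and values.
mem_keys = [[10.0, 0.0], [0.0, 10.0]]
mem_values = [[1.0, 0.0], [0.0, 1.0]]

# The first query matches the first memory key; the second matches neither.
result = memory_read([[10.0, 0.0], [0.0, 0.0]], mem_keys, mem_values)
```

In this toy setup the first query retrieves almost exactly the first memory value, while the all-zero query scores both memory entries equally and retrieves their average. In a real network the keys and values would come from the encoder's feature maps rather than hand-written lists.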
Pages: 709-721
Number of pages: 13