Real-Time Video Saliency Prediction Via 3D Residual Convolutional Neural Network

Cited by: 4
Authors
Sun, Zhenhao [1, 2]
Wang, Xu [1, 2]
Zhang, Qiudan [3]
Jiang, Jianmin [1, 2]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
[2] Shenzhen Univ, Guangdong Lab Artificial Intelligence & Digital E, Shenzhen 518060, Peoples R China
[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Video saliency prediction; eye fixation dataset; 3D residual convolutional neural network; DETECTION MODEL;
DOI
10.1109/ACCESS.2019.2946479
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Attention is a fundamental attribute of the human visual system and plays an important role in many visual perception tasks. The key issue in video saliency prediction is how to exploit temporal information efficiently. Instead of extracting temporal saliency maps separately, we propose a real-time, end-to-end video saliency prediction model based on a 3D residual convolutional neural network (3D-ResNet), which incorporates the prediction of spatial and temporal saliency maps into a single process. In particular, a multi-scale feature representation scheme is employed to further boost model performance. In addition, a frame skipping strategy is proposed to speed up saliency map inference. Moreover, a new, challenging eye-tracking database with 220 video clips is established to facilitate research on video saliency prediction. Extensive experimental results show that our model outperforms state-of-the-art methods on the eye fixation datasets in terms of both prediction accuracy and inference speed.
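The abstract names a frame skipping strategy but does not say how it works. As a rough illustration only, the Python sketch below assumes the simplest possible rule: run the predictor on every `skip`-th frame and reuse the most recent saliency map for the frames in between. The function name `predict_with_frame_skipping`, the `skip` parameter, and the reuse rule are all hypothetical and are not taken from the paper, whose model operates on clips with a 3D network.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")
Saliency = TypeVar("Saliency")


def predict_with_frame_skipping(
    frames: Sequence[Frame],
    predict: Callable[[Frame], Saliency],
    skip: int = 2,
) -> List[Saliency]:
    """Return one saliency map per frame, running `predict` on every `skip`-th frame.

    Hypothetical helper: reusing the latest map for skipped frames is an
    assumption made for illustration, not the strategy described in the paper.
    """
    if skip < 1:
        raise ValueError("skip must be >= 1")
    maps: List[Saliency] = []
    last = None
    for index, frame in enumerate(frames):
        if index % skip == 0:
            last = predict(frame)  # full inference on a key frame
        maps.append(last)  # skipped frames reuse the latest map
    return maps
```

With `skip=2`, inference runs on roughly half of the frames, so the predictor cost drops by about half at the price of stale maps on the skipped frames.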
Pages: 147743-147754
Page count: 12
Related papers
50 in total
  • [1] VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition
    Maturana, Daniel
    Scherer, Sebastian
    2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2015, : 922 - 928
  • [2] A 3D Convolutional Neural Network Towards Real-time Amodal 3D Object Detection
    Sun, Hao
    Meng, Zehui
    Du, Xinxin
    Ang, Marcelo H., Jr.
    2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2018, : 8331 - 8338
  • [3] PointNet: A 3D Convolutional Neural Network for Real-Time Object Class Recognition
    Garcia-Garcia, A.
    Gomez-Donoso, F.
    Garcia-Rodriguez, J.
    Orts-Escolano, S.
    Cazorla, M.
    Azorin-Lopez, J.
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 1578 - 1584
  • [4] A Real-time Multimodal Hand Gesture Recognition via 3D Convolutional Neural Network and Key Frame Extraction
    Nguyen Ngoc Hoang
    Lee, Guee-Sang
    Kim, Soo-Hyung
    Yang, Hyung-Jeong
    PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND MACHINE INTELLIGENCE (MLMI 2018), 2018, : 32 - 37
  • [5] Video Visual Relation Detection via 3D Convolutional Neural Network
    Qu, Mingcheng
    Cui, Jianxun
    Su, Tonghua
    Deng, Ganlin
    Shao, Wenkai
    IEEE ACCESS, 2022, 10 : 23748 - 23756
  • [6] Real-Time 3D Hand Pose Estimation with 3D Convolutional Neural Networks
    Ge, Liuhao
    Liang, Hui
    Yuan, Junsong
    Thalmann, Daniel
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2019, 41 (04) : 956 - 970
  • [7] Real-Time Video Object Recognition Using Convolutional Neural Network
    Ahn, Byungik
    2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015,
  • [8] Real-time object segmentation based on convolutional neural network with saliency optimization for picking
    Chen Jinbo
    Wang Zhiheng
    Li Hengyu
    JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, 2018, 29 (06) : 1300 - 1307
  • [9] Real-time object segmentation based on convolutional neural network with saliency optimization for picking
    CHEN Jinbo
    WANG Zhiheng
    LI Hengyu
    Journal of Systems Engineering and Electronics, 2018, 29 (06) : 1300 - 1307
  • [10] Real-Time Gesture Recognition Using 3D Sensory Data and a Light Convolutional Neural Network
    Diliberti, Nicholas
    Peng, Chao
    Kauffman, Christopher
    Dong, Yangzi
    Hansberger, Jeffrey T.
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 401 - 410