Real-Time Video Saliency Prediction Via 3D Residual Convolutional Neural Network

Cited by: 4
Authors
Sun, Zhenhao [1, 2]
Wang, Xu [1, 2]
Zhang, Qiudan [3]
Jiang, Jianmin [1, 2]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
[2] Shenzhen Univ, Guangdong Lab Artificial Intelligence & Digital E, Shenzhen 518060, Peoples R China
[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Video saliency prediction; eye fixation dataset; 3D residual convolutional neural network; DETECTION MODEL;
DOI
10.1109/ACCESS.2019.2946479
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Attention is a fundamental attribute of the human visual system and plays an important role in many visual perception tasks. The key issue in video saliency prediction is how to exploit temporal information efficiently. Instead of treating temporal saliency maps separately, we propose a real-time, end-to-end video saliency prediction model based on a 3D residual convolutional neural network (3D-ResNet), which incorporates the prediction of spatial and temporal saliency maps into a single process. In particular, a multi-scale feature representation scheme is employed to further boost model performance. In addition, a frame skipping strategy is proposed to speed up saliency map inference. Moreover, a new, challenging eye tracking database with 220 video clips is established to facilitate research on video saliency prediction. Extensive experimental results show that our model outperforms state-of-the-art methods on the eye fixation datasets in terms of both prediction accuracy and inference speed.
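
The abstract names a frame skipping strategy but does not describe it. The sketch below shows one common form of the idea, given only as an illustration: the predictor runs on every `stride`-th frame and the most recent saliency map is reused for the frames in between. The function name `predict_with_frame_skipping`, the `stride` parameter, and the reuse-the-last-map policy are all assumptions and may differ from what the paper actually does.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")
SaliencyMap = TypeVar("SaliencyMap")


def predict_with_frame_skipping(
    frames: Sequence[Frame],
    predict: Callable[[Frame], SaliencyMap],
    stride: int = 2,
) -> List[SaliencyMap]:
    """Return one saliency map per frame while calling `predict` sparsely.

    Hypothetical policy (not taken from the paper): run the predictor on
    frames 0, stride, 2*stride, ... and reuse the latest map otherwise.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")

    outputs: List[SaliencyMap] = []
    last_map = None
    for index, frame in enumerate(frames):
        if index % stride == 0:
            # Key frame: pay the full inference cost.
            last_map = predict(frame)
        # Skipped frames inherit the map of the preceding key frame.
        outputs.append(last_map)
    return outputs
```

With `stride=2`, the predictor is called on roughly half of the frames, so the inference cost drops accordingly, at the price of stale maps on the skipped frames.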
Pages: 147743-147754
Page count: 12