Learning Generalized Spatial-Temporal Deep Feature Representation for No-Reference Video Quality Assessment

Cited by: 35
Authors
Chen, Baoliang [1]
Zhu, Lingyu [1]
Li, Guo [2]
Lu, Fangbo [2]
Fan, Hongfei [2]
Wang, Shiqi [1]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[2] Kingsoft Cloud, Beijing 100000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Quality assessment; Training; Video recording; Image quality; Streaming media; Nonlinear distortion; Video quality assessment; generalization capability; deep neural networks; temporal aggregation; IMAGE; STATISTICS; DATABASE
DOI
10.1109/TCSVT.2021.3088505
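Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract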
In this work, we propose a no-reference video quality assessment method that aims to achieve high generalization capability in cross-content, cross-resolution and cross-frame-rate quality prediction. In particular, we evaluate the quality of a video by learning effective feature representations in the spatial-temporal domain. In the spatial domain, to tackle resolution and content variations, we impose Gaussian distribution constraints on the quality features. The unified distribution can significantly reduce the domain gap between different video samples, resulting in a more generalized quality feature representation. Along the temporal dimension, inspired by the mechanism of visual perception, we propose a pyramid temporal aggregation module that involves short-term and long-term memory to aggregate the frame-level quality. Experiments show that our method outperforms state-of-the-art methods in cross-dataset settings and achieves comparable performance in intra-dataset configurations, demonstrating the high generalization capability of the proposed method. The code is released at https://github.com/Baoliang93/GSTVQA
Pages: 1903-1916
Number of pages: 14
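
The two ideas named in the abstract, a Gaussian constraint on quality features and a pyramid-style temporal aggregation of frame-level quality, can be illustrated with a minimal sketch. This is not the authors' implementation (see the released repository for that). It assumes, for illustration only, that the Gaussian constraint is approximated by a moment-matching penalty toward zero mean and unit variance, and that the pyramid aggregation is approximated by averaging frame scores over non-overlapping windows of several lengths. The function names `gaussian_moment_penalty` and `pyramid_temporal_pool`, and the window sizes, are hypothetical.

```python
from statistics import fmean


def gaussian_moment_penalty(features):
    """Illustrative penalty (assumption, not the paper's loss).

    `features` is a list of equal-length feature vectors, one per sample.
    The penalty is zero when every feature dimension has mean 0 and
    (population) variance 1 across the samples.
    """
    dims = len(features[0])
    penalty = 0.0
    for d in range(dims):
        column = [row[d] for row in features]
        mean = fmean(column)
        var = fmean([(x - mean) ** 2 for x in column])
        penalty += mean ** 2 + (var - 1.0) ** 2
    return penalty / dims


def pyramid_temporal_pool(frame_scores, window_sizes=(1, 2, 4)):
    """Illustrative multi-scale pooling (assumption, not the paper's module).

    Short windows stand in for short-term memory and long windows for
    long-term memory. Scores are averaged inside non-overlapping windows
    at each scale, and the per-scale results are then averaged.
    """
    per_scale = []
    for w in window_sizes:
        windows = [frame_scores[i:i + w] for i in range(0, len(frame_scores), w)]
        window_means = [fmean(win) for win in windows]
        per_scale.append(fmean(window_means))
    return fmean(per_scale)


if __name__ == "__main__":
    feats = [[1.0, 0.5], [-1.0, -0.5]]
    print("moment penalty:", gaussian_moment_penalty(feats))
    print("video score:", pyramid_temporal_pool([3.0, 3.5, 2.5, 4.0, 3.0]))
```

In a learned model both pieces would be differentiable layers trained end to end; the plain-Python version above only shows the shape of the computation.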