Spatio-temporal feature learning for enhancing video quality based on screen content characteristics

Cited by: 0
Authors
Huang, Ziyin [1]
Chan, Yui-Lam [1]
Tsang, Sik-Ho [1]
Kwong, Ngai-Wing [1]
Lam, Kin-Man [1]
Ling, Wing-Kuen [2]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Elect Engn, Hong Kong, Peoples R China
[2] Guangdong Univ Technol, Sch Informat Engn, Guangzhou, Guangdong, Peoples R China
Keywords
Screen content video; Quality enhancement; Deep learning; HEVC
DOI
10.1016/j.jvcir.2024.104270
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the rising demand for remote desktops and online meetings, screen content videos have drawn significant attention. Unlike natural videos, screen content videos often exhibit scene switches, where the content changes abruptly from one frame to the next. These scene switches result in obvious distortions in compressed videos. Frame freezing, where the content remains unchanged for a certain duration, is also very common in screen content videos. Existing alignment-based models struggle to enhance scene switch frames effectively and are inefficient in frame freezing situations. We therefore propose a novel alignment-free method that handles both scene switches and frame freezing. In our approach, we develop a spatial and temporal feature extraction module that compresses and extracts spatio-temporal information from three groups of frame inputs, which enables efficient handling of scene switches. In addition, an edge-aware block is proposed for extracting edge information, which guides the model to focus on restoring the high-frequency components in frame freezing situations. A fusion module is then designed to adaptively fuse the features from the three groups, taking the different positions of the video frames into account, to enhance frames in both scene switch and frame freezing scenarios. Experimental results show that the proposed edge-aware with spatio-temporal information fusion network (EAST) enhances the quality of compressed videos more effectively than current state-of-the-art methods.
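The two screen content behaviours named in the abstract, scene switches and frame freezing, can be illustrated with a simple frame-difference check. The sketch below is not the EAST network or any part of the authors' method; it only shows, under assumed thresholds, how consecutive frames might be labelled. The names `mean_abs_diff` and `label_transitions`, the nested-list frame representation, and the values of `freeze_thr` and `switch_thr` are all hypothetical.

```python
from typing import List, Sequence

# Hypothetical frame representation: a 2-D grid of luma samples.
Frame = Sequence[Sequence[float]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two equally sized frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def label_transitions(
    frames: Sequence[Frame],
    freeze_thr: float = 0.5,   # assumed threshold, not from the paper
    switch_thr: float = 40.0,  # assumed threshold, not from the paper
) -> List[str]:
    """Label each consecutive frame pair as 'freeze', 'switch', or 'normal'."""
    labels = []
    for prev, cur in zip(frames, frames[1:]):
        d = mean_abs_diff(prev, cur)
        if d <= freeze_thr:
            labels.append("freeze")   # content essentially unchanged
        elif d >= switch_thr:
            labels.append("switch")   # content changes abruptly
        else:
            labels.append("normal")
    return labels


if __name__ == "__main__":
    frames = [
        [[0, 0], [0, 0]],
        [[0, 0], [0, 0]],          # identical to the previous frame
        [[200, 200], [200, 200]],  # abrupt change
        [[201, 203], [200, 204]],  # small change
    ]
    print(label_transitions(frames))
```

A fixed threshold on pixel differences is only a stand-in here; the abstract describes a learned, alignment-free model rather than an explicit detector.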
Pages: 12