Hybrid video coding scheme based on VVC and spatio-temporal attention convolution neural network

Cited by: 0

Authors:

He, Gang [1]

Xu, Kepeng [1]

Wu, Chang [1]

Ma, Zijia [1]

Wen, Xing [2]

Sun, Ming [2]

Affiliations:

[1] Xidian Univ, Xian, Peoples R China

[2] Kuaishou Technol, Beijing, Peoples R China

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022 | 2022

DOI:

10.1109/CVPRW56347.2022.00193

Chinese Library Classification (CLC) number:

TP301 [Theory, methods];

Discipline classification code:

081202;

Abstract:

In this paper, we propose a hybrid video coding framework. The framework is built on the VVC (Versatile Video Coding) standard and constructs an implicitly aligned multi-frame fusion model to enhance subjective video quality. The proposed framework optimizes video compression efficiency from two perspectives. The first is a sequence-level dynamic rate control algorithm, which assigns an appropriate bitrate to each video to obtain the highest overall video quality. The second is MAQE, a multi-frame implicit alignment video quality enhancement model, which performs motion alignment through multiple convolutional kernels of different sizes, uses a residual aggregation layer to fuse features of different frames, and then uses an enhanced attention module to adaptively deflate features based on spatio-temporal contextual features, so as to fuse the features of multiple frames more effectively and obtain higher-quality reconstructed frames. The proposed method is validated on the 0.1M and 1M bitrate tracks of the CLIC-2022 video compression task. Experimental results show that it achieves PSNR of 30.301 and 37.251 and MS-SSIM of 0.9368 and 0.9875 on the two tracks, respectively. This paper is a comprehensive presentation of the scheme used by the Night-Watch team in the CLIC-2022 video track.
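
Note: the abstract does not specify how the sequence-level rate control chooses a bitrate for each video. As a rough illustration only, the sketch below shows one common way to split a shared size budget across several videos: a greedy search over pre-encoded operating points that always takes the upgrade with the largest PSNR gain per extra byte. The function name `allocate`, the `(size_in_bytes, psnr_db)` data layout, the summed-PSNR objective, and all numbers are assumptions made for this example; they are not taken from the paper, and the greedy rule is a heuristic that is not guaranteed to be optimal.

```python
from typing import Dict, List, Tuple

# Hypothetical input: for each video, a list of (size_in_bytes, psnr_db)
# operating points sorted by increasing size, e.g. from trial encodes.
Points = List[Tuple[int, float]]


def allocate(candidates: Dict[str, Points], budget: int) -> Dict[str, int]:
    """Return, per video, the index of the chosen operating point.

    Greedy heuristic (an assumption, not the paper's algorithm): start
    every video at its cheapest point, then repeatedly apply the single
    upgrade with the best PSNR gain per extra byte that still fits.
    """
    choice = {name: 0 for name in candidates}
    used = sum(points[0][0] for points in candidates.values())
    if used > budget:
        raise ValueError("budget is below the cheapest operating points")

    while True:
        best_name, best_slope = None, 0.0
        for name, points in candidates.items():
            i = choice[name]
            if i + 1 >= len(points):
                continue  # already at the highest-quality point
            extra = points[i + 1][0] - points[i][0]
            gain = points[i + 1][1] - points[i][1]
            if extra <= 0 or used + extra > budget:
                continue  # malformed step or upgrade does not fit
            slope = gain / extra
            if slope > best_slope:
                best_name, best_slope = name, slope
        if best_name is None:
            return choice  # no affordable upgrade improves quality
        i = choice[best_name]
        used += candidates[best_name][i + 1][0] - candidates[best_name][i][0]
        choice[best_name] = i + 1


# Made-up example: two videos sharing a 400-byte budget.
example = {
    "a": [(100, 30.0), (200, 35.0), (300, 36.0)],
    "b": [(100, 32.0), (200, 34.0), (300, 34.5)],
}
chosen = allocate(example, budget=400)
```

In this made-up example the budget is spent on the first upgrade of each video, because those steps give the largest quality gain per byte; the later, flatter steps no longer fit.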

Pages: 1790-1793

Page count: 4
