One-for-All: An Efficient Variable Convolution Neural Network for In-Loop Filter of VVC

Cited by: 26
Authors
Huang, Zhijie [1]
Sun, Jun [1]
Guo, Xiaopeng [1]
Shang, Mingyu [1]
Affiliation
[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing 100871, Peoples R China
Keywords
Encoding; Videos; Feature extraction; Convolution; Adaptation models; Visualization; Training; Variable; in-loop filter; attention; versatile video coding (VVC); CNN
DOI
10.1109/TCSVT.2021.3089498
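Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809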
Abstract
Recently, many convolutional neural network (CNN) based in-loop filters have been proposed to improve coding efficiency. However, most existing CNN-based filters train and deploy multiple networks for different quantization parameters (QPs) and frame types (FTs), which drastically increases the resources needed to train these models and the memory burden on the video codec. In this paper, we propose a novel variable CNN (VCNN) based in-loop filter for VVC, which can effectively handle compressed videos with different QPs and FTs via a single model. Specifically, an efficient and flexible attention module is developed to recalibrate features according to QPs or FTs. We then embed the module into the residual block so that these informative features can be continuously utilized in the residual learning process. To minimize information loss in the learning process of the entire network, we utilize a residual feature aggregation (RFA) module for more efficient feature extraction. Based on it, an efficient network architecture, VCNN, is designed that not only effectively reduces compression artifacts but also adapts to various QPs and FTs. To address the training data imbalance across QPs and FTs and improve the robustness of the model, a focal mean square error loss function is employed to train the proposed network. We then integrate the VCNN into VVC as an additional in-loop filtering tool after the deblocking filter. Extensive experimental results show that our VCNN approach achieves average gains of 3.63%, 4.36%, 4.23%, and 3.56% under all intra, low-delay P, low-delay, and random access configurations, respectively, which is even better than QP-separate models.
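As a rough illustration of what recalibrating features according to QP could look like, the following minimal Python sketch maps a QP value to one gate per feature channel and scales each channel by its gate. It is not the VCNN attention module from the paper: the function names, the linear-plus-sigmoid gating, and the example weights are all assumptions made for illustration; only the QP range of 0 to 63 comes from VVC.

```python
import math


def sigmoid(x):
    """Logistic function, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def qp_channel_gates(qp, weights, biases, qp_max=63):
    """Map a scalar QP to one gate in (0, 1) per feature channel.

    Hypothetical gating: normalize the QP to [0, 1], then apply a
    per-channel linear map followed by a sigmoid. In a trained model
    the weights and biases would be learned parameters.
    """
    q = qp / qp_max
    return [sigmoid(w * q + b) for w, b in zip(weights, biases)]


def recalibrate(features, gates):
    """Scale every value of each channel by that channel's gate."""
    return [[g * v for v in channel] for channel, g in zip(features, gates)]


# Two toy feature channels with two values each (assumed data).
features = [[1.0, 2.0], [3.0, 4.0]]
# Assumed parameters: channel 0 is emphasized as QP grows, channel 1 is suppressed.
weights = [2.0, -2.0]
biases = [0.0, 0.0]

low_qp = recalibrate(features, qp_channel_gates(22, weights, biases))
high_qp = recalibrate(features, qp_channel_gates(37, weights, biases))
```

The same set of parameters serves every QP, so a single model can respond differently to lightly and heavily compressed inputs, which is the general idea the abstract describes.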
Pages: 2342-2355
Number of pages: 14
Related papers
50 in total
  • [21] PerFedRLNAS: One-for-All Personalized Federated Neural Architecture Search
    Yao, Dixi
    Li, Baochun
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 15, 2024, : 16398 - 16406
  • [22] Magnitude and Similarity Based Variable Rate Filter Pruning for Efficient Convolution Neural Networks
    Ghimire, Deepak
    Kim, Seong-Heum
    APPLIED SCIENCES-BASEL, 2023, 13 (01)
  • [23] A Reconfigurable Framework for Neural Network Based Video In-Loop Filtering
    Zhang, Yichi
    Ding, Dandan
    Ma, Zhan
    Li, Zhu
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (06)
  • [24] Multi-stage Locally and Long-range Correlated Feature Fusion for Learned In-loop Filter in VVC
    Kathariya, Birendra
    Li, Zhu
    Wang, Hongtao
    Van der Auwera, Geert
    2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2022
  • [25] Multi-Stage Spatial and Frequency Feature Fusion using Transformer in CNN-Based In-Loop Filter for VVC
    Kathariya, Birendra
    Li, Zhu
    Wang, Hongtao
    Coban, Mohammad
    2022 PICTURE CODING SYMPOSIUM (PCS), 2022, : 373 - 377
  • [26] Unsupervised pre-trained filter learning approach for efficient convolution neural network
    Rehman, Sadaqat Ur
    Tu, Shanshan
    Waqas, Muhammad
    Huang, Yongfeng
    Rehman, Obaid Ur
    Ahmad, Basharat
    Ahmad, Salman
    NEUROCOMPUTING, 2019, 365 : 171 - 190
  • [27] Optimize neural network based in-loop filters through iterative training
    Wang, Liqiang
    Xu, Xiaozhong
    Liu, Shan
    2022 PICTURE CODING SYMPOSIUM (PCS), 2022, : 367 - 371
  • [28] VarKFaceNet: An Efficient Variable Depthwise Convolution Kernels Neural Network for Lightweight Face Recognition
    Ma, Qinghua
    Zhang, Peng
    Cui, Min
    IEEE ACCESS, 2024, 12 : 117472 - 117482
  • [29] Improved method of deblocking filter based on convolutional neural network in VVC
    Yang, Jing
    Du, Biao
    Tang, Tong
    2020 IEEE/CIC INTERNATIONAL CONFERENCE ON COMMUNICATIONS IN CHINA (ICCC), 2020, : 764 - 769
  • [30] Spatial-Temporal Residue Network Based In-Loop Filter for Video Coding
    Jia, Chuanmin
    Wang, Shiqi
    Zhang, Xinfeng
    Wang, Shanshe
    Ma, Siwei
    2017 IEEE VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2017