Deep Learning-Based Video Coding: A Review and a Case Study

Cited by: 91
Authors
Liu, Dong [1]
Li, Yue [1]
Lin, Jianping [1]
Li, Houqiang [1]
Wu, Feng [1]
Affiliations
[1] Univ Sci & Technol China, CAS Key Lab Technol Geospatial Informat Proc & Ap, 443 Huangshan Rd, Hefei 230027, Anhui, Peoples R China
Keywords
Deep learning; image coding; prediction; transform; video coding; IMAGE COMPRESSION; NEURAL-NETWORK; FRAMEWORK;
DOI
10.1145/3368405
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
The past decade has witnessed the great success of deep learning in many disciplines, especially in computer vision and image processing. However, deep learning-based video coding remains in its infancy. We review the representative works about using deep learning for image/video coding, an actively developing research area since 2015. We divide the related works into two categories: new coding schemes that are built primarily upon deep networks, and deep network-based coding tools that shall be used within traditional coding schemes. For deep schemes, pixel probability modeling and auto-encoder are the two approaches, which can be viewed as predictive coding and transform coding, respectively. For deep tools, there have been several techniques using deep learning to perform intra-picture prediction, inter-picture prediction, cross-channel prediction, probability distribution prediction, transform, post- or in-loop filtering, down- and up-sampling, as well as encoding optimizations. In the hope of advocating the research of deep learning-based video coding, we present a case study of our developed prototype video codec, Deep Learning Video Coding (DLVC). DLVC features two deep tools that are both based on convolutional neural network (CNN), namely CNN-based in-loop filter and CNN-based block adaptive resolution coding. The source code of DLVC has been released for future research.
Pages: 35