Deep video compression based on Long-range Temporal Context Learning

Authors
Wu, Kejun [1]
Li, Zhenxing [1]
Yang, You [1]
Liu, Qiong [1]
Affiliations
[1] Huazhong University of Science and Technology, School of Electronic Information and Communications, Wuhan 430074, People's Republic of China
Keywords
Deep learning; Video compression; Computational photography; Temporal context learning
DOI
10.1016/j.cviu.2024.104127
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video compression enables efficient storage and transmission of data, benefiting imaging and vision applications such as computational imaging, photography, and displays by delivering high-quality videos. To exploit more informative video contexts, we propose DVCL, a novel Deep Video Compression framework based on Long-range Temporal Context Learning. Aiming at high coding performance, this new compression paradigm makes full use of long-range temporal correlations derived from multiple reference frames to learn richer contexts. Motion vectors (MVs) are estimated to represent the motion relations within a video. Using these MVs, a long-range temporal context learning (LTCL) module extracts context information from multiple reference frames, so that more accurate and informative temporal contexts can be learned and constructed. The long-range temporal contexts serve as conditions for the contextual encoder and decoder, which generate the predicted frames. To address the challenge of imbalanced training, we develop a multi-stage training strategy that ensures the whole DVCL framework is trained progressively and stably. Extensive experiments demonstrate that the proposed DVCL achieves the highest objective and subjective quality while maintaining relatively low complexity. Specifically, DVCL achieves average bitrate savings of 25.30% and 45.75% over the x265 codec at the same PSNR and MS-SSIM, respectively.
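The idea of building a temporal context from several motion-compensated reference frames can be illustrated with a toy sketch. This is not the paper's LTCL module, which is learned: the sketch assumes one global integer motion vector per reference frame and replaces the learned fusion with a plain average. The names `warp` and `long_range_context` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Frame = List[List[float]]  # a single-channel frame as rows of pixel values


def warp(ref: Frame, mv: Tuple[int, int]) -> Frame:
    """Motion-compensate `ref` with one global integer motion vector (dy, dx).

    Output pixel (y, x) is read from ref[y + dy][x + dx]; coordinates that
    fall outside the frame are clamped to the nearest border pixel.
    ASSUMPTION: a real codec uses dense, sub-pixel motion, not one global MV.
    """
    h, w = len(ref), len(ref[0])
    dy, dx = mv
    return [
        [ref[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
         for x in range(w)]
        for y in range(h)
    ]


def long_range_context(refs: List[Frame], mvs: List[Tuple[int, int]]) -> Frame:
    """Fuse several motion-compensated reference frames into one context.

    ASSUMPTION: fusion is a plain per-pixel average, standing in for the
    learned fusion described in the abstract.
    """
    if not refs or len(refs) != len(mvs):
        raise ValueError("need one motion vector per reference frame")
    warped = [warp(r, mv) for r, mv in zip(refs, mvs)]
    h, w, n = len(warped[0]), len(warped[0][0]), len(warped)
    return [[sum(f[y][x] for f in warped) / n for x in range(w)]
            for y in range(h)]


# Two hypothetical 2x2 reference frames (t-1 and t-2) with their motion vectors.
frame_t1 = [[0.0, 1.0], [2.0, 3.0]]
frame_t2 = [[4.0, 5.0], [6.0, 7.0]]
context = long_range_context([frame_t1, frame_t2], [(0, 0), (0, 1)])
```

In a learned codec the averaged context would instead be a feature tensor passed as the condition to the contextual encoder and decoder; the sketch only shows why aligning each reference frame before fusing it matters.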
Pages: 11