STFE-VC: Spatio-temporal feature enhancement for learned video compression

Authors
Wang, Yiming [1]
Huang, Qian [1,3]
Tang, Bin [1]
Li, Xin [1]
Li, Xing [2]
Affiliations
[1] Hohai Univ, Coll Comp Sci & Software Engn, Nanjing, Jiangsu, Peoples R China
[2] Nanjing Forestry Univ, Coll Informat Sci & Technol, Nanjing, Peoples R China
[3] Changzhou Univ, Jiangsu Engn Res Ctr Digital Twinning Technol, Key Equipment Petrochem Proc, Changzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Spatio-temporal feature enhancement; Learned video compression; Spatio-temporal motion enhancement; In-loop filtering enhancement;
DOI
10.1016/j.eswa.2025.126682
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
With the rapid growth of video data, limited bandwidth and hardware resource constraints demand more efficient video compression. Current learned video compression methods have shown promising performance. However, these methods mainly rely on optical flow networks to perform temporal prediction, which may suffer from inaccurate motion estimation and introduce extra artifacts into reconstructed frames. In this paper, we propose a spatio-temporal feature enhancement method for learned video compression to better model inter-frame motion patterns and reduce compression artifacts. Specifically, we introduce a spatio-temporal motion enhancement module that further extracts the feature representation of the original motion vector to enhance its spatial and temporal components. Then, we introduce an in-loop filtering enhancement module that employs cascaded residual blocks to progressively enhance feature textures and provide higher-quality temporal-domain reference signals for subsequent reconstruction. More importantly, our proposed method can be integrated into the widely used residual coding and contextual coding schemes. Comprehensive experiments demonstrate that our integrated methods are superior to previous learned methods on the JCTVC, UVG and MCL-JCV benchmark datasets. In addition, our integrated methods also outperform the latest generalized video coding standard (H.266/VVC) by a large margin in terms of the MS-SSIM metric.
Pages: 13