STFE-VC: Spatio-temporal feature enhancement for learned video compression

Cited by: 0

Authors:

Wang, Yiming ^{[1]}

Huang, Qian ^{[1,3]}

Tang, Bin ^{[1]}

Li, Xin ^{[1]}

Li, Xing ^{[2]}

Affiliations:

[1] Hohai Univ, Coll Comp Sci & Software Engn, Nanjing, Jiangsu, Peoples R China

[2] Nanjing Forestry Univ, Coll Informat Sci & Technol, Nanjing, Peoples R China

[3] Changzhou Univ, Jiangsu Engn Res Ctr Digital Twinning Technol, Key Equipment Petrochem Proc, Changzhou, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2025 / Volume 272

Funding:

National Natural Science Foundation of China;

Keywords:

Spatio-temporal feature enhancement; Learned video compression; Spatio-temporal motion enhancement; In-loop filtering enhancement;

DOI:

10.1016/j.eswa.2025.126682

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

With the increasing growth of video data, limited bandwidth and hardware resource constraints demand more efficient video compression. Current learned video compression methods have shown promising performance. However, these methods mainly rely on optical flow networks to perform temporal prediction, which may suffer from inaccurate motion estimation and introduce extra artifacts into reconstructed frames. In this paper, we propose a spatio-temporal feature enhancement method for learned video compression to better model inter-frame motion patterns and reduce compression artifacts. Specifically, we introduce a spatio-temporal motion enhancement module that further extracts the feature representation of the original motion vector to enhance the corresponding spatial and temporal components. Then, we introduce an in-loop filtering enhancement module that employs cascaded residual blocks to progressively enhance feature textures and provide higher-quality temporal-domain reference signals for subsequent reconstruction. More importantly, our proposed method can be integrated into the widely used residual coding and contextual coding schemes. Comprehensive experiments demonstrate that our integrated methods are superior to previous learned methods on the JCTVC, UVG and MCL-JCV benchmark datasets. In addition, our integrated methods also outperform the latest generalized video coding standard (H.266/VVC) by a larger margin in terms of the MS-SSIM metric.

Pages: 13
