VMG: Rethinking U-Net Architecture for Video Super-Resolution

Cited by: 0

Authors:

Tang, Jun [1]

Niu, Lele [1]

Liu, Linlin [1]

Dai, Hang [2]

Ding, Yong [1]

Affiliations:

[1] Zhejiang Univ, Coll Integrated Circuits, Hangzhou 310000, Peoples R China

[2] Univ Glasgow, Sch Comp Sci, Glasgow G12 8QQ, Scotland

Source:

IEEE TRANSACTIONS ON BROADCASTING | 2025 / Vol. 71 / Issue 01

Keywords:

Computer architecture; Data mining; Superresolution; Mixers; Feature extraction; Computational modeling; Transformers; Logic gates; Correlation; Decoding; Video super-resolution; U-Net architecture; spatial-temporal; complexity; MLP;

DOI:

10.1109/TBC.2024.3486967

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

The U-Net architecture has exhibited significant efficacy across various vision tasks, yet its adaptation for Video Super-Resolution (VSR) remains underexplored. While the Video Restoration Transformer (VRT) introduced U-Net into the VSR domain, it poses challenges due to intricate design and substantial computational overhead. In this paper, we present VMG, a streamlined framework tailored for VSR. Through empirical analysis, we identify the crucial stages of the U-Net architecture contributing to performance enhancement in VSR tasks. Our optimized architecture substantially reduces model parameters and complexity while improving performance. Additionally, we introduce two key modules, namely the Gated MLP-like Mixer (GMM) and the Flow-Guided cross-attention Mixer (FGM), designed to enhance spatial and temporal feature aggregation. GMM dynamically encodes spatial correlations with linear complexity in space and time, and FGM leverages optical flow to capture motion variation and implement sparse attention to efficiently aggregate temporally related information. Extensive experiments demonstrate that VMG achieves nearly 70% reduction in GPU memory usage, 30% fewer parameters, and 10% lower computational complexity (FLOPs) compared to VRT, while yielding highly competitive or superior results across four benchmark datasets. Qualitative assessments reveal VMG's ability to preserve remarkable details and sharp structures in the reconstructed videos.
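The abstract describes the GMM only at a high level, so the following is a generic sketch of a gated MLP unit, not the authors' module. It assumes a sigmoid gate and three hypothetical weight matrices (`w_value`, `w_gate`, `w_out`), none of which come from the paper. It illustrates one way a gated mixer can scale linearly with the number of tokens: each token is processed once with a fixed amount of work.

```python
import math


def sigmoid(z):
    """Logistic function used here as the gate activation (an assumption)."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def gated_mlp_token(x, w_value, w_gate, w_out):
    """Apply a gated MLP to one token's feature vector.

    value = W_value x, gate = sigmoid(W_gate x), output = W_out (value * gate).
    All weight names are hypothetical and chosen for illustration only.
    """
    value = matvec(w_value, x)
    gate = [sigmoid(g) for g in matvec(w_gate, x)]
    hidden = [v * g for v, g in zip(value, gate)]
    return matvec(w_out, hidden)


def gated_mlp(tokens, w_value, w_gate, w_out):
    """Process every token once; cost grows linearly with len(tokens)."""
    return [gated_mlp_token(x, w_value, w_gate, w_out) for x in tokens]


# Toy weights: identity projections and an all-zero gate projection,
# so every gate value is sigmoid(0) = 0.5 and each feature is halved.
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
result = gated_mlp([[1.0, 2.0], [4.0, -2.0]], identity, zeros, identity)
```

With the toy weights above, each output vector is the input scaled by 0.5. A mixer meant to capture spatial correlations would also need some exchange of information between neighbouring positions, which this per-token sketch deliberately leaves out.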


Pages: 334-349

Number of pages: 16

50 entries in total

[31] DBU-Net: Dual-Branch U-Net for Retinal Fundus Image Super-Resolution Under Complex Degradation Conditions
Chen, Xianghui
Qiu, Shi
Wang, Yue
Zhang, Yu
Liu, Zhaoyan
Wang, Xinhong
Yao, Weiyuan
Cheng, Hongjia
Wang, Feihong
Shu, Zhan
Li, Xuesong
IEEE ACCESS, 2025, 13 : 6237 - 6249
[32] U-Net Based Deep Regression Network Architecture for Single Image Super Resolution of License Plate Image
Karthick, S.
Muthukumaran, N.
SMART TRENDS IN COMPUTING AND COMMUNICATIONS, VOL 2, SMARTCOM 2024, 2024, 946 : 311 - 321
[33] Deep Neural Networks for Image Super-Resolution in Optical Microscopy by Using Modified Hybrid Task Cascade U-Net
Gong, Dawei
Ma, Tengfei
Evans, Julian
He, Sailing
PROGRESS IN ELECTROMAGNETICS RESEARCH-PIER, 2021, 171 : 185 - 199
[34] Video super-resolution with inverse recurrent net and hybrid local fusion
Li, Dingyi
Wang, Zengfu
Yang, Jian
NEUROCOMPUTING, 2022, 489 : 40 - 51
[35] Omniscient Video Super-Resolution
Yi, Peng
Wang, Zhongyuan
Jiang, Kui
Jiang, Junjun
Lu, Tao
Tian, Xin
Ma, Jiayi
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 4409 - 4418
[36] Sharp dense U-Net: an enhanced dense U-Net architecture for nucleus segmentation
Senapati, Pradip
Basu, Anusua
Deb, Mainak
Dhal, Krishna Gopal
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (06) : 2079 - 2094
[37] Semi-Dense U-Net: A Novel U-Net Architecture for Face Detection
Pai, Ganesh
Kumari, M. Sharmila
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (06) : 406 - 414
[38] AS3ITransUNet: Spatial-Spectral Interactive Transformer U-Net With Alternating Sampling for Hyperspectral Image Super-Resolution
Xu, Qin
Liu, Shiji
Wang, Jiahui
Jiang, Bo
Tang, Jin
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
[39] Single image multi-scale enhancement for rock Micro-CT super-resolution using residual U-Net
Shan, Liqun
Liu, Chengqian
Liu, Yanchang
Tu, Yazhou
Chilukoti, Sai Venkatesh
Hei, Xiali
APPLIED COMPUTING AND GEOSCIENCES, 2024, 22
[40] Super-Resolution Residual U-Net Model for the Reconstruction of Limited-Data Tunable Diode Laser Absorption Tomography
Chen, Shaogang
Hao, Xiaojian
Pan, Baowu
Huang, Xiaodong
ACS OMEGA, 2022, 7 (22) : 18722 - 18731
