VMG: Rethinking U-Net Architecture for Video Super-Resolution

Citations: 0
Authors
Tang, Jun [1]
Niu, Lele [1]
Liu, Linlin [1]
Dai, Hang [2]
Ding, Yong [1]
Affiliations
[1] Zhejiang Univ, Coll Integrated Circuits, Hangzhou 310000, Peoples R China
[2] Univ Glasgow, Sch Comp Sci, Glasgow G12 8QQ, Scotland
Keywords
Computer architecture; Data mining; Superresolution; Mixers; Feature extraction; Computational modeling; Transformers; Logic gates; Correlation; Decoding; Video super-resolution; U-Net architecture; spatial-temporal; complexity; MLP;
DOI
10.1109/TBC.2024.3486967
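Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract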
The U-Net architecture has exhibited significant efficacy across various vision tasks, yet its adaptation for Video Super-Resolution (VSR) remains underexplored. While the Video Restoration Transformer (VRT) introduced U-Net into the VSR domain, it poses challenges due to intricate design and substantial computational overhead. In this paper, we present VMG, a streamlined framework tailored for VSR. Through empirical analysis, we identify the crucial stages of the U-Net architecture contributing to performance enhancement in VSR tasks. Our optimized architecture substantially reduces model parameters and complexity while improving performance. Additionally, we introduce two key modules, namely the Gated MLP-like Mixer (GMM) and the Flow-Guided cross-attention Mixer (FGM), designed to enhance spatial and temporal feature aggregation. GMM dynamically encodes spatial correlations with linear complexity in space and time, and FGM leverages optical flow to capture motion variation and implements sparse attention to efficiently aggregate temporally related information. Extensive experiments demonstrate that VMG achieves a nearly 70% reduction in GPU memory usage, 30% fewer parameters, and 10% lower computational complexity (FLOPs) compared to VRT, while yielding highly competitive or superior results across four benchmark datasets. Qualitative assessments reveal VMG's ability to preserve remarkable details and sharp structures in the reconstructed videos.
Pages: 334-349
Page count: 16
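The abstract describes GMM only at a high level: a gated, MLP-like block that encodes spatial correlations at a cost linear in the number of pixels. The following is a minimal sketch of that general idea, not the authors' GMM. The function names (`box_filter3`, `gated_mixer`), the 3x3 mean filter used to give the gate spatial context, the weight shapes, and the residual connection are all assumptions made for illustration.

```python
import numpy as np


def box_filter3(x):
    """3x3 mean filter applied per channel with zero padding; x is (H, W, C)."""
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w, :]
    return out / 9.0


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gated_mixer(x, w_value, w_gate, w_out):
    """Hypothetical gated mixing block (illustrative only, not the paper's GMM).

    Every pixel does a fixed amount of work, so the cost grows linearly
    with H * W rather than quadratically as in dense self-attention.
    """
    value = x @ w_value                      # per-pixel channel projection
    gate = sigmoid(box_filter3(x) @ w_gate)  # gate depends on a 3x3 neighbourhood
    return x + (value * gate) @ w_out        # gated features plus residual


rng = np.random.default_rng(0)
H, W, C, D = 8, 8, 4, 16                     # assumed toy sizes
x = rng.standard_normal((H, W, C))
w_value = rng.standard_normal((C, D)) * 0.1
w_gate = rng.standard_normal((C, D)) * 0.1
w_out = rng.standard_normal((D, C)) * 0.1

y = gated_mixer(x, w_value, w_gate, w_out)
print(y.shape)
```

Because the gate is computed from the input itself, the mixing weights vary per pixel, which is one plausible reading of "dynamically encodes"; the paper should be consulted for the actual module design.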