Cross-scale hierarchical spatio-temporal transformer for video enhancement

Citations: 0
Authors
Jiang, Qin [1, 2, 3]
Wang, Qinglin [1, 2, 3]
Chi, Lihua [4]
Liu, Jie [1, 2, 3]
Affiliations
[1] Natl Univ Def Technol, Changsha, Peoples R China
[2] Lab Digitizing Software Frontier Equipment, Changsha, Peoples R China
[3] Sci & Technol Parallel & Distributed Proc Lab, Changsha, Peoples R China
[4] Hunan GuoKe Computil Technol Co Ltd, Changsha, Peoples R China
Keywords
Video super-resolution; Denoising; Deblurring; Transformer; Temporal
DOI
10.1016/j.knosys.2024.112773
Chinese Library Classification number
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The diversity and complexity of degradations in low-quality videos pose non-trivial challenges for video enhancement, which aims to reconstruct their high-quality counterparts. Prevailing sliding-window-based methods perform poorly because of their limited window size. Recurrent networks exploit long-term modeling to aggregate more information, resulting in significant performance improvements. However, most of them are trained on simply degraded data and can tackle only a specific degradation. To overcome this limitation, we propose a progressive alignment network, the Cross-scale Hierarchical Spatio-Temporal Transformer (CHSTT), which leverages cross-scale tokenization to generate multi-scale visual tokens across the entire sequence and thereby capture extensive long-range temporal dependencies. To enhance spatial and temporal interactions, we introduce a hierarchical Transformer that computes mutual multi-head attention across both spatial and temporal dimensions. Quantitative and qualitative assessments show that CHSTT outperforms several state-of-the-art methods on three video enhancement tasks: video super-resolution, video denoising, and video deblurring.
Pages: 13