Activating More Pixels in Image Super-Resolution Transformer

Cited by: 171
Authors
Chen, Xiangyu [1 ,2 ,3 ]
Wang, Xintao [4 ]
Zhou, Jiantao [1 ]
Qiao, Yu [2 ,3 ]
Dong, Chao [2 ,3 ]
Affiliations
[1] Univ Macau, State Key Lab Internet Things Smart City, Zhuhai, Peoples R China
[2] Chinese Acad Sci, Shenzhen Key Lab Comp Vis & Pattern Recognit, Shenzhen Inst Adv Technol, Beijing, Peoples R China
[3] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China
[4] Tencent PCG, ARC Lab, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
CONVOLUTIONAL NETWORK;
DOI
10.1109/CVPR52729.2023.02142
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformer-based methods have shown impressive performance in low-level vision tasks, such as image super-resolution. However, we find that these networks can only utilize a limited spatial range of input information through attribution analysis. This implies that the potential of Transformer is still not fully exploited in existing networks. In order to activate more input pixels for better reconstruction, we propose a novel Hybrid Attention Transformer (HAT). It combines both channel attention and window-based self-attention schemes, thus making use of their complementary advantages of being able to utilize global statistics and strong local fitting capability. Moreover, to better aggregate the cross-window information, we introduce an overlapping cross-attention module to enhance the interaction between neighboring window features. In the training stage, we additionally adopt a same-task pre-training strategy to exploit the potential of the model for further improvement. Extensive experiments show the effectiveness of the proposed modules, and we further scale up the model to demonstrate that the performance of this task can be greatly improved. Our overall method significantly outperforms the state-of-the-art methods by more than 1 dB.
Pages: 22367 - 22377
Number of pages: 11
Related papers
50 in total
  • [1] Transformer for Single Image Super-Resolution
    Lu, Zhisheng
    Li, Juncheng
    Liu, Hong
    Huang, Chaoyan
    Zhang, Linlin
    Zeng, Tieyong
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 456 - 465
  • [2] Activating More Information in Arbitrary-Scale Image Super-Resolution
    Zhao, Yaoqian
    Teng, Qizhi
    Chen, Honggang
    Zhang, Shujiang
    He, Xiaohai
    Li, Yi
    Sheriff, Ray E.
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 7946 - 7961
  • [3] Penrose Pixels for Super-Resolution
    Ben-Ezra, Moshe
    Lin, Zhouchen
    Wilburn, Bennett
    Zhang, Wei
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2011, 33 (07) : 1370 - 1383
  • [4] Spatial relaxation transformer for image super-resolution
    Li, Yinghua
    Zhang, Ying
    Zeng, Hao
    He, Jinglu
    Guo, Jie
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2024, 36 (07)
  • [5] Dual Aggregation Transformer for Image Super-Resolution
    Chen, Zheng
    Zhang, Yulun
    Gu, Jinjin
    Kong, Linghe
    Yang, Xiaokang
    Yu, Fisher
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 12278 - 12287
  • [6] Image Super-Resolution Using T-Tetromino Pixels
    Grosche, Simon
    Regensky, Andy
    Seiler, Juergen
    Kaup, Andre
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 9989 - 9998
  • [7] Steformer: Efficient Stereo Image Super-Resolution With Transformer
    Lin, Jianxin
    Yin, Lianying
    Wang, Yijun
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8396 - 8407
  • [8] Efficient mixed transformer for single image super-resolution
    Zheng, Ling
    Zhu, Jinchen
    Shi, Jinpeng
    Weng, Shizhuang
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [9] ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution
    Zhang, Mingjin
    Zhang, Chi
    Zhang, Qiming
    Guo, Jie
    Gao, Xinbo
    Zhang, Jing
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 23016 - 23027
  • [10] Image Super-Resolution Using Dilated Window Transformer
    Park, Soobin
    Choi, Yong Suk
    [J]. IEEE ACCESS, 2023, 11 : 60028 - 60039