Deformable transformer for endoscopic video super-resolution

Cited by: 3
Authors
Song, Xiaowei [1]
Tang, Hui [1, 2]
Yang, Chunfeng [1, 2]
Zhou, Guangquan [3]
Wang, Yangang [4]
Huang, Xinjun [4]
Hua, Jie [5, 6]
Coatrieux, Gouenou [7]
He, Xiaopu [8]
Chen, Yang [1, 2, 9]
Affiliations
[1] Southeast Univ, Sch Comp Sci & Engn, Nanjing, Peoples R China
[2] Southeast Univ, Sch Comp Sci & Engn, Jiangsu Prov Joint Int Res Lab Med Informat Proc, Nanjing, Peoples R China
[3] Southeast Univ, Sch Biol Sci & Med Engn, Nanjing, Peoples R China
[4] Nanjing Tuge Healthcare Co Ltd, Nanjing, Peoples R China
[5] Liyang Peoples Hosp, Liyang Branch Hosp, Dept Gastroenterol, Jiangsu Prov Hosp, Liyang, Peoples R China
[6] Nanjing Med Univ, Dept Gastroenterol, Affiliated Hosp 1, Nanjing, Peoples R China
[7] Mines Telecom, Telecom Bretagne, INSERM LaTIM U1101, Brest, France
[8] Nanjing Med Univ, Dept Geriatr Gastroenterol, Affiliated Hosp 1, Nanjing, Peoples R China
[9] Minist Educ, Key Lab Comp Network & Informat Integrat, Beijing, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Video super-resolution; Endoscopic; Transformer; Deformable convolution; Self-attention; CONVOLUTION;
DOI
10.1016/j.bspc.2022.103827
Chinese Library Classification number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Video super-resolution aims to reconstruct a high-resolution video from a low-resolution video at a given magnification scale. As a fundamental computer vision task, it is widely used in various fields. In endoscopy in particular, high-resolution videos help doctors observe more details of lesions and improve the accuracy and speed of diagnosis. A novel deformable Transformer network is proposed to solve the super-resolution problem for endoscopic video data. Because the Transformer's self-attention module cannot effectively capture local information, convolution operations are added to it to strengthen its local feature capture. To compensate for the Transformer's weakness in aligning consecutive frames, a new bidirectional deformable convolutional network is designed as the Transformer's feed-forward module; it uses deformable convolution to achieve frame-to-frame feature alignment and feature propagation. A high-resolution dataset for endoscopic video super-resolution is built from endoscopic surgery videos. In extensive experiments on this dataset against existing video super-resolution methods, the proposed deformable Transformer network achieves the best performance reported so far in endoscopic imaging with a competitive number of parameters. It improves PSNR by 0.97 dB over the state-of-the-art method in the RGB channel while reducing the number of network parameters by 0.39 million.
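The abstract reports its gain in PSNR (peak signal-to-noise ratio), measured in dB. As a reading aid, the following is a minimal sketch of the standard PSNR definition, assuming 8-bit pixel values with a peak of 255. It is not the authors' evaluation code; the function names `psnr` and `mse_ratio_for_gain` are hypothetical, and the record does not say how the paper averages PSNR across frames or channels.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Standard PSNR in dB between two equally sized sequences of pixel values.

    Assumption: pixels are flattened into plain sequences and the peak
    value is 255 (8-bit images); the paper's exact protocol is not stated.
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    if not reference:
        raise ValueError("inputs must not be empty")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return float("inf")  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB (same peak value)."""
    return 10.0 ** (gain_db / 10.0)
```

Under this definition, and only when both methods are scored on the same pixels, a 0.97 dB gain corresponds to the mean squared error shrinking by a factor of about 1.25, i.e. roughly 20% lower.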
Pages: 10