OCTFormer: An Efficient Hierarchical Transformer Network Specialized for Retinal Optical Coherence Tomography Image Recognition

Authors
Wang, Haoran [1]
Guo, Xinyu [1]
Song, Kaiwen [1]
Sun, Mingyang [1]
Shao, Yanbin [1]
Xue, Songfeng [1]
Zhang, Hongwei [1]
Zhang, Tianyu [1]
Affiliations
[1] Jilin Univ, Coll Instrumentat & Elect Engn, Key Lab Geophys Explorat Equipment, Minist Educ, Changchun 130000, Peoples R China
Keywords
Computer-aided diagnosis; deep learning; image classification; optical coherence tomography (OCT); vision transformer (ViT); macular edema; mechanisms
DOI
10.1109/TIM.2023.3329106
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
Diabetic retinopathy (DR) is a common complication of diabetes and one of the main causes of blindness, yet the resulting vision loss can be prevented by early detection and treatment. Clinically, ophthalmologists diagnose DR on the basis of optical coherence tomography (OCT) image analysis. Existing medical resources can no longer meet the needs of a growing patient population, and deep learning has therefore become a mainstream solution for medical image analysis. The vision transformer (ViT), a recent neural network architecture, has demonstrated strong performance in image analysis. However, because it lacks inductive bias and does not permit the input size to change, ViT is prone to overfitting on small datasets and is limited in how well it can model biological tissue characteristics. We therefore propose an OCT multihead self-attention (OMHSA) block, based on a hybrid CNN-Transformer strategy, that is designed specifically for computing OCT image information. Compared with traditional MHSA, OMHSA integrates local information extraction differences into the self-attention calculation and adds local information to the transformer model without relying on a multibranch network. We build a neural network architecture, OCTFormer, by repeatedly stacking convolutional layers and OMHSA blocks in each stage. Like a CNN, OCTFormer allows the input size to change at each stage, which yields a hierarchical structure. Evaluated on the collected retinal OCT dataset, the model reached a diagnostic accuracy of 98.60%, surpassing the state-of-the-art (SOTA) model. We also demonstrate deployment of OCTFormer to mobile terminals through knowledge distillation, which provides a reference for deploying transformer models in real clinical environments.
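The abstract mentions knowledge distillation for mobile deployment but does not state which objective was used. As a rough illustration only, the sketch below shows the widely used soft-target formulation: a temperature-softened KL term between teacher and student outputs, combined with ordinary cross-entropy on the true label. The function names, the four-class logits, and the `temperature` and `alpha` values are all assumptions made for this example, not details from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by `temperature`."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Generic soft-target distillation loss (assumed form, not the paper's).

    alpha weights the soft (teacher-matching) term; (1 - alpha) weights the
    hard cross-entropy term against the ground-truth label.
    """
    eps = 1e-12  # guards against log(0) for extreme logits
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(
        pt * (math.log(max(pt, eps)) - math.log(max(ps, eps)))
        for pt, ps in zip(p_teacher, p_student)
    )
    # Cross-entropy of the student's unsoftened prediction against the label.
    ce = -math.log(max(softmax(student_logits)[label], eps))
    # The T^2 factor keeps the soft term's gradient scale comparable across temperatures.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


# Hypothetical logits for a four-class problem.
teacher = [4.0, 1.0, 0.5, 0.2]
student = [2.5, 1.2, 0.3, 0.1]
loss = distillation_loss(student, teacher, label=0)
```

In a sketch like this, a larger teacher is trained first, and the compact student then minimizes this loss so that it inherits the teacher's relative class similarities as well as the hard labels.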
Pages: 1 / 17
Number of pages: 17