A Novel Transformer Model With Multiple Instance Learning for Diabetic Retinopathy Classification

Cited by: 3
Authors
Yang, Yaoming [1]
Cai, Zhili [1]
Qiu, Shuxia [1, 2]
Xu, Peng [1, 2]
Affiliations
[1] China Jiliang Univ, Coll Sci, Hangzhou 310018, Peoples R China
[2] Key Lab Intelligent Mfg Qual Big Data Tracing & An, Hangzhou 310018, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Vision Transformer; multiple instance learning; diabetic retinopathy; high-resolution fundus retinal images; medical image classification; DISEASE; IMAGES;
DOI
10.1109/ACCESS.2024.3351473
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Diabetic retinopathy (DR) is an irreversible fundus retinopathy. A deep learning-based automated DR diagnosis system can save diagnostic time. Although the Transformer has shown superior performance compared with the convolutional neural network (CNN), it typically requires pre-training on large amounts of data. Transformer-based DR diagnosis methods can alleviate limited performance on small-scale retinal datasets by loading pre-trained weights, but the input image size is then restricted to 224 x 224. The resolution of retinal images captured by fundus cameras is much higher than 224 x 224, and reducing the resolution for training loses valuable information. To use high-resolution retinal images efficiently, a new Transformer model with multiple instance learning (TMIL) is proposed for DR classification. A multiple instance learning approach is first applied to segment the high-resolution retinal images into 224 x 224 image patches. A Vision Transformer (ViT) then extracts features from each patch. Next, a Global Instance Computing Block (GICB) is designed to compute inter-instance features. After global information from the GICB is introduced, the features are used to output the classification results. With high-resolution retinal images, TMIL can load the pre-trained Transformer weights without model performance being affected by weight interpolation. Experimental results on the APTOS and Messidor-1 datasets demonstrate that TMIL achieves better classification performance and reduces inference time by 62% compared with feeding high-resolution images directly into ViT. TMIL also shows the highest classification accuracy compared with current state-of-the-art results.
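The first step described in the abstract, turning one high-resolution image into a bag of 224 x 224 instances, can be illustrated with a minimal sketch. This is not the authors' code: the helper names `patch_boxes` and `crop`, the non-overlapping grid, the choice to drop incomplete border patches, and the 896 x 896 example size are all assumptions made for illustration. The ViT feature extractor and the GICB are not modelled here.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def patch_boxes(height: int, width: int, patch: int = 224) -> List[Box]:
    """Return non-overlapping patch boxes that fit fully inside the image.

    Assumption: border pixels that do not fill a whole patch are dropped.
    Padding or resizing to a multiple of `patch` would be equally plausible.
    """
    if patch <= 0:
        raise ValueError("patch must be positive")
    return [
        (top, left, top + patch, left + patch)
        for top in range(0, height - patch + 1, patch)
        for left in range(0, width - patch + 1, patch)
    ]


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut one instance out of an image stored as a list of pixel rows."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


# Hypothetical 896 x 896 image: a 4 x 4 grid, so the bag holds 16 instances.
boxes = patch_boxes(896, 896)
print(len(boxes))  # 16
print(boxes[5])    # (224, 224, 448, 448)
```

Because every instance already has the 224 x 224 input size mentioned in the abstract, a pre-trained ViT could in principle be applied to each crop without interpolating its positional embeddings, which is the property the abstract highlights.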
Pages: 6768-6776
Page count: 9