Cervical OCT image classification using contrastive masked autoencoders with Swin Transformer

Times Cited: 0
|
Authors
Wang, Qingbin [1 ]
Xiong, Yuxuan [1 ]
Zhu, Hanfeng [2 ,3 ]
Mu, Xuefeng [4 ]
Zhang, Yan [4 ]
Ma, Yutao [2 ,3 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
[2] Cent China Normal Univ, Sch Comp Sci, Wuhan 430079, Peoples R China
[3] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China
[4] Wuhan Univ, Renmin Hosp, Dept Obstet & Gynecol, Wuhan 430060, Peoples R China
Keywords
Cervical cancer; Optical coherence tomography; Image classification; Self-supervised learning; Swin Transformer; Interpretability; OPTICAL COHERENCE TOMOGRAPHY;
DOI
10.1016/j.compmedimag.2024.102469
Chinese Library Classification (CLC)
R318 [Biomedical Engineering]
Discipline code
0831
Abstract
Background and Objective: Cervical cancer poses a major health threat to women globally. Optical coherence tomography (OCT) imaging has recently shown promise for non-invasive cervical lesion diagnosis. However, obtaining high-quality labeled cervical OCT images is challenging and time-consuming because they must correspond precisely with pathological results. The scarcity of such high-quality labeled data hinders the application of supervised deep-learning models in practical clinical settings. This study addresses this challenge by proposing CMSwin, a novel self-supervised learning (SSL) framework that combines masked image modeling (MIM) with contrastive learning on the Swin Transformer architecture to exploit abundant unlabeled cervical OCT images.
Methods: In this contrastive-MIM framework, mixed image encoding is combined with a latent contextual regressor to resolve the inconsistency between pre-training and fine-tuning and to separate the encoder's feature-extraction task from the decoder's reconstruction task, allowing the encoder to extract better image representations. In addition, contrastive losses at the patch and image levels are carefully designed to leverage massive unlabeled data.
Results: We validated the superiority of CMSwin over state-of-the-art SSL approaches with five-fold cross-validation on an OCT image dataset of 1,452 patients from a multi-center clinical study in China, plus two external validation sets from top-ranked Chinese hospitals: the Huaxi dataset from the West China Hospital of Sichuan University and the Xiangya dataset from the Xiangya Second Hospital of Central South University. A human-machine comparison experiment on the Huaxi and Xiangya datasets for volume-level binary classification also indicates that CMSwin can match or exceed the average performance of four skilled medical experts, especially in identifying high-risk cervical lesions.
Conclusion: Our work has great potential to assist gynecologists in intelligently interpreting cervical OCT images in clinical settings. Additionally, the integrated Grad-CAM module of CMSwin enables cervical lesion visualization and interpretation, providing good interpretability for gynecologists to diagnose cervical diseases efficiently.
Pages: 14
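
The abstract describes CMSwin as a Swin Transformer–based framework that couples a masked-image-modeling objective with patch-level and image-level contrastive losses over unlabeled OCT images. The sketch below is a minimal illustration of that general recipe, not the authors' implementation: the `ContrastiveMAE` module, the `info_nce` helper, the mask ratio, the equal loss weighting, and the plain Transformer encoder standing in for the Swin backbone are all assumptions made here for clarity.

```python
# Minimal sketch (assumed, not the paper's code): masked-patch reconstruction combined
# with patch- and image-level InfoNCE contrastive losses, as outlined in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

def info_nce(q, k, temperature=0.07):
    """InfoNCE loss: the i-th query should match the i-th key within the batch."""
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)
    logits = q @ k.t() / temperature                  # (N, N) similarity matrix
    targets = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, targets)

class ContrastiveMAE(nn.Module):
    """Toy contrastive masked autoencoder over 16x16 patches of a 224x224 image."""
    def __init__(self, patch=16, dim=128):
        super().__init__()
        self.patch, self.dim = patch, dim
        self.embed = nn.Linear(patch * patch * 3, dim)         # patch embedding
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # stand-in for Swin
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.decoder = nn.Linear(dim, patch * patch * 3)        # pixel reconstruction head
        self.proj = nn.Linear(dim, dim)                         # projection head for contrast

    def patchify(self, x):
        p = self.patch
        b, c, h, w = x.shape
        x = x.unfold(2, p, p).unfold(3, p, p)                   # (B, C, H/p, W/p, p, p)
        return x.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * p * p)

    def forward(self, x, mask_ratio=0.6):
        patches = self.patchify(x)                              # (B, N, p*p*3)
        tokens = self.embed(patches)
        b, n, _ = tokens.shape
        mask = torch.rand(b, n, device=x.device) < mask_ratio   # True = masked
        mixed = torch.where(mask.unsqueeze(-1), self.mask_token.expand(b, n, -1), tokens)
        enc = self.encoder(mixed)

        # 1) reconstruction loss on masked patches only
        rec = self.decoder(enc)
        loss_rec = F.mse_loss(rec[mask], patches[mask])

        # 2) patch-level contrast: masked-view tokens vs. full-view tokens (stop-gradient)
        with torch.no_grad():
            target_tok = self.encoder(tokens)
        z_pred = self.proj(enc).reshape(-1, self.dim)[mask.reshape(-1)]
        z_tgt = self.proj(target_tok).reshape(-1, self.dim)[mask.reshape(-1)]
        loss_patch = info_nce(z_pred, z_tgt)

        # 3) image-level contrast: mean-pooled masked view vs. mean-pooled full view
        loss_img = info_nce(self.proj(enc.mean(1)), self.proj(target_tok.mean(1)))

        return loss_rec + loss_patch + loss_img

# Usage: one pre-training step on a random batch of unlabeled 224x224 images.
model = ContrastiveMAE()
loss = model(torch.randn(4, 3, 224, 224))
loss.backward()
```

In the paper's actual framework the objectives are computed within a Swin-based encoder–decoder with a latent contextual regressor and mixed image encoding; equal loss weighting and a generic Transformer encoder are used here purely to keep the example self-contained and runnable.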