Cervical OCT image classification using contrastive masked autoencoders with Swin Transformer

Cited: 0
Authors
Wang, Qingbin [1 ]
Xiong, Yuxuan [1 ]
Zhu, Hanfeng [2 ,3 ]
Mu, Xuefeng [4 ]
Zhang, Yan [4 ]
Ma, Yutao [2 ,3 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
[2] Cent China Normal Univ, Sch Comp Sci, Wuhan 430079, Peoples R China
[3] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China
[4] Wuhan Univ, Renmin Hosp, Dept Obstet & Gynecol, Wuhan 430060, Peoples R China
Keywords
Cervical cancer; Optical coherence tomography; Image classification; Self-supervised learning; Swin Transformer; Interpretability; OPTICAL COHERENCE TOMOGRAPHY;
DOI
10.1016/j.compmedimag.2024.102469
Chinese Library Classification
R318 [Biomedical Engineering];
Discipline Code
0831 ;
Abstract
Background and Objective: Cervical cancer poses a major health threat to women globally. Optical coherence tomography (OCT) imaging has recently shown promise for non-invasive cervical lesion diagnosis. However, obtaining high-quality labeled cervical OCT images is challenging and time-consuming as they must correspond precisely with pathological results. The scarcity of such high-quality labeled data hinders the application of supervised deep-learning models in practical clinical settings. This study addresses the above challenge by proposing CMSwin, a novel self-supervised learning (SSL) framework combining masked image modeling (MIM) with contrastive learning based on the Swin-Transformer architecture to utilize abundant unlabeled cervical OCT images.

Methods: In this contrastive-MIM framework, mixed image encoding is combined with a latent contextual regressor to solve the inconsistency problem between pre-training and fine-tuning and to separate the encoder's feature extraction task from the decoder's reconstruction task, allowing the encoder to extract better image representations. Besides, contrastive losses at the patch and image levels are elaborately designed to leverage massive unlabeled data.

Results: We validated the superiority of CMSwin over state-of-the-art SSL approaches with five-fold cross-validation on an OCT image dataset containing 1,452 patients from a multi-center clinical study in China, plus two external validation sets from top-ranked Chinese hospitals: the Huaxi dataset from the West China Hospital of Sichuan University and the Xiangya dataset from the Xiangya Second Hospital of Central South University. A human-machine comparison experiment on the Huaxi and Xiangya datasets for volume-level binary classification also indicates that CMSwin can match or exceed the average level of four skilled medical experts, especially in identifying high-risk cervical lesions.
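The abstract describes a training objective that mixes a masked-patch reconstruction (MIM) term with contrastive terms at the patch and image levels. The sketch below is purely illustrative and not the paper's implementation: the feature tensors, masking ratio, and equal loss weights are all assumptions, and a shared InfoNCE helper stands in for the paper's "elaborately designed" contrastive losses.

```python
import numpy as np

rng = np.random.default_rng(0)

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: each anchor should match its own positive
    against all other positives in the batch (cosine similarity)."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                  # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))             # diagonal = matched pairs

# Toy "features": 8 images, 16 patches each, 32-dim embeddings (hypothetical sizes).
B, P, D = 8, 16, 32
patch_feats_v1 = rng.normal(size=(B, P, D))                          # view 1
patch_feats_v2 = patch_feats_v1 + 0.05 * rng.normal(size=(B, P, D))  # view 2

# (1) MIM term: regress the features of masked patches (toy stand-in for
#     pixel reconstruction through a decoder).
mask = rng.random((B, P)) < 0.6                     # ~60% of patches masked
recon_loss = np.mean((patch_feats_v1[mask] - patch_feats_v2[mask]) ** 2)

# (2) Patch-level contrastive term: match corresponding patches across views.
patch_con = info_nce(patch_feats_v1.reshape(-1, D), patch_feats_v2.reshape(-1, D))

# (3) Image-level contrastive term: match pooled image embeddings across views.
img_con = info_nce(patch_feats_v1.mean(axis=1), patch_feats_v2.mean(axis=1))

total = recon_loss + patch_con + img_con            # equal weights for the toy
print(f"recon={recon_loss:.3f} patch_con={patch_con:.3f} img_con={img_con:.3f}")
```

In a real pipeline each term would be computed from encoder/decoder outputs and backpropagated jointly; the point here is only the shape of the combined objective.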
Conclusion: Our work has great potential to assist gynecologists in intelligently interpreting cervical OCT images in clinical settings. Additionally, the integrated GradCAM module of CMSwin enables cervical lesion visualization and interpretation, providing good interpretability for gynecologists to diagnose cervical diseases efficiently.
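The conclusion mentions an integrated GradCAM module for lesion visualization. The standard Grad-CAM combination step can be sketched as follows; the layer shapes and random inputs are hypothetical, and this is the generic algorithm, not CMSwin's specific module.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Core Grad-CAM step: channel weights are the spatially averaged
    gradients; the heatmap is the ReLU of the weighted activation sum.

    activations: (C, H, W) feature maps from the chosen layer
    gradients:   (C, H, W) d(class score)/d(activations)
    returns:     (H, W) heatmap scaled to [0, 1]
    """
    weights = gradients.mean(axis=(1, 2))             # (C,) importance per channel
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum -> (H, W)
    cam = np.maximum(cam, 0)                          # keep positive evidence only
    if cam.max() > 0:
        cam /= cam.max()                              # normalize for overlay
    return cam

# Toy layer outputs: 4 channels on a 7x7 grid (sizes chosen for illustration).
rng = np.random.default_rng(1)
acts = rng.random((4, 7, 7))
grads = rng.normal(size=(4, 7, 7))
heatmap = grad_cam(acts, grads)
print(heatmap.shape)
```

In practice the heatmap is upsampled to the input resolution and overlaid on the OCT image so a gynecologist can see which regions drove the prediction.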
Pages: 14
Related Papers
50 items
  • [21] Masked autoencoders with handcrafted feature predictions: Transformer for weakly supervised esophageal cancer classification
    Bai, Yunhao
    Li, Wenqi
    An, Jianpeng
    Xia, Lili
    Chen, Huazhen
    Zhao, Gang
    Gao, Zhongke
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2024, 244
  • [22] SpectralSWIN: a spectral-swin transformer network for hyperspectral image classification
    Ayas, Selen
    Tunc-Gormus, Esra
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2022, 43 (11) : 4025 - 4044
  • [23] Image Classification using Deep Autoencoders
    Gogoi, Munmi
    Begum, Shahin Ara
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (ICCIC), 2017, : 556 - 560
  • [24] MSMT-LCL: Multiscale Spatial-Spectral Masked Transformer With Local Contrastive Learning for Hyperspectral Image Classification
    Zhou, Yunfei
    Huang, Xiaohui
    Yang, Xiaofei
    Peng, Jiangtao
    Ban, Yifang
    Jiang, Nan
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [25] SELF PRE-TRAINING WITH MASKED AUTOENCODERS FOR MEDICAL IMAGE CLASSIFICATION AND SEGMENTATION
    Zhou, Lei
    Liu, Huidong
    Bae, Joseph
    He, Junjun
    Samaras, Dimitris
    Prasanna, Prateek
    2023 IEEE 20TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI, 2023,
  • [26] Mammographic Breast Composition Classification Using Swin Transformer Network
    Tsai, Kuen-Jang
    Yeh, Wei-Cheng
    Kao, Cheng-Yi
    Lin, Ming-Wei
    Hung, Chao-Ming
    Chi, Hung-Ying
    Yeh, Cheng-Yu
    Hwang, Shaw-Hwa
    SENSORS AND MATERIALS, 2024, 36 (05) : 1951 - 1957
  • [27] Efficient Transformer Inference for Extremely Weak Edge Devices using Masked Autoencoders
    Liu, Tao
    Li, Peng
    Gu, Yu
    Liu, Peng
    ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 1718 - 1723
  • [28] Supervised Contrastive Vision Transformer for Breast Histopathological Image Classification
    Shiri, Mohammad
    Reddy, Monalika Padma
    Sun, Jiangwen
    2024 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE, IRI 2024, 2024, : 296 - 301
  • [29] Classification of maize growth stages using the Swin Transformer model
    Fu L.
    Huang H.
    Wang H.
    Huang S.
    Chen D.
    Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering, 2022, 38 (14): : 191 - 200
  • [30] CLASSIFICATION AND DIAGNOSIS OF AUTISM SPECTRUM DISORDER USING SWIN TRANSFORMER
    Zhang, Heqian
    Wang, Zhaohui
    Zhan, Yuefu
    2023 IEEE 20TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI, 2023,