Enhancing glaucoma detection through multi-modal integration of retinal images and clinical biomarkers

Cited by: 0
Authors
Sivakumar, Rishikesh [1]
Penkova, Anita [2]
Affiliations
[1] Univ Southern Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
[2] Univ Southern Calif, Dept Aerosp & Mech Engn, Los Angeles, CA 90089 USA
Keywords
Glaucoma detection; Vision Transformers; Convolutional Neural Networks; Machine Learning; Clinical Biomarkers; AUTOMATED EXTRACTION; FRAMEWORK
DOI
10.1016/j.engappai.2025.110010
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Glaucoma, a major cause of irreversible blindness globally, often progresses without early symptoms, making prompt and precise detection vital. This paper introduces a multi-modal glaucoma detection system that combines advanced deep learning architectures to analyze retinal images and clinical biomarkers. We developed three hybrid models: the first blends Vision Transformers (ViT) with Convolutional Neural Networks (CNN), specifically Residual Networks (ResNet), for comprehensive feature extraction; the second uses ObjectWindow-Location Vision Transformer (OWL-ViT) with Residual Networks for enhanced global contextual insights; and the third employs a Hierarchical Vision Transformer using Shifted Windows (Swin Transformer) with Residual Networks, which demonstrated the best performance. The complementary strengths of these models (broad contextual capture by ViT, localized detail extraction by CNNs, and refined granularity by Swin Transformer) improve both feature representation and computational efficiency, making them well-suited for clinical use. The best-optimized system, featuring the Swin Transformer hybrid model, achieved an F1-score of 0.993 for glaucoma and 0.995 for non-glaucoma, with an overall accuracy of 99.4% on a dataset of 2874 new cases, correctly classifying 2857 of them. These results confirm its efficacy in early-stage glaucoma detection and represent a significant advance over existing methods.
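The accuracy reported in the abstract can be checked directly from the two case counts it gives. The short sketch below only reproduces that arithmetic; the variable names are illustrative and are not taken from the paper. The per-class F1-scores cannot be recomputed the same way, because the abstract does not give the per-class confusion counts.

```python
# Counts stated in the abstract; variable names are illustrative assumptions.
total_cases = 2874   # new cases evaluated
correct = 2857       # cases classified correctly

misclassified = total_cases - correct
accuracy = correct / total_cases

print(f"misclassified: {misclassified}")   # misclassified: 17
print(f"accuracy: {accuracy:.1%}")         # accuracy: 99.4%
```

The result agrees with the stated 99.4%, which corresponds to 17 misclassified cases out of 2874.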
Pages: 13