Enhancing glaucoma detection through multi-modal integration of retinal images and clinical biomarkers

Authors
Sivakumar, Rishikesh [1]
Penkova, Anita [2]
Affiliations
[1] Univ Southern Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
[2] Univ Southern Calif, Dept Aerosp & Mech Engn, Los Angeles, CA 90089 USA
Keywords
Glaucoma detection; Vision Transformers; Convolutional Neural Networks; Machine Learning; Clinical Biomarkers; AUTOMATED EXTRACTION; FRAMEWORK
DOI
10.1016/j.engappai.2025.110010
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Glaucoma, a major cause of irreversible blindness worldwide, often progresses without early symptoms, making prompt and precise detection vital. This paper introduces a multi-modal glaucoma detection system that combines advanced deep learning architectures to analyze retinal images and clinical biomarkers. We developed three hybrid models: the first blends Vision Transformers (ViT) with Convolutional Neural Networks (CNN), specifically Residual Networks (ResNet), for comprehensive feature extraction; the second uses ObjectWindow-Location Vision Transformer (OWL-ViT) with Residual Networks for enhanced global contextual insights; and the third employs a Hierarchical Vision Transformer using Shifted Windows (Swin Transformer) with Residual Networks, which demonstrated the best performance. These models combine complementary strengths: broad contextual capture by ViT, localized detail extraction by CNNs, and refined granularity by the Swin Transformer. Together these improve both feature representation and computational efficiency, making the models well suited for clinical use. The best-performing system, built on the Swin Transformer hybrid model, achieved an F1-score of 0.993 for glaucoma and 0.995 for non-glaucoma, with an overall accuracy of 99.4% on a dataset of 2874 new cases (2857 classified correctly), confirming its efficacy in early-stage glaucoma detection and a significant advance over existing methods.
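The accuracy figure in the abstract can be checked directly from the two counts it quotes. The sketch below is an illustration only, not the authors' evaluation code: the helper names `accuracy` and `f1_score` are assumptions introduced here, and the F1 helper uses the standard definition. The abstract does not give per-class confusion counts, so the reported F1-scores of 0.993 and 0.995 cannot be reproduced from it and are not recomputed.

```python
# Counts quoted in the abstract; everything else in this block is illustrative.
TOTAL_CASES = 2874
CORRECT_CASES = 2857


def accuracy(correct: int, total: int) -> float:
    """Fraction of cases classified correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard per-class F1: 2*TP / (2*TP + FP + FN).

    Hypothetical helper; the abstract does not report TP/FP/FN,
    so it is defined here only to show how the metric is formed.
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


misclassified = TOTAL_CASES - CORRECT_CASES
overall_accuracy = accuracy(CORRECT_CASES, TOTAL_CASES)

print(f"misclassified cases: {misclassified}")      # 17
print(f"overall accuracy: {overall_accuracy:.1%}")  # 99.4%
```

The 17 misclassified cases out of 2874 correspond to an error rate of roughly 0.6%, which agrees with the 99.4% accuracy stated in the abstract.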
Pages: 13