A multi-stage multi-modal learning algorithm with adaptive multimodal fusion for improving multi-label skin lesion classification

Cited: 0
Authors
Zuo, Lihan [1 ]
Wang, Zizhou [2 ]
Wang, Yan [2 ]
Affiliations
[1] Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu 610000, Peoples R China
[2] ASTAR, Inst High Performance Comp, Singapore 138632, Singapore
Keywords
Skin lesion classification; Multi-modal learning; Multi-label classification; Multi-modal information fusion; 7-POINT CHECKLIST; DERMOSCOPIC IMAGES; NEURAL-NETWORK
DOI
10.1016/j.artmed.2025.103091
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Skin cancer occurs frequently and has become a major contributor to both cancer incidence and mortality. Accurate and timely diagnosis of skin cancer holds the potential to save lives. Deep learning-based methods have demonstrated significant advancements in the screening of skin cancers. However, most current approaches rely on a single modality input for diagnosis, thereby missing out on valuable complementary information that could enhance accuracy. Although some multimodal-based methods exist, they often lack adaptability and fail to fully leverage multimodal information. In this paper, we introduce a novel uncertainty-based hybrid fusion strategy for a multi-modal learning algorithm aimed at skin cancer diagnosis. Our approach specifically combines three different modalities: clinical images, dermoscopy images, and metadata, to make the final classification. For the fusion of the two image modalities, we employ an intermediate fusion strategy that considers the similarity between clinical and dermoscopy images to extract features containing both complementary and correlated information. To capture the correlated information, we utilize cosine similarity, and we employ concatenation as the means for integrating complementary information. In the fusion of image and metadata modalities, we leverage uncertainty to obtain confident late fusion results, allowing our method to adaptively combine the information from different modalities. We conducted comprehensive experiments using a popular publicly available skin disease diagnosis dataset, and the results of these experiments demonstrate the effectiveness of our proposed method. Our proposed fusion algorithm could enhance the clinical applicability of automated skin lesion classification, offering a more robust and adaptive way to make automatic diagnoses with the help of an uncertainty mechanism. Code is available at https://github.com/Zuo-Lihan/CosCatNet-Adaptive_Fusion_Algorithm.
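The abstract outlines two fusion stages: an intermediate fusion of the image branches (cosine similarity for correlated information, concatenation for complementary information) and an uncertainty-weighted late fusion of image and metadata predictions. The minimal PyTorch sketch below illustrates those two ideas under stated assumptions; the function names, the averaging of the correlated component, the use of a softmax head, and entropy as the uncertainty measure are all illustrative choices, not the paper's actual implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def intermediate_fusion(clin_feat: torch.Tensor, derm_feat: torch.Tensor) -> torch.Tensor:
    """Sketch of similarity-aware intermediate fusion of two image-branch features.

    Correlated information: the cosine similarity between the clinical and
    dermoscopy feature vectors reweights their shared (averaged) component.
    Complementary information: plain concatenation of both vectors.
    """
    sim = F.cosine_similarity(clin_feat, derm_feat, dim=-1, eps=1e-8)   # (B,)
    correlated = sim.unsqueeze(-1) * (clin_feat + derm_feat) / 2        # (B, D)
    complementary = torch.cat([clin_feat, derm_feat], dim=-1)           # (B, 2D)
    return torch.cat([correlated, complementary], dim=-1)               # (B, 3D)

def uncertainty_late_fusion(logits_img: torch.Tensor, logits_meta: torch.Tensor) -> torch.Tensor:
    """Sketch of uncertainty-based late fusion: each branch's prediction is
    weighted by its confidence, here taken as inverse predictive entropy
    (an assumed proxy; the paper's uncertainty mechanism may differ)."""
    p_img = logits_img.softmax(dim=-1)
    p_meta = logits_meta.softmax(dim=-1)
    def entropy(p: torch.Tensor) -> torch.Tensor:
        return -(p * p.clamp_min(1e-8).log()).sum(dim=-1, keepdim=True)
    w_img = 1.0 / (entropy(p_img) + 1e-8)
    w_meta = 1.0 / (entropy(p_meta) + 1e-8)
    # Normalized confidence-weighted average; rows still sum to 1.
    return (w_img * p_img + w_meta * p_meta) / (w_img + w_meta)
```

A softmax head is used here only for compactness; a multi-label setting like the 7-point checklist would typically use per-label sigmoid outputs with a per-label binary-entropy confidence instead.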
Pages: 14
Related Papers (50 total)
  • [21] Multi-label movie genre classification based on multimodal fusion
    Zihui Cai
    Hongwei Ding
    Jinlu Wu
    Ying Xi
    Xuemeng Wu
    Xiaohui Cui
    Multimedia Tools and Applications, 2024, 83 : 36823 - 36840
  • [22] Multi-modal multi-label semantic indexing of images based on hybrid ensemble learning
    Li, Wei
    Sun, Maosong
    Habel, Christopher
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2007, 2007, 4810 : 744 - +
  • [23] Learning to Annotate Clothes in Everyday Photos: Multi-Modal, Multi-Label, Multi-Instance Approach
    Nogueira, Keiller
    Veloso, Adriano Alonso
    dos Santos, Jefersson A.
    2014 27TH SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 2014, : 327 - 334
  • [24] MIC: Breast Cancer Multi-label Diagnostic Framework Based on Multi-modal Fusion Interaction
    Chen, Ziyan
    Yi, Sanli
    JOURNAL OF IMAGING INFORMATICS IN MEDICINE, 2025,
  • [25] Complex Object Classification: A Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Transport
    Yang, Yang
    Wu, Yi-Feng
    Zhan, De-Chuan
    Liu, Zhi-Bin
    Jiang, Yuan
    KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2018, : 2594 - 2603
  • [26] Micro-video multi-label classification method based on multi-modal feature encoding
    Jing P.
    Li Y.
    Su Y.
    Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2022, 49 (04): : 109 - 117
  • [27] Multi-label remote sensing classification with self-supervised gated multi-modal transformers
    Liu, Na
    Yuan, Ye
    Wu, Guodong
    Zhang, Sai
    Leng, Jie
    Wan, Lihong
    FRONTIERS IN COMPUTATIONAL NEUROSCIENCE, 2024, 18
  • [28] Context Recognition In-the-Wild: Unified Model for Multi-Modal Sensors and Multi-Label Classification
    Vaizman, Yonatan
    Weibel, Nadir
    Lanckriet, Gert
    Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2017, 1 (04)
  • [29] Unconstrained Multimodal Multi-Label Learning
    Huang, Yan
    Wang, Wei
    Wang, Liang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2015, 17 (11) : 1923 - 1935
  • [30] A multi-label image classification method combining multi-stage image semantic information and label relevance
    Wu, Liwen
    Zhao, Lei
    Tang, Peigeng
    Pu, Bin
    Jin, Xin
    Zhang, Yudong
    Yao, Shaowen
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (09) : 3911 - 3925