A multi-stage multi-modal learning algorithm with adaptive multimodal fusion for improving multi-label skin lesion classification

Cited: 0
Authors
Zuo, Lihan [1 ]
Wang, Zizhou [2 ]
Wang, Yan [2 ]
Affiliations
[1] Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu 610000, Peoples R China
[2] ASTAR, Inst High Performance Comp, Singapore 138632, Singapore
Keywords
Skin lesion classification; Multi-modal learning; Multi-label classification; Multi-modal information fusion; 7-POINT CHECKLIST; DERMOSCOPIC IMAGES; NEURAL-NETWORK;
DOI
10.1016/j.artmed.2025.103091
CLC number
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Skin cancer occurs frequently and has become a major contributor to both cancer incidence and mortality. Accurate and timely diagnosis of skin cancer holds the potential to save lives. Deep learning-based methods have demonstrated significant advancements in the screening of skin cancers. However, most current approaches rely on a single input modality for diagnosis, thereby missing valuable complementary information that could enhance accuracy. Although some multimodal methods exist, they often lack adaptability and fail to fully leverage multimodal information. In this paper, we introduce a novel uncertainty-based hybrid fusion strategy for a multi-modal learning algorithm aimed at skin cancer diagnosis. Our approach combines three different modalities: clinical images, dermoscopy images, and metadata, to make the final classification. For the fusion of the two image modalities, we employ an intermediate fusion strategy that considers the similarity between clinical and dermoscopy images to extract features containing both complementary and correlated information. To capture the correlated information, we utilize cosine similarity, and we employ concatenation to integrate the complementary information. In the fusion of image and metadata modalities, we leverage uncertainty to obtain confident late-fusion results, allowing our method to adaptively combine the information from different modalities. We conducted comprehensive experiments on a popular publicly available skin disease diagnosis dataset, and the results demonstrate the effectiveness of our proposed method. Our proposed fusion algorithm could enhance the clinical applicability of automated skin lesion classification, offering a more robust and adaptive way to make automatic diagnoses with the help of an uncertainty mechanism. Code is available at https://github.com/Zuo-Lihan/CosCatNet-Adaptive_Fusion_Algorithm.
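The abstract describes two fusion mechanisms: an intermediate fusion of clinical and dermoscopy features via cosine similarity (correlated information) plus concatenation (complementary information), and an uncertainty-guided late fusion of image and metadata predictions. The following NumPy sketch illustrates the general idea only; it is not the paper's implementation, all function names are hypothetical, and inverse-entropy weighting is used here merely as one plausible stand-in for the uncertainty mechanism, whose exact form the abstract does not specify.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def intermediate_fusion(clin_feat: np.ndarray, derm_feat: np.ndarray) -> np.ndarray:
    """Fuse clinical and dermoscopy features: the cosine similarity carries
    correlated information, while concatenation preserves complementary
    information from each image modality."""
    sim = cosine_similarity(clin_feat, derm_feat)
    return np.concatenate([clin_feat, derm_feat, [sim]])

def entropy(p: np.ndarray) -> float:
    """Shannon entropy of a probability vector, used as an uncertainty proxy."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def uncertainty_weighted_late_fusion(p_img: np.ndarray,
                                     p_meta: np.ndarray) -> np.ndarray:
    """Late fusion of image-branch and metadata-branch predictions:
    each branch is weighted inversely to its predictive entropy, so the
    more confident branch dominates the combined prediction."""
    w_img = 1.0 / (entropy(p_img) + 1e-12)
    w_meta = 1.0 / (entropy(p_meta) + 1e-12)
    fused = (w_img * p_img + w_meta * p_meta) / (w_img + w_meta)
    return fused / fused.sum()
```

With a confident image branch (e.g. [0.9, 0.05, 0.05]) and an uncertain metadata branch (e.g. [0.4, 0.3, 0.3]), the fused prediction stays closer to the confident branch than a plain average would, which is the adaptive behavior the abstract attributes to the uncertainty mechanism.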
Pages: 14
Related papers (50 records)
  • [1] FusionM4Net: A multi-stage multi-modal learning algorithm for multi-label skin lesion classification
    Tang, Peng
    Yan, Xintong
    Nan, Yang
    Xiang, Shao
    Krammer, Sebastian
    Lasser, Tobias
    MEDICAL IMAGE ANALYSIS, 2022, 76
  • [2] Multi-modal bilinear fusion with hybrid attention mechanism for multi-label skin lesion classification
    Wei, Yun
    Ji, Lin
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (24) : 65221 - 65247
  • [3] ISAFusionNet: Involution and soft attention based deep multi-modal fusion network for multi-label skin lesion classification
    Mohammed, Hussein M. A.
    Omeroglu, Asli Nur
    Oral, Emin Argun
    Ozbek, I. Yucel
    COMPUTERS & ELECTRICAL ENGINEERING, 2025, 122
  • [4] A novel soft attention-based multi-modal deep learning framework for multi-label skin lesion classification
    Omeroglu, Asli Nur
    Mohammed, Hussein M. A.
    Oral, Emin Argun
    Aydin, Serdar
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 120
  • [5] Collaboration based multi-modal multi-label learning
    Zhang, Yi
    Zhu, Yinlong
    Zhang, Zhecheng
    Wang, Chongjun
    APPLIED INTELLIGENCE, 2022, 52 (12) : 14204 - 14217
  • [7] Multi-modal Contextual Prompt Learning for Multi-label Classification with Partial Labels
    Wang, Rui
    Pan, Zhengxin
    Wu, Fangyu
    Lv, Yifan
    Zhang, Bailing
    2024 16TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, ICMLC 2024, 2024, : 517 - 524
  • [8] Rethinking Modal-oriented Label Correlations for Multi-modal Multi-label Learning
    Zhang, Yi
    Shen, Jundong
    Zhang, Zhecheng
    Zhang, Lei
    Wang, Chongjun
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [9] A Deep Multi-Modal CNN for Multi-Instance Multi-Label Image Classification
    Song, Lingyun
    Liu, Jun
    Qian, Buyue
    Sun, Mingxuan
    Yang, Kuan
    Sun, Meng
    Abbas, Samar
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (12) : 6025 - 6038
  • [10] MULTIMODAL LEARNING FOR MULTI-LABEL IMAGE CLASSIFICATION
    Pang, Yanwei
    Ma, Zhao
    Yuan, Yuan
    Li, Xuelong
    Wang, Kongqiao
    2011 18TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2011, : 1797 - 1800