A multi-stage multi-modal learning algorithm with adaptive multimodal fusion for improving multi-label skin lesion classification

Cited by: 0
Authors
Zuo, Lihan [1 ]
Wang, Zizhou [2 ]
Wang, Yan [2 ]
Affiliations
[1] Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu 610000, Peoples R China
[2] ASTAR, Inst High Performance Comp, Singapore 138632, Singapore
Keywords
Skin lesion classification; Multi-modal learning; Multi-label classification; Multi-modal information fusion; 7-POINT CHECKLIST; DERMOSCOPIC IMAGES; NEURAL-NETWORK;
DOI
10.1016/j.artmed.2025.103091
Chinese Library Classification
TP18 [Theory of artificial intelligence];
Discipline codes
081104; 0812; 0835; 1405
Abstract
Skin cancer occurs frequently and has become a major contributor to both cancer incidence and mortality. Accurate and timely diagnosis of skin cancer holds the potential to save lives. Deep learning-based methods have demonstrated significant advancements in the screening of skin cancers. However, most current approaches rely on a single modality input for diagnosis, thereby missing out on valuable complementary information that could enhance accuracy. Although some multimodal-based methods exist, they often lack adaptability and fail to fully leverage multimodal information. In this paper, we introduce a novel uncertainty-based hybrid fusion strategy for a multi-modal learning algorithm aimed at skin cancer diagnosis. Our approach specifically combines three different modalities: clinical images, dermoscopy images, and metadata, to make the final classification. For the fusion of the two image modalities, we employ an intermediate fusion strategy that considers the similarity between clinical and dermoscopy images to extract features containing both complementary and correlated information. To capture the correlated information, we utilize cosine similarity, and we employ concatenation as the means for integrating complementary information. In the fusion of image and metadata modalities, we leverage uncertainty to obtain confident late fusion results, allowing our method to adaptively combine the information from different modalities. We conducted comprehensive experiments using a popular publicly available skin disease diagnosis dataset, and the results of these experiments demonstrate the effectiveness of our proposed method. Our proposed fusion algorithm could enhance the clinical applicability of automated skin lesion classification, offering a more robust and adaptive way to make automatic diagnoses with the help of an uncertainty mechanism. Code is available at https://github.com/Zuo-Lihan/CosCatNet-Adaptive_Fusion_Algorithm.
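The abstract names two fusion ideas: an intermediate fusion of the two image modalities that uses cosine similarity for correlated information and concatenation for complementary information, and an uncertainty-weighted late fusion of image and metadata predictions. The sketch below illustrates both ideas in a minimal, illustrative form; it is not the authors' implementation (see the linked repository for that), and the specific weighting choices (similarity-scaled average for the correlated part, entropy-based confidence for the late fusion) are assumptions made here for concreteness.

```python
import numpy as np

def intermediate_fusion(f_clin: np.ndarray, f_derm: np.ndarray) -> np.ndarray:
    """Fuse clinical and dermoscopy feature vectors.

    Cosine similarity captures how correlated the two views are;
    concatenation preserves their complementary information.
    """
    cos = float(np.dot(f_clin, f_derm) /
                (np.linalg.norm(f_clin) * np.linalg.norm(f_derm) + 1e-8))
    # Similarity-scaled shared component (an illustrative choice).
    correlated = cos * (f_clin + f_derm) / 2.0
    # Concatenation keeps modality-specific (complementary) features.
    complementary = np.concatenate([f_clin, f_derm])
    return np.concatenate([correlated, complementary])

def uncertainty_late_fusion(probs_img: np.ndarray,
                            probs_meta: np.ndarray) -> np.ndarray:
    """Late-fuse two class-probability vectors, weighting each branch
    by an entropy-based confidence (1 = fully confident, 0 = uniform)."""
    def confidence(p: np.ndarray) -> float:
        entropy = -float(np.sum(p * np.log(p + 1e-8)))
        return 1.0 - entropy / np.log(len(p))
    w_img, w_meta = confidence(probs_img), confidence(probs_meta)
    fused = (w_img * probs_img + w_meta * probs_meta) / (w_img + w_meta + 1e-8)
    return fused / fused.sum()  # renormalize to a valid distribution
```

With a confident image branch (e.g. [0.9, 0.05, 0.05]) and a near-uniform metadata branch, the fused prediction follows the image branch almost entirely, which is the adaptive behavior the abstract describes.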
Pages: 14