MM-HiFuse: multi-modal multi-task hierarchical feature fusion for esophagus cancer staging and differentiation classification

Cited by: 0
Authors
Huo, Xiangzuo [1, 3]
Tian, Shengwei [2 ]
Yu, Long [2 ]
Zhang, Wendong [2 ]
Li, Aolun [2 ]
Yang, Qimeng [2 ]
Song, Jinmiao [2 ]
Affiliations
[1] Tianjin Agr Univ, Sch Comp & Informat Engn, Tianjin 300384, Peoples R China
[2] Xinjiang Univ, Sch Software, Urumqi 830000, Xinjiang, Peoples R China
[3] Minist Agr & Rural Affairs, Key Lab Smart Breeding Coconstruct Minist & Prov, Beijing 100125, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Esophagus Cancer; Multi-modal Multi-task Learning; Feature Fusion; Hybrid Network; Self-attention; DIAGNOSIS;
DOI
10.1007/s40747-024-01708-5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Esophageal cancer is a globally significant but understudied cancer with a high mortality rate. The staging and differentiation of esophageal cancer are crucial in determining the prognosis and surgical treatment plan for patients and in improving their chances of survival. Endoscopy and histopathological examination are considered the gold standard for esophageal cancer diagnosis. However, some previous studies have employed deep learning-based methods for esophageal cancer analysis that are limited to single-modal features, resulting in inadequate classification results. In response to these limitations, multi-modal learning has emerged as a promising alternative for medical image analysis tasks. In this paper, we propose a hierarchical feature fusion network, MM-HiFuse, for multi-modal multi-task learning to improve the classification accuracy of esophageal cancer staging and differentiation level. The proposed architecture combines low-level to deep-level features of both pathological and endoscopic images to achieve accurate classification results. The key characteristics of MM-HiFuse include: (i) a parallel hierarchy of convolution and self-attention layers specifically designed for pathological and endoscopic image features; (ii) a multi-modal hierarchical feature fusion module (MHF) and a new multi-task weighted combination loss function. These features enable the effective extraction of multi-modal representations at different semantic scales and let the two tasks complement each other, leading to improved classification performance. Experimental results demonstrate that MM-HiFuse outperforms single-modal methods in esophageal cancer staging and differentiation classification. Our findings provide evidence for the early diagnosis and accurate staging of esophageal cancer and serve as a new inspiration for the application of multi-modal multi-task learning in medical image analysis.
Code is available at https://github.com/huoxiangzuo/MM-HiFuse.
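The abstract mentions a multi-task weighted combination loss but does not give its form. As a rough illustration only, the sketch below shows one common way to combine two classification losses: a weighted sum of softmax cross-entropies, one for staging and one for differentiation. The function names, the fixed weights `w_staging` and `w_diff`, and the class counts are all assumptions made for this example; they are not taken from the paper or its repository.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, via a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def weighted_multitask_loss(staging_logits, staging_label,
                            diff_logits, diff_label,
                            w_staging=0.5, w_diff=0.5):
    """Generic weighted sum of two task losses.

    The weights are hypothetical fixed values; the paper's actual
    weighting scheme is not described in the abstract.
    """
    loss_staging = cross_entropy(staging_logits, staging_label)
    loss_diff = cross_entropy(diff_logits, diff_label)
    return w_staging * loss_staging + w_diff * loss_diff


if __name__ == "__main__":
    # Made-up logits: three staging classes, two differentiation classes.
    loss = weighted_multitask_loss([2.0, 0.5, -1.0], 0, [0.1, 1.2], 1)
    print(f"combined loss: {loss:.4f}")
```

In a sketch like this, raising one weight pushes training toward that task; how the authors balance the two tasks would need to be checked against the linked code.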
Pages: 12
Related papers
50 in total
  • [1] Multi-modal multi-task feature fusion for RGBT tracking
    Cai, Yujue
    Sui, Xiubao
    Gu, Guohua
    INFORMATION FUSION, 2023, 97
  • [2] Large Margin Multi-Modal Multi-Task Feature Extraction for Image Classification
    Luo, Yong
    Wen, Yonggang
    Tao, Dacheng
    Gui, Jie
    Xu, Chao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (01) : 414 - 427
  • [3] Multi-modal microblog classification via multi-task learning
    Sicheng Zhao
    Hongxun Yao
    Sendong Zhao
    Xuesong Jiang
    Xiaolei Jiang
    Multimedia Tools and Applications, 2016, 75 : 8921 - 8938
  • [4] Multi-modal microblog classification via multi-task learning
    Zhao, Sicheng
    Yao, Hongxun
    Zhao, Sendong
    Jiang, Xuesong
    Jiang, Xiaolei
    MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (15) : 8921 - 8938
  • [5] MBFusion: Multi-modal balanced fusion and multi-task learning for cancer diagnosis and prognosis
    Zhang, Ziye
    Yin, Wendong
    Wang, Shijin
    Zheng, Xiaorou
    Dong, Shoubin
    Computers in Biology and Medicine, 2024, 181
  • [6] Multi-Modal Fusion for Multi-Task Fuzzy Detection of Rail Anomalies
    Liyuan, Yang
    Osman, Ghazali
    Abdul Rahman, Safawi
    Mustapha, Muhammad Firdaus
    IEEE ACCESS, 2024, 12 : 73925 - 73935
  • [7] Multi-task Classification Model Based On Multi-modal Glioma Data
    Li, Jialun
    Jin, Yuanyuan
    Yu, Hao
    Wang, Xiaoling
    Zhuang, Qiyuan
    Chen, Liang
    11TH IEEE INTERNATIONAL CONFERENCE ON KNOWLEDGE GRAPH (ICKG 2020), 2020, : 165 - 172
  • [8] Landmark Classification With Hierarchical Multi-Modal Exemplar Feature
    Zhu, Lei
    Shen, Jialie
    Jin, Hai
    Xie, Liang
    Zheng, Ran
    IEEE TRANSACTIONS ON MULTIMEDIA, 2015, 17 (07) : 981 - 993
  • [9] Hierarchical Multi-Task Learning for Diagram Question Answering with Multi-Modal Transformer
    Yuan, Zhaoquan
    Peng, Xiao
    Wu, Xiao
    Xu, Changsheng
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1313 - 1321
  • [10] A multi-modal fusion framework based on multi-task correlation learning for cancer prognosis prediction
    Tan, Kaiwen
    Huang, Weixian
    Liu, Xiaofeng
    Hu, Jinlong
    Dong, Shoubin
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2022, 126