Transformer-based multiple instance learning network with 2D positional encoding for histopathology image classification

Authors
Bin Yang [1]
Lei Ding [2]
Jianqiang Li [2]
Yong Li [2]
Guangzhi Qu [2]
Jingyi Wang [3]
Qiang Wang [2]
Bo Liu [2]
Affiliations
[1] Academy of Military Science, Center for Strategic Assessment and Consulting
[2] Beijing University of Technology, Faculty of Information Technology
[3] Oakland University, Computer Science and Engineering Department
[4] Massey University, School of Mathematical and Computational Sciences
Keywords
Weakly supervised training; Image classification; Multiple instance learning
DOI
10.1007/s40747-025-01779-y
Abstract
Digital medical imaging, particularly pathology imaging, is essential for cancer diagnosis, but the extremely high resolution of these images makes direct model training difficult. Although weakly supervised learning has reduced the need for manual annotations, many multiple instance learning (MIL) methods struggle to capture the spatial relationships that matter in histopathological images. Existing methods that incorporate positional information often overlook nuanced spatial correlations or use positional encoding strategies that do not fully capture the spatial structure of pathology images. To address this issue, we propose a new framework named TMIL (Transformer-based Multiple Instance Learning Network with 2D positional encoding), which uses multiple instance learning for weakly supervised classification of histopathological images. TMIL incorporates a Transformer-based 2D positional encoding module to model positional information and explore correlations between instances. Furthermore, TMIL divides histopathological images into pseudo-bags and trains patch-level feature vectors with deep metric learning to enhance classification performance. The proposed approach is evaluated on a public colorectal adenoma dataset. The experimental results show that TMIL outperforms existing MIL methods, achieving an AUC of 97.28% and an ACC of 95.19%. These findings suggest that TMIL’s integration of deep metric learning and positional encoding offers a promising approach for improving the efficiency and accuracy of pathology image analysis in cancer diagnosis.
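The two ideas named in the abstract, a 2D positional encoding for patches and a split of each slide into pseudo-bags, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the encoding formula or the splitting rule, so the sketch assumes a fixed sinusoidal encoding (half the channels for the patch row, half for the column) and a random partition. The function names `sincos_1d`, `positional_encoding_2d`, and `split_into_pseudo_bags` are hypothetical.

```python
import math
import random


def sincos_1d(position, dim):
    """Sinusoidal encoding of one integer position into `dim` values (assumed scheme)."""
    assert dim % 2 == 0, "dim must be even"
    enc = []
    for i in range(dim // 2):
        freq = 1.0 / (10000 ** (2 * i / dim))
        enc.append(math.sin(position * freq))
        enc.append(math.cos(position * freq))
    return enc


def positional_encoding_2d(row, col, dim):
    """Encode a patch's (row, col) grid position.

    Assumption: the first half of the channels encodes the row and the
    second half encodes the column, so swapping row and col changes the code.
    """
    assert dim % 4 == 0, "dim must be divisible by 4"
    half = dim // 2
    return sincos_1d(row, half) + sincos_1d(col, half)


def split_into_pseudo_bags(patches, num_bags, seed=0):
    """Randomly partition a slide's patches into `num_bags` pseudo-bags.

    Assumption: each pseudo-bag would inherit the slide-level label during
    weakly supervised training; the paper's actual splitting rule may differ.
    """
    rng = random.Random(seed)
    shuffled = list(patches)
    rng.shuffle(shuffled)
    return [shuffled[i::num_bags] for i in range(num_bags)]


# Usage: a 4 x 4 grid of patches with placeholder 8-dimensional features.
coords = [(r, c) for r in range(4) for c in range(4)]
features = {rc: [0.0] * 8 for rc in coords}

# Add the positional code to each patch feature before a Transformer layer.
encoded = {
    rc: [f + p for f, p in zip(features[rc], positional_encoding_2d(rc[0], rc[1], 8))]
    for rc in coords
}

# Split the slide's patch coordinates into 4 pseudo-bags.
bags = split_into_pseudo_bags(coords, num_bags=4)
```

Encoding rows and columns separately keeps patches that are vertically adjacent on the slide close in the positional code, which a flattened 1D index would not do.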
Related papers
50 in total
  • [41] FR-MIL: Distribution Re-Calibration-Based Multiple Instance Learning With Transformer for Whole Slide Image Classification
    Chikontwe, Philip
    Kim, Meejeong
    Jeong, Jaehoon
    Sung, Hyun Jung
    Go, Heounjeong
    Nam, Soo Jeong
    Park, Sang Hyun
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2025, 44 (01) : 409 - 421
  • [42] Interactive CNN and Transformer-Based Cross-Attention Fusion Network for Medical Image Classification
    Cai, Shu
    Zhang, Qiude
    Wang, Shanshan
    Hu, Junjie
    Zeng, Liang
    Li, Kaiyan
    INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 2025, 35 (03)
  • [43] ScoreNet: Learning Non-Uniform Attention and Augmentation for Transformer-Based Histopathological Image Classification
    Stegmuller, Thomas
    Bozorgtabar, Behzad
    Spahr, Antoine
    Thiran, Jean-Philippe
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 6159 - 6168
  • [44] TRF-Net: a transformer-based RGB-D fusion network for desktop object instance segmentation
    Cao, He
    Zhang, Yunzhou
    Shan, Dexing
    Liu, Xiaozheng
    Zhao, Jiaqi
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (28) : 21309 - 21330
  • [45] TRF-Net: a transformer-based RGB-D fusion network for desktop object instance segmentation
    He Cao
    Yunzhou Zhang
    Dexing Shan
    Xiaozheng Liu
    Jiaqi Zhao
    Neural Computing and Applications, 2023, 35 : 21309 - 21330
  • [46] 2LSPE: 2D Learnable Sinusoidal Positional Encoding using Transformer for Scene Text Recognition
    Raisi, Zobeir
    Naiel, Mohamed A.
    Younes, Georges
    Wardell, Steven
    Zelek, John
    2021 18TH CONFERENCE ON ROBOTS AND VISION (CRV 2021), 2021, : 119 - 126
  • [47] 2D Footprint Classification Based on Multiple-Module Relation Network
    Zhang Y.
    Wu L.
    Wang N.
    Meng S.
    Hu F.
    Lu X.
    Huanan Ligong Daxue Xuebao/Journal of South China University of Technology (Natural Science), 2021, 49 (06) : 66 - 76
  • [48] Nuclei-level prior knowledge constrained multiple instance learning for breast histopathology whole slide image classification
    Wang, Xunping
    Yuan, Wei
    ISCIENCE, 2024, 27 (06)
  • [49] Multiple kernel-based multi-instance learning algorithm for image classification
    Li, Daxiang
    Wang, Jing
    Zhao, Xiaoqiang
    Liu, Ying
    Wang, Dianwei
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2014, 25 (05) : 1112 - 1117
  • [50] Multiple-instance learning based decision neural networks for image retrieval and classification
    Xu, Yeong-Yuh
    NEUROCOMPUTING, 2016, 171 : 826 - 836