Breast cancer diagnosis through knowledge distillation of Swin transformer-based teacher-student models

Cited by: 0
Authors
Kolla, Bhavannarayanna [1 ]
Venugopal, P. [1 ]
Affiliations
[1] Vellore Inst Technol, Sch Elect Engn, Vellore 632014, Tamil Nadu, India
Source
Keywords
teacher model; student model; Swin transformers; transfer learning; knowledge distillation; breast cancer histopathology
DOI
10.1088/2632-2153/ad10cc
CLC Number
TP18 [Artificial intelligence theory]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Breast cancer is a significant global health concern, underscoring the need for timely and accurate diagnosis to improve survival rates. Traditional diagnostic methods rely on pathologists analyzing whole-slide images (WSIs) to identify and diagnose malignancies. This task is complex, demands specialized expertise, and imposes a substantial workload on pathologists. Moreover, the deep learning models commonly employed for classifying histopathology images often require further refinement before they are suitable for real-time deployment on WSIs, especially when trained on small regions of interest (ROIs). This article introduces two Swin transformer-based architectures: a moderately sized teacher model and a lightweight student model. Both models are trained on a publicly available dataset of breast cancer histopathology images, focusing on ROIs at varying magnification factors. Transfer learning is used to train the teacher model, and knowledge distillation (KD) transfers its capabilities to the student model. To improve validation accuracy and minimize the total loss in KD, we employ the state-action-reward-state-action (SARSA) reinforcement learning algorithm, which dynamically computes the temperature and a weighting factor throughout the KD process to achieve high accuracy within a considerably shorter training time. The student model is then deployed to analyze malignancies in WSIs. Although the student model has only one-third the size and FLOPs of the teacher model, it achieves an accuracy of 98.71%, only slightly below the teacher's 98.91%. Experimental results show that the student model can process WSIs at a throughput of 1.67 samples s⁻¹ with an accuracy of 82%. The proposed student model, trained with KD and the SARSA algorithm, exhibits promising performance in breast cancer classification and WSI analysis.
These findings indicate its potential to assist pathologists in diagnosing breast cancer accurately and efficiently.
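The distillation objective that the abstract's temperature and weighting factor parameterize can be sketched as a standard Hinton-style KD loss. This is an illustrative NumPy sketch, not the authors' implementation: the function names are hypothetical, and the temperature `T` and weight `alpha` appear here as plain arguments, whereas in the paper they are the two quantities the SARSA agent adjusts during training.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T softens the distribution."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Hinton-style knowledge-distillation loss:
    alpha * T^2 * KL(teacher_soft || student_soft)
    + (1 - alpha) * cross-entropy(student, hard labels).
    The T^2 factor keeps the soft-target gradient scale comparable
    across temperatures."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    hard = softmax(student_logits, 1.0)
    ce = -np.log(hard[np.arange(len(labels)), labels] + 1e-12)
    return float(np.mean(alpha * (T ** 2) * kl + (1.0 - alpha) * ce))
```

When the student's logits match the teacher's, the KL term vanishes and only the weighted hard-label cross-entropy remains; a SARSA agent, as described above, would treat (`T`, `alpha`) as its action and the resulting validation accuracy/loss as its reward.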
Pages: 12
Related Papers
50 records total
  • [21] Using teacher-student neural networks based on knowledge distillation to detect anomalous samples in the otolith images
    Chen, Yuwen
    Zhu, Guoping
    ZOOLOGY, 2023, 161
  • [22] Teacher-Student Synergetic Knowledge Distillation for Detecting Alcohol Consumption in NIR Iris Images
    Singh, Sanskar
    Patel, Ravil
    Tyagi, Vandit
    Singh, Avantika
    COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2023, PT II, 2023, 14185 : 162 - 171
  • [23] Knowledge in attention assistant for improving generalization in deep teacher-student models
    Morabbi, Sajedeh
    Soltanizadeh, Hadi
    Mozaffari, Saeed
    Fadaeieslam, Mohammad Javad
    Sana, Shib Sankar
    INTERNATIONAL JOURNAL OF MODELLING AND SIMULATION, 2024,
  • [24] An adaptive teacher-student learning algorithm with decomposed knowledge distillation for on-edge intelligence
    Sepahvand, Majid
    Abdali-Mohammadi, Fardin
    Taherkordi, Amir
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 117
  • [25] Improving Transformer-based Program Repair Models through False Behavior Diagnosis
    Kim, Youngkyoung
    Kim, Misoo
    Leek, Eunseok
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 14010 - 14023
  • [26] TC3KD: Knowledge distillation via teacher-student cooperative curriculum customization
    Wang, Chaofei
    Yang, Ke
    Zhang, Shaowei
    Huang, Gao
    Song, Shiji
    NEUROCOMPUTING, 2022, 508 : 284 - 292
  • [27] Knowledge Distillation and Transformer-Based Framework for Automatic Spine CT Report Generation
    Batool, Humaira
    Mukhtar, Asmat
    Gul Khawaja, Sajid
    Alghamdi, Norah Saleh
    Mansoor Khan, Asad
    Qayyum, Adil
    Adil, Ruqqayia
    Khan, Zawar
    Usman Akram, Muhammad
    Usman Akbar, Muhammad
    Eklund, Anders
    IEEE ACCESS, 2025, 13 : 42949 - 42964
  • [28] LRCTNet: A lightweight rectal cancer T-staging network based on knowledge distillation via a pretrained swin transformer
    Yan, Jia
    Liu, Peng
    Xiong, Tingwei
    Han, Mingye
    Jia, Qingzhu
    Gao, Yixing
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 105
  • [29] Research on disease diagnosis based on teacher-student network and Raman spectroscopy
    Chen, Zishuo
    Tian, Xuecong
    Chen, Chen
    Chen, Cheng
    LASERS IN MEDICAL SCIENCE, 2024, 39 (01)
  • [30] Simplified Knowledge Distillation for Deep Neural Networks Bridging the Performance Gap with a Novel Teacher-Student Architecture
    Umirzakova, Sabina
    Abdullaev, Mirjamol
    Mardieva, Sevara
    Latipova, Nodira
    Muksimova, Shakhnoza
    ELECTRONICS, 2024, 13 (22)