Generalizable Feature Learning in the Presence of Data Bias and Domain Class Imbalance with Application to Skin Lesion Classification

Cited by: 24
Authors
Yoon, Chris [1]
Hamarneh, Ghassan [2]
Garbi, Rafeef [1]
Affiliations
[1] Univ British Columbia, BiSICL, Vancouver, BC, Canada
[2] Simon Fraser Univ, Med Image Anal Lab, Burnaby, BC, Canada
DOI
10.1007/978-3-030-32251-9_40
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Training generalizable data-driven models for medical imaging applications is especially challenging because acquiring and accessing sufficiently large medical datasets is often infeasible. When trained on limited datasets, a high-capacity model, as most leading neural network architectures are, is likely to overfit and thus generalize poorly to unseen data. Further aggravating the problem, data used to train models in medicine are typically collected in silos and from narrow data distributions that are determined by specific acquisition hardware, imaging protocols, and patient demographics. In addition, class imbalance within and across datasets is a common complication, as disease conditions or subtypes have varying degrees of prevalence. In this paper, we motivate the need for generalizable training in the context of skin lesion classification by evaluating the performance of ResNet across 7 public datasets with dataset bias and class imbalance. To mitigate dataset bias, we extend the classification and contrastive semantic alignment (CCSA) loss, which aims to learn domain-invariant features. As the CCSA loss requires labelled data from two domains, we propose a strategy to dynamically sample paired data in a setting where the set of available classes varies across domains. To encourage learning from underrepresented classes, the sampled class probabilities are used to weight the classification and alignment losses. Experimental results demonstrate improved generalizability, as measured by the mean macro-average recall across the 7 datasets, when training with the weighted CCSA loss and dynamic sampler.
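The evaluation metric named in the abstract, mean macro-average recall, can be illustrated with a minimal sketch. This is not the authors' code: the function names `macro_average_recall` and `mean_macro_recall` and the toy labels are assumptions made for illustration, and the sketch assumes recall is averaged over the classes present in each dataset's ground truth.

```python
from collections import defaultdict


def macro_average_recall(y_true, y_pred):
    """Unweighted mean of per-class recall over classes present in y_true.

    Each class contributes equally, so a rare class counts as much as a
    common one. (Hypothetical helper, not taken from the paper.)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def mean_macro_recall(per_dataset):
    """Average the macro-average recall over several (y_true, y_pred) pairs,
    one pair per dataset. (Hypothetical helper.)"""
    scores = [macro_average_recall(t, p) for t, p in per_dataset]
    return sum(scores) / len(scores)


# Toy imbalanced example: predicting the majority class everywhere gives
# 80% accuracy but only 0.5 macro-average recall.
y_true = ["nevus"] * 8 + ["melanoma"] * 2
y_pred = ["nevus"] * 10
print(macro_average_recall(y_true, y_pred))  # 0.5
```

Because every class is weighted equally, this metric penalizes a model that ignores underrepresented classes, which is why it suits the class-imbalanced setting described above.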
Pages: 365-373
Page count: 9