Uncovering and Correcting Shortcut Learning in Machine Learning Models for Skin Cancer Diagnosis

Cited by: 26
Authors
Nauta, Meike [1, 2]
Walsh, Ricky [1 ]
Dubowski, Adam [1 ]
Seifert, Christin [2, 3]
Affiliations
[1] Univ Twente, Fac EEEMCS, NL-7500 AE Enschede, Netherlands
[2] Univ Duisburg Essen, Inst Artificial Intelligence Med, D-45131 Essen, Germany
[3] Canc Res Ctr Cologne Essen CCCE, D-45147 Essen, Germany
Keywords
deep learning; explainable AI; skin cancer diagnosis; inpainting; shortcut learning; model bias; confounding
DOI
10.3390/diagnostics12010040
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Machine learning models have been successfully applied for analysis of skin images. However, due to the black box nature of such deep learning models, it is difficult to understand their underlying reasoning. This prevents a human from validating whether the model is right for the right reasons. Spurious correlations and other biases in data can cause a model to base its predictions on such artefacts rather than on the true relevant information. These learned shortcuts can in turn cause incorrect performance estimates and can result in unexpected outcomes when the model is applied in clinical practice. This study presents a method to detect and quantify this shortcut learning in trained classifiers for skin cancer diagnosis, since it is known that dermoscopy images can contain artefacts. Specifically, we train a standard VGG16-based skin cancer classifier on the public ISIC dataset, for which colour calibration charts (elliptical, coloured patches) occur only in benign images and not in malignant ones. Our methodology artificially inserts those patches and uses inpainting to automatically remove patches from images to assess the changes in predictions. We find that our standard classifier partly bases its predictions of benign images on the presence of such a coloured patch. More importantly, by artificially inserting coloured patches into malignant images, we show that shortcut learning results in a significant increase in misdiagnoses, making the classifier unreliable when used in clinical practice. With our results, we, therefore, want to increase awareness of the risks of using black box machine learning models trained on potentially biased datasets. Finally, we present a model-agnostic method to neutralise shortcut learning by removing the bias in the training dataset by exchanging coloured patches with benign skin tissue using image inpainting and re-training the classifier on this de-biased dataset.
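The patch-insertion test described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy classifier, and the patch colour and geometry are all hypothetical, chosen only to show the idea of comparing predictions before and after a coloured elliptical patch is added. A real experiment would substitute a trained classifier (such as the VGG16-based model mentioned above) and would also need an inpainting step for patch removal, which is omitted here.

```python
import numpy as np

def insert_elliptical_patch(image, center, axes, colour):
    """Return (patched_copy, mask) for an (H, W, 3) float image in [0, 1].

    `center` is (row, col), `axes` is (row_radius, col_radius).
    All parameter names are illustrative assumptions.
    """
    h, w, _ = image.shape
    yy, xx = np.ogrid[:h, :w]
    cy, cx = center
    ry, rx = axes
    # Boolean mask of pixels inside the ellipse.
    mask = ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0
    patched = image.copy()
    patched[mask] = colour
    return patched, mask

def prediction_shift(predict_benign, images, **patch_kwargs):
    """Mean change in predicted benign probability after patch insertion."""
    before = np.array([predict_benign(img) for img in images])
    after = np.array([
        predict_benign(insert_elliptical_patch(img, **patch_kwargs)[0])
        for img in images
    ])
    return float(np.mean(after - before))

def toy_shortcut_classifier(image):
    """Hypothetical stand-in for a biased model.

    Its 'benign' score grows with the fraction of strongly blue pixels,
    mimicking a classifier that has learned a coloured-patch shortcut.
    """
    blue = (image[..., 2] > 0.8) & (image[..., 0] < 0.3)
    return float(min(1.0, 0.2 + 5.0 * blue.mean()))

if __name__ == "__main__":
    # A uniform skin-toned image stands in for a dermoscopy image.
    img = np.full((64, 64, 3), (0.8, 0.6, 0.5))
    shift = prediction_shift(
        toy_shortcut_classifier, [img],
        center=(16, 16), axes=(8, 12), colour=(0.1, 0.2, 0.9),
    )
    print(f"mean shift in benign score: {shift:.3f}")
```

A positive shift on images that contain no lesion-related change suggests that the model reacts to the patch itself; applied to malignant images, the same comparison would indicate how often an inserted patch flips the prediction.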
Pages: 18
Related papers
50 in total
  • [1] The future of skin cancer diagnosis: a comprehensive systematic literature review of machine learning and deep learning models
    Adamu, Shamsuddeen
    Alhussian, Hitham
    Aziz, Norshakirah
    Abdulkadir, Said Jadid
    Alwadin, Ayed
    Imam, Abdullahi Abubakar
    Abdullahi, Mujaheed
    Garba, Aliyu
    Saidu, Yahaya
    [J]. COGENT ENGINEERING, 2024, 11 (01)
  • [2] Diagnosis of skin cancer using machine learning techniques
    Murugan, A.
    Nair, S. Anu H.
    Preethi, A. Angelin Peace
    Kumar, K. P. Sanal
    [J]. MICROPROCESSORS AND MICROSYSTEMS, 2021, 81
  • [3] Comparison of Machine Learning Algorithms Used for Skin Cancer Diagnosis
    Bistro, Marta
    Piotrowski, Zbigniew
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (19)
  • [4] A Non-Invasive Interpretable Diagnosis of Melanoma Skin Cancer Using Deep Learning and Ensemble Stacking of Machine Learning Models
    Alfi, Iftiaz A.
    Rahman, Md Mahfuzur
    Shorfuzzaman, Mohammad
    Nazir, Amril
    [J]. DIAGNOSTICS, 2022, 12 (03)
  • [5] Explainable machine learning models for early gastric cancer diagnosis
    Du, Hongyang
    Yang, Qingfen
    Ge, Aimin
    Zhao, Chenhao
    Ma, Yunhua
    Wang, Shuyu
    [J]. SCIENTIFIC REPORTS, 2024, 14 (01)
  • [6] By artificial intelligence algorithms and machine learning models to diagnosis cancer
    Agarwal, Seema
    Yadav, Ajay Singh
    Dinesh, Vennapoosa
    Vatsav, Kolluru Sai Sri
    Prakash, Kolluru Sai Surya
    Jaiswal, Sushma
    [J]. Materials Today: Proceedings, 2023, 80: 2969 - 2975
  • [7] Machine Learning and Cancer Diagnosis
    Adamson, Adewole S.
    [J]. SCIENTIST, 2021, 35 (02): 14 - 15
  • [8] Machine Learning Approaches for Skin Neoplasm Diagnosis
    Asaduzzaman, Abu
    Thompson, Christian C.
    Uddin, Md. Jashim
    [J]. ACS OMEGA, 2024, 9 (30): 32853 - 32863
  • [9] Clinically Relevant Vulnerabilities of Deep Machine Learning Systems for Skin Cancer Diagnosis
    Du-Harpur, Xinyi
    Arthurs, Callum
    Ganier, Clarisse
    Woolf, Rick
    Laftah, Zainab
    Lakhan, Manpreet
    Salam, Amr
    Wan, Bo
    Watt, Fiona M.
    Luscombe, Nicholas M.
    Lynch, Magnus D.
    [J]. JOURNAL OF INVESTIGATIVE DERMATOLOGY, 2021, 141 (04): 916 - 920
  • [10] Uncovering cancer vulnerabilities by machine learning prediction of synthetic lethality
    Benfatto, Salvatore
    Sercin, Ozdemirhan
    Dejure, Francesca R.
    Abdollahi, Amir
    Zenke, Frank T.
    Mardin, Balca R.
    [J]. MOLECULAR CANCER, 2021, 20 (01)