Correcting data imbalance for semi-supervised COVID-19 detection using X-ray chest images

Cited by: 20
Authors
Calderon-Ramirez, Saul [1, 2]
Yang, Shengxiang [1]
Moemeni, Armaghan [3]
Elizondo, David [1]
Colreavy-Donnelly, Simon [1]
Chavarria-Estrada, Luis Fernando [4]
Molina-Cabello, Miguel A. [5, 6]
Affiliations
[1] De Montfort Univ, Ctr Computat Intelligence CCI, Leicester, Leics, England
[2] Inst Tecnol Costa Rica, Cartago, Costa Rica
[3] Univ Nottingham, Sch Comp Sci, Nottingham, England
[4] Imagenes Med Dr Chavarria Estrada, San Jose, Costa Rica
[5] Univ Malaga, Dept Comp Languages & Comp Sci, Malaga, Spain
[6] Inst Invest Biomed Malaga IBIMA, Malaga, Spain
Keywords
Coronavirus; COVID-19; Computer aided diagnosis; Data imbalance; Semi-supervised learning; DEEP; RADIOLOGY; FEATURES;
DOI
10.1016/j.asoc.2021.107692
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A key factor in the fight against viral diseases such as the coronavirus (COVID-19) is identifying virus carriers as early and quickly as possible, in a cheap and efficient manner. Applying deep learning to the classification of chest X-ray images of COVID-19 patients could become a useful pre-diagnostic detection methodology. However, deep learning architectures require large labelled datasets. This is often a limitation when the subject of research is relatively new, as in the case of the virus outbreak, where dealing with small labelled datasets is a challenge. Moreover, in such a context the datasets are also highly imbalanced, with few observations from positive cases of the new disease. In this work we evaluate the performance of the semi-supervised deep learning architecture known as MixMatch with a very limited number of labelled observations and highly imbalanced labelled datasets. We demonstrate the critical impact of data imbalance on the model's accuracy. We therefore propose a simple approach for correcting data imbalance: re-weighting each observation in the loss function, giving a higher weight to observations from the under-represented class. For unlabelled observations, we use the pseudo and augmented labels calculated by MixMatch to choose the appropriate weight. The proposed method improved classification accuracy by up to 18% with respect to the non-balanced MixMatch algorithm. We tested the proposed approach on several available datasets using 10, 15 and 20 labelled observations for binary classification (COVID-19 positive and normal cases). For multi-class classification (COVID-19 positive, pneumonia and normal cases), we tested 30, 50, 70 and 90 labelled observations. Additionally, a new dataset is included among the tested datasets, composed of chest X-ray images of Costa Rican adult patients. (C) 2021 Elsevier B.V. All rights reserved.
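The re-weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the inverse-frequency weighting rule, the hard labels for labelled observations, and all function names (`inverse_frequency_weights`, `weighted_cross_entropy`, `weighted_unlabelled_loss`) are assumptions made for illustration. The only details taken from MixMatch itself are the cross-entropy term for labelled data and the squared-error term for unlabelled data.

```python
import math


def inverse_frequency_weights(labels, num_classes):
    """Per-class weights inversely proportional to class counts, normalised to sum to 1.

    Assumption: inverse frequency is one plausible way to give the
    under-represented class a higher weight; the abstract does not fix the rule.
    """
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    # A class absent from the labelled set is treated as count 1 to avoid division by zero.
    inv = [1.0 / max(c, 1) for c in counts]
    total = sum(inv)
    return [w / total for w in inv]


def weighted_cross_entropy(probs, labels, class_weights):
    """Mean cross-entropy over labelled observations, each scaled by its class weight."""
    losses = [-class_weights[y] * math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


def weighted_unlabelled_loss(probs, pseudo_labels, class_weights):
    """Mean squared error against soft pseudo-labels.

    The weight for each unlabelled observation is looked up with the most
    likely class of its pseudo-label (an assumed selection rule).
    """
    losses = []
    for p, q in zip(probs, pseudo_labels):
        k = max(range(len(q)), key=lambda i: q[i])  # argmax of the pseudo-label
        squared_error = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
        losses.append(class_weights[k] * squared_error)
    return sum(losses) / len(losses)


# Hypothetical imbalanced labelled set: 8 normal (class 0), 2 COVID-19 positive (class 1).
labelled = [0] * 8 + [1] * 2
weights = inverse_frequency_weights(labelled, num_classes=2)
```

With this hypothetical 8-to-2 split, the minority class receives four times the weight of the majority class, so an error on a positive case contributes proportionally more to both loss terms.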
Pages: 15