Cancer diagnosis using generative adversarial networks based on deep learning from imbalanced data

Cited by: 41
Authors
Xiao, Yawen [1]
Wu, Jun [2]
Lin, Zongli [3]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Automat, Shanghai 200240, Peoples R China
[2] East China Normal Univ, Ctr Bioinformat & Computat Biol, Shanghai 200241, Peoples R China
[3] Univ Virginia, Dept Elect & Comp Engn, Charlottesville, VA 22904 USA
Funding
National Natural Science Foundation of China
Keywords
Cancer diagnosis; Deep learning; Gene expression data; Imbalanced data; Wasserstein generative adversarial networks; CLASSIFICATION; PREDICTION; BREAST;
DOI
10.1016/j.compbiomed.2021.104540
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Background and objective: Cancer is a serious global disease due to its high mortality, and the key to effective treatment is accurate diagnosis. However, limited by sampling difficulty and actual sample size in clinical practice, data imbalance is a common problem in cancer diagnosis, while most conventional classification methods assume balanced data distribution. Therefore, addressing the imbalanced learning problem to improve the predictive performance of cancer diagnosis is significant.
Methods: In the study, we dissect the data imbalance prevalent in cancer gene expression data and present an improved deep learning based Wasserstein generative adversarial network (WGAN) model, which provides a reliable training progress indicator and deeply explores the characteristics of data. The WGAN generates new samples from the minority class and solves the imbalance problem at the data level.
Results: We analyze three publicly available data sets on RNA-seq of three kinds of cancer using the proposed WGAN and compare the results with those from two commonly adopted sampling methods. According to the results, through addressing the data imbalance problem, the balanced data distribution and the expanding sample size increase the prediction accuracy in all three data sets.
Conclusions: Therefore, the proposed WGAN method is superior in solving the imbalanced learning problem of gene expression data, providing significantly better prediction performance in cancer diagnosis.
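The data-level balancing described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' model: `critic_loss` shows the standard Wasserstein critic objective, `balance_with_generator` tops up every class to the majority count, and `make_placeholder_generator` is a hypothetical stand-in that merely jitters real samples where a trained WGAN generator would be plugged in. All function names and the noise scale are illustrative.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Sample = List[float]


def critic_loss(real_scores: Sequence[float], fake_scores: Sequence[float]) -> float:
    """Wasserstein critic objective to minimize: mean f(fake) - mean f(real)."""
    mean_fake = sum(fake_scores) / len(fake_scores)
    mean_real = sum(real_scores) / len(real_scores)
    return mean_fake - mean_real


def make_placeholder_generator(
    X: Sequence[Sequence[float]], y: Sequence[int], rng: random.Random
) -> Callable[[int], Sample]:
    """Hypothetical stand-in for a trained generator (this is NOT a WGAN).

    It picks a real sample of the requested class and adds small Gaussian
    noise, only so that the balancing step below has something to call.
    """
    by_class: Dict[int, List[Sample]] = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(list(features))

    def generate(label: int) -> Sample:
        base = rng.choice(by_class[label])
        return [value + rng.gauss(0.0, 0.1) for value in base]

    return generate


def balance_with_generator(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    generate: Callable[[int], Sample],
) -> Tuple[List[Sample], List[int]]:
    """Append synthetic samples until every class matches the largest class."""
    counts = Counter(y)
    target = max(counts.values())
    X_out = [list(row) for row in X]
    y_out = list(y)
    for label, n in counts.items():
        for _ in range(target - n):
            X_out.append(generate(label))
            y_out.append(label)
    return X_out, y_out
```

In practice the `generate` callable would wrap a generator network trained against a critic on minority-class expression profiles; the balancing loop itself stays the same regardless of how the synthetic samples are produced.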
Pages: 10
Related papers
50 items in total
  • [21] A comparative study of handling imbalanced data using generative adversarial networks for machine learning based software fault prediction
    Phuong, Ha Thi Minh
    Nguyet, Pham Vu Thu
    Minh, Nguyen Huu Nhat
    Hanh, Le Thi My
    Binh, Nguyen Thanh
    APPLIED INTELLIGENCE, 2025, 55 (04)
  • [22] A joint learning method for incomplete and imbalanced data in electronic health record based on generative adversarial networks
    Weng, Xutao
    Song, Hong
    Lin, Yucong
    Wu, You
    Zhang, Xi
    Liu, Bowen
    Yang, Jian
    COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 168
  • [23] Fault Diagnosis of Harmonic Drive With Imbalanced Data Using Generative Adversarial Network
    Yang, Guo
    Zhong, Yong
    Yang, Lie
    Tao, Hui
    Li, Jianying
    Du, Ruxu
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2021, 70
  • [24] Generalization of Deep Neural Networks for Imbalanced Fault Classification of Machinery Using Generative Adversarial Networks
    Wang, Jinrui
    Li, Shunming
    Han, Baokun
    An, Zenghui
    Bao, Huaiqian
    Ji, Shanshan
    IEEE ACCESS, 2019, 7 : 111168 - 111180
  • [25] Emotion Recognition Based on Handwriting Using Generative Adversarial Networks and Deep Learning
    Qi, Hengnian
    Zeng, Gang
    Jia, Keke
    Zhang, Chu
    Wu, Xiaoping
    Li, Mengxia
    Lang, Qing
    Wang, Lingxuan
    IET BIOMETRICS, 2024, 2024
  • [26] Fault diagnosis method based on triple generative adversarial nets for imbalanced data
    Su, Changwei
    Wang, Xueren
    Liu, Ruijie
    Guo, Ziyi
    Sang, Shengtian
    Yu, Shuang
    Zhang, Haifeng
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2023, 34 (03)
  • [27] Imbalanced Fault Diagnosis of Rolling Bearing Using Data Synthesis Based on Multi-Resolution Fusion Generative Adversarial Networks
    Hao, Chuanzhu
    Du, Junrong
    Liang, Haoran
    MACHINES, 2022, 10 (05)
  • [28] Data synthesis using dual discriminator conditional generative adversarial networks for imbalanced fault diagnosis of rolling bearings
    Zheng, Taisheng
    Song, Lei
    Wang, Jianxing
    Teng, Wei
    Xu, Xiaoli
    Ma, Chao
    MEASUREMENT, 2020, 158
  • [29] Crack Detection Based on Generative Adversarial Networks and Deep Learning
    Chen, Gongfa
    Teng, Shuai
    Lin, Mansheng
    Yang, Xiaomei
    Sun, Xiaoli
    KSCE JOURNAL OF CIVIL ENGINEERING, 2022, 26 (04) : 1803 - 1816
  • [30] Synthetic Boosted Resampling Using Deep Generative Adversarial Networks: A Novel Approach to Improve Cancer Prediction from Imbalanced Datasets
    Gurcan, Fatih
    Soylu, Ahmet
    CANCERS, 2024, 16 (23)