Predicting factors for survival of breast cancer patients using machine learning techniques

Cited by: 129
Authors
Ganggayah, Mogana Darshini [1]
Taib, Nur Aishah [2]
Har, Yip Cheng [2]
Lio, Pietro [3]
Dhillon, Sarinder Kaur [1]
Affiliations
[1] Univ Malaya, Inst Biol Sci, Fac Sci, Data Sci & Bioinformat Lab, Kuala Lumpur 50603, Malaysia
[2] Univ Malaya, Dept Surg, Fac Med, Kuala Lumpur 50603, Malaysia
[3] Univ Cambridge, Dept Comp Sci & Technol, 15 JJ Thomson Ave, Cambridge CB3 0FD, England
Keywords
Data science; Machine learning; Factors influencing survival of breast cancer; Random forest; Decision tree; RANDOM FOREST; LOGISTIC-REGRESSION; NODE DISSECTION; TUMOR SIZE; TREE; CLASSIFICATION; DISEASE; MODEL; NUMBER; TRENDS
DOI
10.1186/s12911-019-0801-4
Chinese Library Classification number
R-058
Abstract
Background: Breast cancer is one of the most common diseases in women worldwide. Many studies have been conducted to predict survival indicators; however, most of these analyses were performed using basic statistical methods. As an alternative, this study used machine learning techniques to build models for detecting and visualising significant prognostic indicators of breast cancer survival rate.

Methods: A large hospital-based breast cancer dataset retrieved from the University Malaya Medical Centre, Kuala Lumpur, Malaysia (n=8066), with diagnosis information between 1993 and 2016, was used in this study. The dataset contained 23 predictor variables and one dependent variable, which referred to the survival status of the patients (alive or dead). To determine the significant prognostic factors of breast cancer survival rate, prediction models were built using decision tree, random forest, neural networks, extreme boost, logistic regression, and support vector machine. Next, the dataset was clustered based on the receptor status of breast cancer patients, identified via immunohistochemistry, to perform advanced modelling using random forest. Subsequently, the important variables were ranked via variable selection methods in random forest. Finally, decision trees were built and validation was performed using survival analysis.

Results: In terms of both model accuracy and calibration measure, all algorithms produced close outcomes, with the lowest accuracy obtained from decision tree (79.8%) and the highest from random forest (82.7%). The important variables identified in this study were cancer stage classification, tumour size, number of total axillary lymph nodes removed, number of positive lymph nodes, types of primary treatment, and methods of diagnosis.

Conclusion: The various machine learning algorithms used in this study yielded close accuracy; hence, these methods could be used as alternative predictive tools in breast cancer survival studies, particularly in the Asian region. The important prognostic factors influencing the survival rate of breast cancer identified in this study, which were validated by survival curves, are useful and could be translated into decision support tools in the medical domain.
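
The abstract states that the prognostic factors were validated by survival curves but does not name the estimator. As a minimal sketch, assuming a Kaplan-Meier estimate (an assumption, not confirmed by the abstract), a survival curve for one patient group could be computed as below. The function name `kaplan_meier` and the toy follow-up data are hypothetical and are not taken from the study.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct death time.

    durations: follow-up time per patient (any consistent unit)
    events:    1 if death was observed at that time, 0 if censored
    Assumption: Kaplan-Meier is used here for illustration only; the
    abstract says "survival analysis" without naming an estimator.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")
    deaths = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # deaths plus censored patients per time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone who died or was censored at t leaves the risk set.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data (months of follow-up), not from the study.
early_stage = kaplan_meier([12, 24, 36, 48, 60, 60], [0, 1, 0, 0, 1, 0])
late_stage = kaplan_meier([6, 12, 18, 24, 30, 60], [1, 1, 1, 0, 1, 0])

for label, curve in (("early stage", early_stage), ("late stage", late_stage)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Comparing curves computed this way for groups defined by one variable (for example, cancer stage) is one way to check whether a variable ranked as important by a model also separates patients in terms of observed survival.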
Pages: 17
Related papers
50 in total
  • [1] Predicting factors for survival of breast cancer patients using machine learning techniques
    Mogana Darshini Ganggayah
    Nur Aishah Taib
    Yip Cheng Har
    Pietro Lio
    Sarinder Kaur Dhillon
    BMC Medical Informatics and Decision Making, 19
  • [2] Predicting factors for psychological distress of breast cancer survivors using machine learning techniques
    Park, H.
    Bae, S. H.
    Kim, H. J.
    ANNALS OF ONCOLOGY, 2024, 35 : S1602 - S1602
  • [3] Predicting Breast Cancer Recurrence Using Machine Learning Techniques: A Systematic Review
    Abreu, Pedro Henriques
    Santos, Miriam Seoane
    Abreu, Miguel Henriques
    Andrade, Bruno
    Silva, Daniel Castro
    ACM COMPUTING SURVEYS, 2016, 49 (03)
  • [4] Predicting diagnosis and survival of bone metastasis in breast cancer using machine learning
    Zhong, Xugang
    Lin, Yanze
    Zhang, Wei
    Bi, Qing
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [5] Predicting diagnosis and survival of bone metastasis in breast cancer using machine learning
    Xugang Zhong
    Yanze Lin
    Wei Zhang
    Qing Bi
    Scientific Reports, 13
  • [6] Application of machine learning techniques for predicting survival in ovarian cancer
    Azar, Amir Sorayaie
    Rikan, Samin Babaei
    Naemi, Amin
    Mohasefi, Jamshid Bagherzadeh
    Pirnejad, Habibollah
    Mohasefi, Matin Bagherzadeh
    Wiil, Uffe Kock
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2022, 22 (01)
  • [7] Application of machine learning techniques for predicting survival in ovarian cancer
    Amir Sorayaie Azar
    Samin Babaei Rikan
    Amin Naemi
    Jamshid Bagherzadeh Mohasefi
    Habibollah Pirnejad
    Matin Bagherzadeh Mohasefi
    Uffe Kock Wiil
    BMC Medical Informatics and Decision Making, 22
  • [8] Survival of patients with pancreatic cancer predicted using machine learning techniques
    Hayward, J.
    Alvarez, S.
    Ruiz, C.
    Tseng, J.
    Sullivan, M.
    Whalen, G. F.
    ANNALS OF SURGICAL ONCOLOGY, 2007, 14 (02) : 114 - 114
  • [9] Predicting Breast Cancer Leveraging Supervised Machine Learning Techniques
    Aamir, Sanam
    Rahim, Aqsa
    Aamir, Zain
    Abbasi, Saadullah Farooq
    Khan, Muhammad Shahbaz
    Alhaisoni, Majed
    Khan, Muhammad Attique
    Khan, Khyber
    Ahmad, Jawad
    COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE, 2022, 2022
  • [10] Survival analysis of breast cancer patients using machine learning models
    Evangeline, I. Keren
    Kirubha, S. P. Angeline
    Precious, J. Glory
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (20) : 30909 - 30928