Decision tree based predictive models for breast cancer survivability on imbalanced data

Authors
Liu Ya-Qin [1]
Wang Cheng [1]
Zhang Lu [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Biomed Engn, Sch Basic Med, Shanghai 200030, Peoples R China
Keywords
imbalanced data; decision tree; predictive breast cancer survivability; 10-fold stratified cross-validation; bagging algorithm
DOI
Not available
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Predictive models for 5-year breast cancer survivability are proposed, built with decision trees on imbalanced data. After preprocessing of the SEER breast cancer datasets, the class distribution is clearly imbalanced. Under-sampling is applied to offset the loss of model performance caused by the imbalance. Model performance is evaluated by the area under the ROC curve (AUC), accuracy, specificity, and sensitivity, using 10-fold stratified cross-validation. The models perform best when the class distribution is approximately equal. A bagging algorithm is then used to build an ensemble decision tree model for predicting breast cancer survivability.
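The under-sampling and stratified evaluation steps named in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names (`undersample`, `stratified_folds`, `confusion_metrics`), the toy labels, and the 1:1 target ratio are all assumptions made for illustration, and the decision tree and bagging steps are left out.

```python
import random
from collections import defaultdict


def undersample(labels, majority_label, ratio=1.0, seed=0):
    """Keep every minority row and a random subset of majority rows so that
    the majority count is about ratio * minority count. Returns row indices."""
    rng = random.Random(seed)
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    keep = min(len(majority), round(ratio * len(minority)))
    return sorted(minority + rng.sample(majority, keep))


def stratified_folds(labels, k=10, seed=0):
    """Split indices into k folds while preserving the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets an equal share.
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    return folds


def confusion_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true) if y_true else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Toy labels (made up): 900 rows of class 1 and 100 rows of class 0.
labels = [1] * 900 + [0] * 100
kept = undersample(labels, majority_label=1, ratio=1.0)
balanced = [labels[i] for i in kept]
folds = stratified_folds(balanced, k=10)
print(len(balanced), [len(f) for f in folds])
```

In a full pipeline, each fold would serve once as the test set while a classifier is trained on the other nine, and the per-fold metrics would be averaged. The `ratio` parameter allows trying several class distributions, which is how a claim such as "best when approximately equal" could be checked.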
Pages: 312-315
Page count: 4