Applications of Machine Learning Techniques to Predict Diagnostic Breast Cancer

Cited by: 0
Authors
Chaurasia V. [1]
Pal S. [1]
Affiliation
[1] Department of Computer Applications, VBS Purvanchal University, Jaunpur
Keywords
Classification; Ensemble; k-Nearest neighbors; Linear regression; Machine learning; Multilayer perceptron; Stack; Support vector machine
DOI
10.1007/s42979-020-00296-8
Abstract
This article compares six machine learning (ML) algorithms, Classification and Regression Tree (CART), Support Vector Machine (SVM), Naïve Bayes (NB), K-Nearest Neighbors (KNN), Linear Regression (LR) and Multilayer Perceptron (MLP), on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset in terms of classification test accuracy, accuracy on standardized data, and runtime. The main objective of the study is to improve prediction accuracy using a new statistical method of feature selection. The dataset has 32 features, which are reduced using a statistical technique (the mode), and the same measurements are repeated on the reduced data for comparison. On the reduced subset (12 features), six ensemble models are applied: AdaBoost (AB), Gradient Boosting Classifier (GBC), Random Forest (RF), Extra Trees (ET), Bagging and Extreme Gradient Boosting (XGB), to reduce the probability of misclassification by any single induced model. A stacking classifier (Voting Classifier) is also applied to the base learners Logistic Regression (LR), Decision Tree (DT), Support Vector Classifier (SVC), K-Nearest Neighbors (KNN), Random Forest (RF) and Naïve Bayes (NB) to determine the accuracy obtained at the meta level. For all ML algorithms, the dataset is split so that 80% is used for training and 20% for testing. The classifiers are tuned with manually assigned hyper-parameters. At the different stages of classification, all ML algorithms perform well, with test accuracy exceeding 90%, especially when applied to the reduced data subset. © 2020, Springer Nature Singapore Pte Ltd.
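The evaluation protocol described in the abstract (an 80/20 split, accuracy on raw versus standardized data, and a voting classifier over base learners) can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes scikit-learn is installed and uses its bundled copy of the WDBC data, which exposes 30 numeric features plus the diagnosis target. The random seed, the stratified split, the default hyper-parameters and the hard-voting rule are all assumptions, and the paper's mode-based feature selection is not reproduced.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# scikit-learn's bundled WDBC copy: 569 samples, 30 numeric features.
X, y = load_breast_cancer(return_X_y=True)

# 80% training / 20% test, as in the abstract. The seed and the
# stratification are assumptions made for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)


def base_learners():
    """Fresh instances of the six base learners named in the abstract.

    Hyper-parameters are library defaults (assumption); the paper
    assigns its own values manually.
    """
    return [
        ("lr", LogisticRegression(max_iter=5000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("svc", SVC()),
        ("knn", KNeighborsClassifier()),
        ("rf", RandomForestClassifier(random_state=0)),
        ("nb", GaussianNB()),
    ]


accuracy = {}

# Test accuracy on the raw features.
for name, model in base_learners():
    accuracy[name + "_raw"] = model.fit(X_train, y_train).score(X_test, y_test)

# Test accuracy after standardization. The scaler sits inside a pipeline,
# so it is fitted on the training split only.
for name, model in base_learners():
    scaled = make_pipeline(StandardScaler(), model)
    accuracy[name + "_std"] = scaled.fit(X_train, y_train).score(X_test, y_test)

# Meta level: a hard (majority) vote over the six base learners.
voter = make_pipeline(
    StandardScaler(),
    VotingClassifier(estimators=base_learners(), voting="hard"),
)
accuracy["voting"] = voter.fit(X_train, y_train).score(X_test, y_test)

for name, acc in sorted(accuracy.items()):
    print(f"{name:10s} {acc:.3f}")
```

Comparing the `_raw` and `_std` entries shows how much each learner depends on feature scaling. Distance-based and margin-based models such as KNN and SVC typically benefit the most, while tree-based models are largely unaffected.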