A Comparative Analysis of Machine Learning Algorithms for Breast Cancer Detection and Identification of Key Predictive Features

Cited by: 1
Authors
Kumar, Amit [1]
Saini, Rashmi [2]
Kumar, Rajeev [3]
Affiliations
[1] Uttarakhand Tech Univ, Dept CSE, Dehra Dun 248007, India
[2] G B Pant Inst Engn & Technol, Dept CSE, Pauri Garhwal 246194, India
[3] Teerthanker Mahaveer Univ, Dept CSE, Moradabad 244001, India
Keywords
benign; malignant; supervised machine learning; feature selection; feature importance; support vector machine; computer-aided diagnosis; classification; challenges; lesions
DOI
10.18280/ts.410110
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cancer, a disease with numerous subtypes, poses a deadly threat to human life, and successful clinical treatment relies heavily on early detection and appropriate treatment planning. Classifying cancer patients into low-risk or high-risk subgroups is therefore critical. Consequently, research teams across the biomedical and bioinformatics fields have explored the use of Machine Learning (ML) in this domain, where the capability of ML algorithms to discern significant features in complex datasets underscores their value. In the current study, we propose a framework that detects breast cancer (through benign and malignant categorization) with high accuracy using advanced ML techniques. The framework uses the Wisconsin Breast Cancer (Diagnostic) dataset. Five supervised ML techniques, namely Decision Tree, Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Artificial Neural Network (ANN), are trained for classification. Of the 569 samples, 70% are allocated for training and the remaining 30% for testing. The ML techniques are evaluated comprehensively using an array of metrics: precision, recall, specificity, F1-score, classification accuracy, ROC curve, training time, and feature utilization. Additionally, feature importance is computed for each classifier. The results reveal that the SVM achieves the highest accuracy, 97.66%, with an F1-score of 0.98 for the benign class and 0.97 for the malignant class. Conversely, the Decision Tree registers the lowest accuracy (94.55%), with an F1-score of 0.95 for the benign class and 0.91 for the malignant class. Accuracy scores for RF, XGBoost, and ANN stand at 95.32%, 95.91%, and 97.07%, with corresponding F1-scores of 0.96, 0.97, and 0.98 for the benign class and 0.94, 0.95, and 0.96 for the malignant class, respectively. Interestingly, RF and XGBoost achieved nearly equivalent accuracy. In terms of the area under the ROC curve, the SVM outperformed the other ML classifiers and also reported the shortest training time. Conversely, the ANN reported the longest training time.
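The metrics named in the abstract (precision, recall, specificity, F1-score, and classification accuracy) can all be derived from a binary confusion matrix. The following is a minimal sketch in plain Python, not the authors' code. The names ConfusionCounts and evaluate are hypothetical, malignant is assumed to be the positive class, the 30% hold-out is assumed to round to 171 samples, and the confusion counts are invented for illustration rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (assumption: malignant = positive class)."""
    tp: int  # malignant samples predicted as malignant
    fn: int  # malignant samples predicted as benign
    tn: int  # benign samples predicted as benign
    fp: int  # benign samples predicted as malignant


def _ratio(numerator: float, denominator: float) -> float:
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(c: ConfusionCounts) -> dict:
    """Compute the scalar metrics listed in the abstract for the positive class."""
    total = c.tp + c.fn + c.tn + c.fp
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        # F1 is the harmonic mean of precision and recall.
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


# Assumption: a 30% hold-out of the 569 samples rounds to 171 test samples.
TEST_SIZE = round(0.30 * 569)

# Hypothetical counts that sum to TEST_SIZE; not reported in the source.
example = ConfusionCounts(tp=61, fn=2, tn=106, fp=2)

if __name__ == "__main__":
    for name, value in evaluate(example).items():
        print(f"{name}: {value:.4f}")
```

Under the assumed test size of 171, an accuracy of 97.66% corresponds to 167 correctly classified samples, which is the total the hypothetical counts above were chosen to reproduce. The per-class split of errors in the actual study is not given in the abstract.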
Pages: 127-140
Page count: 14