Machine Learning Model for Breast Cancer Data Analysis Using Triplet Feature Selection Algorithm

Cited by: 7
Authors
Dhivya, P. [1]
Bazilabanu, A. [1]
Ponniah, Thirumalaikolundusubramanian [2]
Affiliations
[1] Bannari Amman Inst Technol, Dept Comp Sci & Engn, Erode, India
[2] Trichy SRM Med Coll & Res Ctr, Dept Med, Trichy, India
Keywords
Accuracy; benign; correlation; logistic regression; malignant; triplet feature selection; DIAGNOSIS;
DOI
10.1080/03772063.2021.1963861
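Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract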
Machine learning techniques can support clinical investigation in breast cancer diagnosis. Researchers have investigated various machine learning algorithms for diagnosing the disease, such as Support Vector Machine, Naive Bayes, Logistic Regression (LR), Random Forest, Decision Tree and K-Nearest Neighbor. Early detection of breast cancer cells from the features is essential. Feature selection is the process of reducing the number of input features to improve model performance. This research aims to increase accuracy, sensitivity and specificity, and to reduce the False Positive Rate (FPR) and False Negative Rate (FNR), through feature selection. The proposed technique comprises two phases: feature grouping and feature selection. In the first phase, feature grouping uses Pearson correlation to measure the correlation among the features and groups them into high-, medium- and low-level rankings. In the second phase, the proposed Triplet Feature Selection (TFS) method is applied to avoid collinearity among the features: features are selected based on the correlation differences in each subset when the race condition is satisfied. Finally, the features in the triplet group are selected and an LR classifier is applied to diagnose the disease. The proposed classifier achieved an accuracy of 95.4%, an FPR of 1%, an FNR of 4%, a sensitivity of 97% and a specificity of 96% in distinguishing benign from malignant cases. The performance of the proposed framework, TFS feature selection with an LR classifier, was compared with existing feature selection methods and classifiers.
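The first phase and the reported evaluation measures can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the grouping rule, so the sketch assumes each feature is scored by its mean absolute Pearson correlation with the other features, and the thresholds `high=0.7` and `low=0.3` are hypothetical. The function names `pearson`, `group_features` and `diagnostic_metrics` are likewise introduced here for illustration. The TFS selection step itself is not sketched because the abstract does not define it.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (dev_x * dev_y)


def group_features(columns, high=0.7, low=0.3):
    """Bucket features into high/medium/low correlation levels.

    Assumption: each feature is scored by its mean absolute correlation
    with every other feature; the thresholds are hypothetical.
    """
    groups = {"high": [], "medium": [], "low": []}
    for name, values in columns.items():
        others = [v for other, v in columns.items() if other != name]
        if others:
            score = sum(abs(pearson(values, v)) for v in others) / len(others)
        else:
            score = 0.0
        if score >= high:
            groups["high"].append(name)
        elif score >= low:
            groups["medium"].append(name)
        else:
            groups["low"].append(name)
    return groups


def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "fpr": fp / (fp + tn),          # equals 1 - specificity
        "fnr": fn / (fn + tp),          # equals 1 - sensitivity
    }
```

For example, `diagnostic_metrics(tp=96, fp=1, tn=99, fn=4)` uses made-up counts (not data from the paper) to show how the five measures follow from a single confusion matrix.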
Pages: 1789-1799
Page count: 11