Machine Learning Model for Breast Cancer Data Analysis Using Triplet Feature Selection Algorithm

Cited by: 7
Authors
Dhivya, P. [1]
Bazilabanu, A. [1]
Ponniah, Thirumalaikolundusubramanian [2]
Affiliations
[1] Bannari Amman Inst Technol, Dept Comp Sci & Engn, Erode, India
[2] Trichy SRM Med Coll & Res Ctr, Dept Med, Trichy, India
Keywords
Accuracy; benign; correlation; logistic regression; malignant; triplet feature selection; DIAGNOSIS
DOI
10.1080/03772063.2021.1963861
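Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract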
Machine learning techniques can support clinical investigation in breast cancer diagnosis. Researchers have investigated various machine learning algorithms for this task, such as Support Vector Machine, Naive Bayes, Logistic Regression (LR), Random Forest, Decision Tree and K-Nearest Neighbor. Early detection of breast cancer cells from the features is essential. Feature selection is the process of reducing the number of input features to improve model performance. This research aims to increase accuracy, sensitivity and specificity, and to reduce the False Positive Rate (FPR) and False Negative Rate (FNR), through feature selection. The proposed technique comprises two phases: feature grouping and feature selection. In the first phase, feature grouping uses Pearson correlation to measure the correlation among the features and groups them into high-, medium- and low-level ranks. In the second phase, the proposed Triplet Feature Selection (TFS) method is applied to avoid collinearity among the features: features are selected based on the correlation differences in each subset when the race condition is satisfied. Finally, the features in the triplet group are selected and an LR classifier is applied to diagnose the disease. The proposed classifier achieved an accuracy of 95.4%, an FPR of 1%, an FNR of 4%, a sensitivity of 97% and a specificity of 96% in distinguishing benign from malignant cases. The performance of the proposed framework, TFS feature selection with the LR classifier, was compared with existing feature selection methods and classifiers.
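The first phase and the reported evaluation metrics can be illustrated with a minimal sketch. The abstract does not say how the correlation ranking is computed or where the group boundaries lie, so the scoring rule (mean absolute Pearson correlation with the other features), the thresholds `high=0.7` and `low=0.4`, and the function names below are all assumptions made for illustration. The TFS selection rule itself is not described in enough detail to reproduce and is not attempted here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for a constant feature")
    return cov / (std_x * std_y)


def group_features(columns, high=0.7, low=0.4):
    """Assign each feature to a 'high', 'medium' or 'low' group.

    Assumption: a feature's score is its mean absolute correlation with
    every other feature; the thresholds are hypothetical, not from the paper.
    """
    groups = {"high": [], "medium": [], "low": []}
    names = list(columns)
    for name in names:
        others = [abs(pearson(columns[name], columns[other]))
                  for other in names if other != name]
        score = sum(others) / len(others)
        if score >= high:
            groups["high"].append(name)
        elif score >= low:
            groups["medium"].append(name)
        else:
            groups["low"].append(name)
    return groups


def confusion_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (malignant = positive)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "fpr": fp / (fp + tn),           # equals 1 - specificity
        "fnr": fn / (fn + tp),           # equals 1 - sensitivity
    }


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = {
        "a": [1, 2, 3, 4, 5],
        "b": [2, 4, 6, 8, 10],
        "c": [1, -1, 0, -1, 1],
    }
    print(group_features(toy))
    print(confusion_metrics(tp=90, fp=5, tn=95, fn=10))
```

Grouping on correlation among the features, rather than correlation with the class label, follows the wording of the abstract, but either reading is possible. Swapping the scoring rule only requires changing the body of `group_features`.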
Pages: 1789-1799
Number of pages: 11