A novel K-nearest neighbor classifier for lung cancer disease diagnosis

Cited by: 0
Authors
Sachdeva, Ravi Kumar [1]
Bathla, Priyanka [2]
Rani, Pooja [3]
Lamba, Rohit [4]
Ghantasala, G. S. Pradeep [5]
Nassar, Ibrahim F. [6]
Affiliations
[1] Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, Rajpura, India
[2] Chandigarh University, Punjab, Gharuan, Mohali, India
[3] MMICTBM, Maharishi Markandeshwar (Deemed to be University), Haryana, Mullana, Ambala, India
[4] Department of Electronics and Communication Engineering, MMEC, Maharishi Markandeshwar (Deemed to be University), Haryana, Mullana, Ambala, India
[5] Department of Computer Science and Engineering, Alliance College of Engineering and Design, Alliance University, Bengaluru, India
[6] Faculty of Specific Education, Ain Shams University, 365 Ramsis Street, Abassia, Cairo, Egypt
Keywords
K-nearest neighbor - Logistic regression - Lung cancer - Machine learning - Naive Bayes - Nearest neighbour - Pearson correlation - Pearson correlation weighted KNN - Random forest - Support vector machine
DOI
10.1007/s00521-024-10235-w
Abstract
Lung cancer is one of the world's deadliest diseases, and machine learning techniques can support its diagnosis from a small number of features. To build a systematic diagnostic approach based on readily observable signs and historical medical data, without requiring CT scan images, the authors evaluated several classifiers on a dataset available on Kaggle: support vector machine (SVM), logistic regression (LR), Naïve Bayes (NB), random forest (RF), and K-nearest neighbor (KNN). They also propose a novel classifier, Pearson correlation weighted KNN (PCWKNN), a modified KNN that uses Pearson correlation coefficient values to determine the weights in a weighted KNN. All classifiers were evaluated with the hold-out validation method. SVM, LR, and RF each reached 96.77% accuracy, NB reached 95.16%, and KNN reached 91.93%. PCWKNN outperformed all of them with an accuracy of 98.39%. To assess generalization, the researchers also applied PCWKNN to a second, larger lung cancer dataset and then to other diseases, including a brain stroke dataset. The encouraging results point to PCWKNN's robustness and adaptability and suggest that it is viable for real-world use. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
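The abstract does not spell out how the Pearson coefficients enter the weighted KNN. As a minimal sketch only, the code below assumes one plausible reading: each feature is weighted by the absolute Pearson correlation between that feature and the class label, and the weights scale a Euclidean distance before a plain majority vote. The function names, the toy data, and the choice of k are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def feature_weights(X, y):
    """Assumed weighting: |r| between each feature column and the label."""
    return [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]


def predict(X_train, y_train, weights, query, k=3):
    """Majority vote among the k nearest rows under a weighted Euclidean distance."""
    def dist(row):
        return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, row, query)))

    nearest = sorted(range(len(X_train)), key=lambda i: dist(X_train[i]))[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]


# Toy data invented for illustration: feature 0 tracks the label, feature 1 does not.
X_train = [[0, 5], [0, 1], [0, 3], [1, 5], [1, 1], [1, 3]]
y_train = [0, 0, 0, 1, 1, 1]

weights = feature_weights(X_train, y_train)
prediction = predict(X_train, y_train, weights, [1, 1], k=3)
print(weights, prediction)
```

Under this assumed scheme, a feature with no linear relationship to the label gets a weight near zero and stops influencing which neighbors are selected. Whether the paper weights features, neighbors, or votes this way would have to be checked against the full text.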
Pages: 22403 - 22416
Number of pages: 13
Related papers
50 in total
  • [12] Exploratory study on classification of lung cancer subtypes through a combined K-nearest neighbor classifier in breathomics
    Wang, Chunyan
    Long, Yijing
    Li, Wenwen
    Dai, Wei
    Xie, Shaohua
    Liu, Yuanling
    Zhang, Yinchenxi
    Liu, Mingxin
    Tian, Yonghui
    Li, Qiang
    Duan, Yixiang
    SCIENTIFIC REPORTS, 2020, 10 (01)
  • [13] Evaluation of k-Nearest Neighbor classifier performance for direct marketing
    Govindarajan, M.
    Chandrasekaran, R. M.
    EXPERT SYSTEMS WITH APPLICATIONS, 2010, 37 (01) : 253 - 258
  • [14] A Fast k-Nearest Neighbor Classifier Using Unsupervised Clustering
    Vajda, Szilard
    Santosh, K. C.
    RECENT TRENDS IN IMAGE PROCESSING AND PATTERN RECOGNITION (RTIP2R 2016), 2017, 709 : 185 - 193
  • [15] Fuzzy parameterized fuzzy soft k-nearest neighbor classifier
    Memis, S.
    Enginoglu, S.
    Erkan, U.
    NEUROCOMPUTING, 2022, 500 : 351 - 378
  • [16] Consistency of the k-Nearest Neighbor Classifier for Spatially Dependent Data
    Younso, Ahmad
    Kanaya, Ziad
    Azhari, Nour
    COMMUNICATIONS IN MATHEMATICS AND STATISTICS, 2023, 11 (03) : 503 - 518
  • [17] A parameter independent fuzzy weighted k-Nearest neighbor classifier
    Biswas, Nimagna
    Chakraborty, Saurajit
    Mullick, Sankha Subhra
    Das, Swagatam
    PATTERN RECOGNITION LETTERS, 2018, 101 : 80 - 87
  • [18] An Algorithm of Incremental Bayesian Classifier Based on K-Nearest Neighbor
    Wang, Dong
    Xiong, Shi-huan
    MEMS, NANO AND SMART SYSTEMS, PTS 1-6, 2012, 403-408 : 1455 - 1459
  • [19] A fuzzy K-nearest neighbor classifier to deal with imperfect data
    Jose M. Cadenas
    M. Carmen Garrido
    Raquel Martínez
    Enrique Muñoz
    Piero P. Bonissone
    Soft Computing, 2018, 22 : 3313 - 3330
  • [20] K-Nearest Neighbor Classifier for Uncertain Data in Feature Space
    Lim, Sung-Yeon
    Ko, Changwan
    Jeong, Young-Seon
    Baek, Jaeseung
    INDUSTRIAL ENGINEERING AND MANAGEMENT SYSTEMS, 2023, 22 (04) : 414 - 421