A Novel Aggregated Multiple Imputation Approach for Enhanced Survival Prediction and Classification on Breast Cancer and Lung Cancer Data

Citations: 0
Authors
Deepa, P. [1]
Gunavathi, C. [2]
Affiliations
[1] Vellore Inst Technol, Sch Comp Sci Engn & Informat Syst, Vellore 632014, India
[2] Vellore Inst Technol, Sch Comp Sci & Engn, Vellore 632014, India
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Imputation; Cancer; Tumors; Data models; Lung cancer; Breast cancer; Predictive models; Accuracy; Prognostics and health management; Lymph nodes; Classification; missing data; multiple imputation; survival analysis;
DOI
10.1109/ACCESS.2024.3516837
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Survival analysis estimates the time until an event of interest occurs and is widely used as a prognostic tool in healthcare, especially for cancer. Healthcare data commonly suffer from missing values, and survival data are no exception. Data imputation is one way of handling missing data. In this paper we propose an Aggregated Multiple Imputation (AMI) technique, which imputes missing data using three base imputation techniques: mean imputation, K-nearest neighbour (kNN) imputation, and iterative imputation. Applying each base technique generates a separate imputed dataset, and these datasets are then combined using a weighted average. By drawing on the advantages of each method, AMI produces imputed values that are more accurate and dependable, reducing bias and improving precision. Breast cancer and lung cancer data from the Surveillance, Epidemiology, and End Results (SEER) program are used to validate the proposed technique. The results show that data imputed with AMI improve the performance of various classifiers and survival prediction models in predicting the overall survival of cancer patients, compared with data imputed using a single imputation method. For breast cancer data, the highest accuracy achieved is 91% and the lowest is 76%; for lung cancer data, the highest is 87% and the lowest is 72%.
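The weighted-average aggregation described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `aggregated_impute`, the weights, the neighbour count, the toy matrix, and the use of scikit-learn's `SimpleImputer`, `KNNImputer`, and `IterativeImputer` as the three base imputers are all choices made here, since the abstract does not say how the weights are set or which software was used.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (required to import IterativeImputer)
from sklearn.impute import IterativeImputer, KNNImputer, SimpleImputer


def aggregated_impute(X, weights=(1 / 3, 1 / 3, 1 / 3), n_neighbors=2, random_state=0):
    """Blend mean, kNN, and iterative imputations with a weighted average.

    `weights` is a hypothetical parameter; the values are normalised to sum to 1.
    Assumes every column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()

    imputers = [
        SimpleImputer(strategy="mean"),
        KNNImputer(n_neighbors=n_neighbors),
        IterativeImputer(max_iter=10, random_state=random_state),
    ]
    # One completed copy of the dataset per base imputation technique.
    completed = [imputer.fit_transform(X) for imputer in imputers]

    # Cell-wise weighted average of the completed datasets.
    combined = sum(wi * Xi for wi, Xi in zip(w, completed))

    # Observed entries are kept exactly; only missing cells take the blend.
    return np.where(np.isnan(X), combined, X)


# Toy numeric matrix with missing entries (made-up values, not SEER data).
X_toy = np.array([
    [50.0, 2.0, 1.0],
    [62.0, np.nan, 3.0],
    [np.nan, 4.0, 2.0],
    [45.0, 1.0, np.nan],
    [70.0, 5.0, 4.0],
])
X_filled = aggregated_impute(X_toy, weights=(0.2, 0.3, 0.5))
print(X_filled)
```

In practice the weights would need to be chosen by some criterion, for example validation performance of a downstream classifier; the abstract does not specify one, so the values above are placeholders.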
Pages: 189102 - 189121
Page count: 20