Comparing Statistical and Machine Learning Imputation Techniques in Breast Cancer Classification

Cited by: 3
Authors
Chlioui, Imane [1]
Abnane, Ibtissam [1]
Idri, Ali [1,2]
Affiliations
[1] Mohammed V Univ Rabat, ENSIAS, Software Project Management Res Team, Rabat, Morocco
[2] Mohammed VI Polytech Univ, CSEHS MSDA, Ben Guerir, Morocco
Keywords
Missing data imputation; Data mining; Breast cancer; MISSING DATA; TUTORIAL; VALUES
DOI
10.1007/978-3-030-58811-3_5
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Missing data imputation is an important task when dealing with crucial data that cannot be discarded, such as medical data. This study evaluates and compares the impact of two statistical and two machine learning imputation techniques on the classification of breast cancer patients, using several evaluation metrics. Mean, Expectation-Maximization (EM), Support Vector Regression (SVR) and K-Nearest Neighbor (KNN) imputation were applied to the two Wisconsin datasets with 18% of the data missing completely at random (MCAR). We then empirically evaluated these four imputation techniques with five classifiers: decision tree (C4.5), Case-Based Reasoning (CBR), Random Forest (RF), Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP). In total, 1380 experiments were conducted, and the findings confirmed that classification using machine-learning-based imputation outperformed classification using statistical imputation. Moreover, our experiments showed that SVR was the best imputation method for breast cancer classification.
Pages: 61-76
Number of pages: 16
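The setup described in the abstract (inject MCAR missingness, impute, compare) can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic two-cluster data stands in for the Wisconsin datasets, only mean and KNN imputation are shown (EM and SVR are omitted for brevity), imputation error is measured instead of classifier performance, and all function names (`inject_mcar`, `mean_impute`, `knn_impute`, `rmse`) are hypothetical helpers introduced for illustration.

```python
import math
import random


def inject_mcar(rows, rate, rng):
    """Copy rows, setting each cell to None independently with probability `rate` (MCAR)."""
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def mean_impute(rows):
    """Replace each missing cell with the mean of the observed values in its column."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        # Assumption: a fully missing column falls back to 0.0.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def knn_impute(rows, k=3):
    """Replace each missing cell with the average of that column over the k nearest rows.

    Distance is the root mean squared difference over columns observed in both rows.
    """
    fallback = mean_impute(rows)
    result = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            if v is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    candidates.append((math.sqrt(sum(shared) / len(shared)), other[j]))
            candidates.sort(key=lambda t: t[0])
            nearest = [val for _, val in candidates[:k]]
            # Fall back to the column mean when no comparable neighbor exists.
            result[i][j] = sum(nearest) / len(nearest) if nearest else fallback[i][j]
    return result


def rmse(truth, imputed, masked):
    """Root mean squared error over the cells that were masked out."""
    errs = [(truth[i][j] - imputed[i][j]) ** 2
            for i, row in enumerate(masked) for j, v in enumerate(row) if v is None]
    return math.sqrt(sum(errs) / len(errs)) if errs else 0.0


# Synthetic stand-in data: two Gaussian clusters, 30 rows each, 5 features.
rng = random.Random(0)
complete = [[rng.gauss(c, 1.0) for _ in range(5)] for c in (0.0, 6.0) for _ in range(30)]
masked = inject_mcar(complete, 0.18, rng)  # 18% missing rate, as in the abstract

for name, impute in (("mean", mean_impute), ("knn", knn_impute)):
    print(name, round(rmse(complete, impute(masked), masked), 3))
```

In the study itself the imputed datasets are passed to classifiers and compared on classification metrics; the reconstruction error printed here is only a simpler proxy for seeing how the imputers differ.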