Imputation techniques on missing values in breast cancer treatment and fertility data

Cited by: 12
Authors
Wu, Xuetong [1]
Akbarzadeh Khorshidi, Hadi [1]
Aickelin, Uwe [1]
Edib, Zobaida [2]
Peate, Michelle [2]
Affiliations
[1] Univ Melbourne, Dept Comp & Informat Syst, Parkville, Vic, Australia
[2] Univ Melbourne, Dept Obstet & Gynaecol, Parkville, Vic, Australia
Keywords
Missing data; Imputation; Classification; Breast cancer; Post-treatment amenorrhoea; WOMEN
DOI
10.1007/s13755-019-0082-4
Chinese Library Classification number
R-058
Abstract
In the last few years, clinical decision support using data mining techniques has offered a more intelligent way to reduce decision errors. However, clinical datasets often suffer from high missingness, which adversely impacts the quality of modelling if handled improperly. Imputing missing values provides an opportunity to resolve the issue. Conventional approaches rely on simple statistical analysis, such as mean imputation, or on discarding cases with missing values; these have many limitations and thus degrade learning performance. This study examines a series of machine learning based imputation methods and suggests an efficient approach to preparing a good-quality breast cancer (BC) dataset for finding the relationship between BC treatment and chemotherapy-related amenorrhoea, where performance is evaluated by prediction accuracy. To this end, the reliability and robustness of six well-known imputation methods are evaluated. Our results show that imputation leads to a significant boost in classification performance compared to model prediction based on listwise deletion. Furthermore, the results reveal that most methods achieve strong robustness and discriminant power even when the dataset has a high missing rate (> 50%).
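The contrast the abstract draws between listwise deletion and imputation can be illustrated with a minimal sketch. This is not the authors' pipeline or any of the six methods they evaluate: it uses plain mean imputation, one of the conventional baselines the abstract mentions, on a made-up four-row table. The function names `listwise_delete` and `mean_impute` and the toy values are assumptions introduced only for illustration.

```python
import math

def listwise_delete(rows):
    """Keep only the rows that have no missing (NaN) entries."""
    return [r for r in rows if not any(math.isnan(v) for v in r)]

def mean_impute(rows):
    """Replace each NaN with the mean of the observed values in its column.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        means.append(sum(observed) / len(observed))
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(r)]
            for r in rows]

nan = float("nan")
# Hypothetical toy data: two features, half of the rows have a missing value.
data = [
    [1.0, 2.0],
    [nan, 4.0],
    [3.0, nan],
    [5.0, 6.0],
]

complete = listwise_delete(data)  # rows with any NaN are discarded
imputed = mean_impute(data)       # all rows kept, NaNs filled per column

print(len(complete), "rows survive listwise deletion")
print(len(imputed), "rows remain after mean imputation")
```

In this toy case deletion halves the sample, while imputation keeps every row for a downstream classifier. At the missing rates above 50% that the abstract reports, deletion would leave far fewer usable cases.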
Pages: 8