Canonical Correlation Analysis for Data Reduction in Data Mining Applied to Predictive Models for Breast Cancer Recurrence

Authors
Razavi, Amir Reza [1 ]
Gill, Hans [1 ]
Ahlfeldt, Hans [1 ]
Shahsavar, Nosrat [1 ]
Affiliations
[1] Linkoping Univ, Dept Biomed Engn, Univ Hosp, S-58185 Linkoping, Sweden
Keywords
Data Mining; Artificial Neural Network (ANN); Canonical Correlation Analysis (CCA); Dimension Reduction; Breast Cancer
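DOI
Not available
Chinese Library Classification number
R19 [Health care organization and services (health services administration)]
Subject classification code
Abstract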
Data mining methods can be used to extract specific medical knowledge, such as important predictors of breast cancer recurrence, from relevant data material. However, when the data contain a very large number of variables, the important variables must first be identified and selected. In this study we present a preprocessing method for selecting important variables in a dataset before building a predictive model. The dataset comprised data from 5787 female patients. To cover more predictors and obtain a better assessment of the outcomes, data were retrieved from three different registers: the regional breast cancer, tumour markers, and cause of death registers. After information about the selected predictors and outcomes had been retrieved from the registers, the raw data were cleaned by applying a set of logical rules. Domain experts then selected the predictors assumed to be important for recurrence of breast cancer. Next, Canonical Correlation Analysis (CCA) was applied as a dimension reduction technique that preserves the character of the original data. An Artificial Neural Network (ANN) was applied to the resulting dataset for two different analyses with the same settings. The performance of the predictive models was assessed by ten-fold cross validation. The results showed an increase in prediction accuracy and a reduction in the mean absolute error.
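The abstract does not give implementation details, so the following is only a minimal sketch of how CCA between a predictor block and an outcome block might be computed and used to rank predictors. Everything in it is an assumption made for illustration: the data are synthetic (not the register data), the names `cca`, `X`, `Y`, `loadings`, and `ranking` are hypothetical, and ranking predictors by their correlation with the first canonical variate is one common reading of CCA-based reduction rather than the authors' confirmed procedure.

```python
import numpy as np


def cca(X, Y):
    """Canonical correlations and weights for X (n x p) and Y (n x q).

    Assumes both centred matrices have full column rank.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centred blocks.
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    A = np.linalg.solve(Rx, U)     # weights mapping Xc to canonical variates
    B = np.linalg.solve(Ry, Vt.T)  # weights mapping Yc to canonical variates
    return np.clip(s, 0.0, 1.0), A, B


# Synthetic stand-in data (assumption): one latent factor z drives two of the
# five predictors and one of the two outcomes; the rest is pure noise.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
X = rng.normal(size=(n, 5))
X[:, 0] = z + 0.3 * rng.normal(size=n)
X[:, 1] = z + 0.5 * rng.normal(size=n)
Y = rng.normal(size=(n, 2))
Y[:, 0] = z + 0.5 * rng.normal(size=n)

rho, A, B = cca(X, Y)

# First pair of canonical variates.
u1 = (X - X.mean(axis=0)) @ A[:, 0]
v1 = (Y - Y.mean(axis=0)) @ B[:, 0]

# Canonical loadings: correlation of each original predictor with u1.
loadings = np.array([np.corrcoef(X[:, j], u1)[0, 1] for j in range(X.shape[1])])
ranking = np.argsort(-np.abs(loadings))  # most strongly related predictors first

print("canonical correlations:", rho)
print("predictor ranking by |loading|:", ranking)
```

In a pipeline like the one summarised above, the highest-ranked predictors (or the canonical variates themselves) would then be passed to the neural network, and the reduced and unreduced models compared under the same ten-fold cross validation.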
Pages: 175-180
Page count: 6