Feature Selection and Comparative Analysis of Breast Cancer Prediction Using Clinical Data and Histopathological Whole Slide Images

Authors
Mohammed, Sarfaraz Ahmed [1 ]
Abeysinghe, Senuka [2 ]
Ralescu, Anca [1 ]
Affiliations
[1] Univ Cincinnati, Dept Comp Sci, Cincinnati, OH 45221 USA
[2] Indian Hill High Sch, Ohios Coll, Credit Plus Program, Cincinnati, OH 45243 USA
Keywords
Breast cancer; Machine learning; Principal component analysis; Particle swarm optimization; Feature selection; Logistic regression; Naïve Bayes classification; k-NN; Support vector machines; Random forest; K-Means; Whole slide images; TCGA; Histopathology; Deep learning; Digital image analysis; Convolutional neural network; H&E-stained images; Nuclei segmentation
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Breast carcinoma is a common cancer among women, with invasive ductal carcinoma and lobular carcinoma being the two most frequent types. Early detection is critical to keep the disease from progressing. Diagnostic tests include mammography, ultrasound, MRI, and biopsy. Machine learning algorithms can play a key role in analyzing complex clinical datasets to predict disease outcomes. This study uses machine learning and deep learning techniques to analyze publicly available clinical and medical image data. For clinical data, Principal Component Analysis (PCA) and Particle Swarm Optimization (PSO) are applied to the Wisconsin Diagnostic Breast Cancer (WDBC) dataset for feature selection, and the performance of each approach in distinguishing between benign and malignant tumors is evaluated. The results show that the Random Forest (RF) classifier outperforms the other classification algorithms under both PSO and PCA feature selection, achieving predictive accuracies of 95.7% and 97.2%, respectively. The first part of the paper contains a comprehensive analysis of the two feature selection methods on clinical data to optimize predictive performance. The second part of the paper is concerned with image data. Although histopathological Whole Slide Imaging (WSI) has been validated for a variety of pathological applications over more than two decades, manual detection of cancerous tumors remains challenging and prone to human error. Given the potential of deep learning models to aid pathologists in detecting cancer subtypes, and the increasing ability of current image analysis techniques to identify underlying genomic data and cancer-causing mutations, the second half of the paper focuses on feature extraction using a deep convolutional neural network (U-Net) trained on WSIs from The Cancer Genome Atlas (TCGA) to accurately classify and extract relevant features.
The focus is on feature extraction, nuclei-based instance segmentation, H&E-stained image extraction, and quantifying intensity information for a given WSI to classify the disease type. A comprehensive analysis of feature selection methods is presented for both clinical and medical image data.
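As an illustration of the PSO-based feature selection mentioned in the abstract, the following is a minimal sketch of binary PSO over feature masks. It is not the authors' implementation: the abstract does not give the PSO configuration, so the inertia and acceleration constants, the swarm size, the subset-size penalty, the synthetic dataset (standing in for WDBC), and the nearest-centroid classifier (standing in for Random Forest) are all assumptions chosen to keep the example self-contained.

```python
import math
import random


def make_data(n=200, n_informative=3, n_noise=7, seed=0):
    """Synthetic two-class data (hypothetical stand-in for WDBC).
    Only the first n_informative features carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X, y, mask):
    """Hold-out accuracy of a nearest-centroid classifier that sees only the
    features whose mask bit is 1 (first half trains, second half tests)."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    half = len(X) // 2
    centroids = {}
    for label in (0, 1):
        rows = [X[i] for i in range(half) if y[i] == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for i in range(half, len(X)):
        dist = {
            label: sum((X[i][j] - c) ** 2 for j, c in zip(cols, centre))
            for label, centre in centroids.items()
        }
        correct += int(min(dist, key=dist.get) == y[i])
    return correct / (len(X) - half)


def binary_pso(fitness, n_features, n_particles=12, n_iter=30, seed=1):
    """Binary PSO: velocities are real-valued, positions are bit masks, and
    each bit is resampled with probability sigmoid(velocity)."""
    rng = random.Random(seed)
    w, c1, c2, v_max = 0.7, 1.5, 1.5, 4.0  # assumed hyperparameters
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][j]
                     + c1 * r1 * (pbest[i][j] - pos[i][j])
                     + c2 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-v_max, min(v_max, v))  # clamp velocity
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:  # update personal best
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:  # update global best
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


X, y = make_data()


def fitness(mask):
    """Accuracy minus a small assumed penalty per selected feature."""
    return centroid_accuracy(X, y, mask) - 0.01 * sum(mask)


mask, score = binary_pso(fitness, n_features=len(X[0]))
print("selected features:", [j for j, keep in enumerate(mask) if keep])
print("penalised fitness:", round(score, 3))
```

To approximate the setting described in the abstract, the fitness function could instead wrap cross-validated accuracy of a Random Forest on the WDBC features; the per-feature penalty is one common way to favour smaller subsets and is likewise an assumption here.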
Pages: 1494-1525
Page count: 32