Machine Learning and Feature Selection Methods for Disease Classification With Application to Lung Cancer Screening Image Data

Cited by: 39
Authors
Delzell, Darcie A. P. [1 ]
Magnuson, Sara [1 ]
Peter, Tabitha [1 ]
Smith, Michelle [1 ]
Smith, Brian J. [2 ]
Affiliations
[1] Wheaton Coll, Dept Math & Comp Sci, Wheaton, IL 60187 USA
[2] Univ Iowa, Dept Biostat, Iowa City, IA USA
Source
FRONTIERS IN ONCOLOGY | 2019 / Volume 9
Keywords
radiomics; machine learning; CT image; biomarkers; lung cancer; prediction; signature
DOI
10.3389/fonc.2019.01393
Chinese Library Classification number
R73 [Oncology];
Subject classification code
100214;
Abstract
As awareness of the habits and risks associated with lung cancer has increased, so has the interest in promoting and improving lung cancer screening procedures. Recent research demonstrates the benefits of lung cancer screening; the National Lung Screening Trial (NLST) found as its primary result that preventative screening significantly decreases the death rate for patients battling lung cancer. However, it was also noted that the false positive rate was very high (>94%).
In this work, we investigated the ability of various machine learning classifiers to accurately predict lung cancer nodule status while also considering the associated false positive rate. We utilized 416 quantitative imaging biomarkers taken from CT scans of lung nodules from 200 patients, where the nodules had been verified as cancerous or benign. These imaging biomarkers were created from both nodule and parenchymal tissue. A variety of linear, nonlinear, and ensemble predictive classifying models, along with several feature selection methods, were used to classify the binary outcome of malignant or benign status. Elastic net and support vector machine, combined with either a linear combination or correlation feature selection method, were some of the best-performing classifiers (average cross-validation AUC near 0.72 for these models), while random forest and bagged trees were the worst-performing classifiers (AUC near 0.60). For the best-performing models, the false positive rate was near 30%, notably lower than that reported in the NLST.
The use of radiomic biomarkers with machine learning methods is a promising diagnostic tool for tumor classification. These methods have the potential to provide good classification and simultaneously reduce the false positive rate.
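The abstract compares classifiers by AUC and false positive rate. As a reading aid only, the sketch below shows how these two quantities can be computed from predicted scores. It assumes the conventional definitions (AUC as the probability that a malignant case outscores a benign one; false positive rate as benign cases flagged malignant divided by all benign cases); the record does not state which definitions or software the authors used. The function names, the 0.5 threshold, and the toy data are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen malignant case (label 1)
    scores higher than a randomly chosen benign case (label 0), with ties
    counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def false_positive_rate(labels, scores, threshold=0.5):
    """Fraction of benign cases (label 0) with score >= threshold."""
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not neg:
        raise ValueError("need at least one benign case")
    return sum(s >= threshold for s in neg) / len(neg)


# Hypothetical toy data, NOT from the study: 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

print(auc(labels, scores))                  # 10 of 12 pairs ranked correctly
print(false_positive_rate(labels, scores))  # 1 of 4 benign cases flagged
```

In a cross-validated setting such as the one the abstract describes, these values would be computed on each held-out fold and then averaged.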
Pages: 8
Related papers
50 in total
  • [1] Machine Learning and Feature Selection Methods for Disease Classification With Application to Lung Cancer Screening Image Data (vol 9, 1393, 2019)
    Delzell, Darcie A. P.
    Magnuson, Sara
    Peter, Tabitha
    Smith, Michelle
    Smith, Brian J.
    [J]. FRONTIERS IN ONCOLOGY, 2020, 10
  • [2] Classification of lung cancer using ensemble-based feature selection and machine learning methods
    Cai, Zhihua
    Xu, Dong
    Zhang, Qing
    Zhang, Jiexia
    Ngai, Sai-Ming
    Shao, Jianlin
    [J]. MOLECULAR BIOSYSTEMS, 2015, 11 (03) : 791 - 800
  • [3] Feature selection and machine learning method for classification of lung cancer types
    Shin, Byungju
    Wang, Bohyun
    Lim, Joon S.
    [J]. Test Engineering and Management, 2019, 81 : 2307 - 2314
  • [4] Brain Neural Data Analysis Using Machine Learning Feature Selection and Classification Methods
    Bozhkov, Lachezar
    Georgieva, Petia
    Trifonov, Roumen
    [J]. ENGINEERING APPLICATIONS OF NEURAL NETWORKS (EANN 2014), 2014, 459 : 123 - 132
  • [5] Feature Extraction, Feature Selection and Machine Learning for Image Classification: A Case Study
    Popescu, Madalina Cosmina
    Sasu, Lucian Mircea
    [J]. 2014 INTERNATIONAL CONFERENCE ON OPTIMIZATION OF ELECTRICAL AND ELECTRONIC EQUIPMENT (OPTIM), 2014 : 968 - 973
  • [6] Machine Learning and Feature Selection Methods for EGFR Mutation Status Prediction in Lung Cancer
    Morgado, Joana
    Pereira, Tania
    Silva, Francisco
    Freitas, Claudia
    Negrao, Eduardo
    de Lima, Beatriz Flor
    da Silva, Miguel Correia
    Madureira, Antonio J.
    Ramos, Isabel
    Hespanhol, Venceslau
    Costa, Jose Luis
    Cunha, Antonio
    Oliveira, Helder P.
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (07)
  • [7] Filter-Based Feature Selection and Machine-Learning Classification of Cancer Data
    Farsi, Mohammed
    [J]. INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2021, 28 (01) : 83 - 92
  • [8] Application of Global Optimization Methods for Feature Selection and Machine Learning
    Wu, Shaohua
    Hu, Yong
    Wang, Wei
    Feng, Xinyong
    Shu, Wanneng
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2013, 2013
  • [9] Machine learning approaches for classification of colorectal cancer with and without feature selection method on microarray data
    Nazari, Elham
    Aghemiri, Mehran
    Avan, Amir
    Mehrabian, Amin
    Tabesh, Hamed
    [J]. GENE REPORTS, 2021, 25
  • [10] Data Classification Using Feature Selection And kNN Machine Learning Approach
    Begum, Shemim
    Chakraborty, Debasis
    Sarkar, Ram
    [J]. 2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS (CICN), 2015 : 811 - 814