OmicPredict: a framework for omics data prediction using ANOVA-Firefly algorithm for feature selection

Cited by: 1
Authors
Kaur, Parampreet [1]
Singh, Ashima [1]
Chana, Inderveer [1]
Affiliations
[1] Thapar Inst Engn & Technol, Comp Sci & Engn Dept, Patiala, India
Keywords
Omics data; deep neural network (DNN); breast cancer; Alzheimer's disease; COVID-19; BREAST-CANCER; CLINICAL-SIGNIFICANCE; TELOMERASE; EXPRESSION; HER2;
DOI
10.1080/10255842.2023.2268236
Chinese Library Classification
TP39 [Computer applications];
Discipline classification codes
081203; 0835;
Abstract
Applying high-throughput technologies and machine learning (ML) to large pools of medical data, such as omics data, enables efficient analysis. Recent research aims to develop and apply ML models that predict disease in good time from available omics datasets. The present work proposes a framework, 'OmicPredict', that deploys a hybrid feature selection method and a deep neural network (DNN) model to predict multiple diseases from omics data. The hybrid feature selection method combines the Analysis of Variance (ANOVA) technique with the firefly algorithm. The OmicPredict framework is applied to three case studies: Alzheimer's disease, breast cancer, and coronavirus disease 2019 (COVID-19). In the Alzheimer's disease case study, the framework predicts patients using the GSE33000 and GSE44770 datasets. In the breast cancer case study, it predicts human epidermal growth factor receptor 2 (HER2) subtype status using the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset. In the COVID-19 case study, it classifies patients using the GSE157103 dataset. The experimental results show that the DNN model achieved an Area Under the Curve (AUC) score of 0.949 on the Alzheimer's (GSE33000 and GSE44770) datasets, and AUC scores of 0.987 and 0.989 on the breast cancer (METABRIC) and COVID-19 (GSE157103) datasets, respectively, outperforming Random Forest and Naive Bayes models as well as existing research.
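The abstract names the two ingredients of the hybrid feature selection (an ANOVA filter and the firefly algorithm) but not how they are combined. The following is a minimal sketch of one plausible arrangement, not the authors' implementation: an ANOVA F-test keeps the top-ranked features, and a firefly search then picks a subset of them. The function names, the nearest-centroid fitness, the size penalty, and all parameter values (`top_k`, `beta0`, `gamma`, `alpha`) are assumptions made for illustration only.

```python
import math
import random

def anova_f_scores(X, y):
    """One-way ANOVA F statistic per feature (X: list of sample rows).
    Assumes at least two classes and more samples than classes."""
    classes = sorted(set(y))
    n, k = len(X), len(classes)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        grand = sum(col) / n
        ss_between = ss_within = 0.0
        for c in classes:
            grp = [v for v, label in zip(col, y) if label == c]
            mean_c = sum(grp) / len(grp)
            ss_between += len(grp) * (mean_c - grand) ** 2
            ss_within += sum((v - mean_c) ** 2 for v in grp)
        ms_between = ss_between / (k - 1)
        ms_within = ss_within / (n - k)
        scores.append(ms_between / ms_within if ms_within > 0 else float("inf"))
    return scores

def fitness(mask, X, y, penalty=0.01):
    """Hypothetical fitness: nearest-centroid training accuracy
    minus a small penalty for selecting many features."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    classes = sorted(set(y))
    centroids = {}
    for c in classes:
        rows = [r for r, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    correct = 0
    for r, label in zip(X, y):
        pred = min(classes, key=lambda c: sum(
            (r[j] - m) ** 2 for j, m in zip(idx, centroids[c])))
        correct += pred == label
    return correct / len(X) - penalty * len(idx) / len(mask)

def firefly_select(X, y, n_fireflies=8, n_iter=20,
                   beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Firefly search over positions in [0, 1]^d; a feature counts as
    selected when its coordinate exceeds 0.5 (an assumed binarisation)."""
    rng = random.Random(seed)
    d = len(X[0])

    def to_mask(p):
        return [v > 0.5 for v in p]

    pos = [[rng.random() for _ in range(d)] for _ in range(n_fireflies)]
    light = [fitness(to_mask(p), X, y) for p in pos]
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] > light[i]:  # move i toward the brighter firefly j
                    r2 = sum((a - b) ** 2 for a, b in zip(pos[i], pos[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    pos[i] = [min(1.0, max(0.0, a + beta * (b - a)
                                           + alpha * (rng.random() - 0.5)))
                              for a, b in zip(pos[i], pos[j])]
                    light[i] = fitness(to_mask(pos[i]), X, y)
    best = max(range(n_fireflies), key=lambda i: light[i])
    return to_mask(pos[best]), light[best]

def hybrid_select(X, y, top_k=10, **firefly_kwargs):
    """ANOVA filter first, then firefly search on the surviving features.
    Returns the selected original column indices and the fitness reached."""
    scores = anova_f_scores(X, y)
    keep = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:top_k]
    X_filtered = [[row[j] for j in keep] for row in X]
    mask, score = firefly_select(X_filtered, y, **firefly_kwargs)
    return [j for j, bit in zip(keep, mask) if bit], score
```

In this sketch the filter stage only shrinks the search space, and the firefly stage does the actual subset search; the selected columns would then be passed to a downstream classifier such as the DNN the abstract mentions. Real omics matrices have thousands of features, so a practical version would likely use vectorised array operations rather than plain lists.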
Pages: 1970-1983
Number of pages: 14
Related papers
50 in total
  • [21] Feature selection on educational data using Boruta algorithm
    Anand, Neeyati
    Sehgal, Riya
    Anand, Sanchit
    Kaushik, Ajay
    International Journal of Computational Intelligence Studies, 2021, 10 (01) : 27 - 35
  • [22] Framework for efficient feature selection in genetic algorithm based data mining
    Sikora, Riyaz
    Piramuthu, Selwyn
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2007, 180 (02) : 723 - 737
  • [23] Improving survival prediction using a novel feature selection and feature reduction framework based on the integration of clinical and molecular data
    Neums, Lisa
    Meier, Richard
    Koestler, Devin C.
    Thompson, Jeffrey A.
    PACIFIC SYMPOSIUM ON BIOCOMPUTING 2020, 2020 : 415 - 426
  • [24] Implicit feature selection for omics data phenotype discrimination
    Han, Xiaoxu
    APPLIED SOFT COMPUTING, 2014, 20 : 70 - 82
  • [25] HYPERSPECTRAL BAND SELECTION USING FIREFLY ALGORITHM
    Su, Hongjun
    Li, Qiannan
    Du, Peijun
    2014 6TH WORKSHOP ON HYPERSPECTRAL IMAGE AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2014,
  • [26] Improved Kurdish Dialect Classification Using Data Augmentation and ANOVA-Based Feature Selection
    Ghafoor, Karzan J.
    Taher, Sarkhel H.
    Rawf, Karwan M. Hama
    Abdulrahman, Ayub O.
    ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY, 2025, 13 (01) : 94 - 103
  • [27] Fault feature selection for the identification of compound gear-bearing faults using firefly algorithm
    Athisayam, Andrews
    Kondal, Manisekar
    INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, 2023, 125 (3-4) : 1777 - 1788
  • [28] Fault feature selection for the identification of compound gear-bearing faults using firefly algorithm
    Andrews Athisayam
    Manisekar Kondal
    The International Journal of Advanced Manufacturing Technology, 2023, 125 : 1777 - 1788
  • [29] Improved multi-layer binary firefly algorithm for optimizing feature selection and classification of microarray data
    Xie, Weidong
    Wang, Linjie
    Yu, Kun
    Shi, Tengfei
    Li, Wei
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 79
  • [30] Identification of adaptor proteins using the ANOVA feature selection technique
    Wang, Yu-Hao
    Zhang, Yu-Fei
    Zhang, Ying
    Gu, Zhi-Feng
    Zhang, Zhao-Yue
    Lin, Hao
    Deng, Ke-Jun
    METHODS, 2022, 208 : 42 - 47