Classifying the multi-omics data of gastric cancer using a deep feature selection method

Cited by: 25
Authors
Hu, Yanyu [1]
Zhao, Long [1]
Li, Zhao [2]
Dong, Xiangjun [1]
Xu, Tiantian [1]
Zhao, Yuhai [3]
Affiliations
[1] Qilu Univ Technol, Sch Comp Sci & Technol, Shandong Acad Sci, Jinan 250353, Peoples R China
[2] Qilu Univ Technol, Shandong Comp Sci Ctr, Shandong Prov Key Lab Comp Networks, Natl Supercomp Ctr Jinan, Shandong Acad Sci, Jinan 250014, Peoples R China
[3] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110819, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Gastric cancer; Multi-omics data; Feature selection; Neural network; MODELS;
DOI
10.1016/j.eswa.2022.116813
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Gastric cancer has the highest incidence among all types of malignant tumors. The rapid development of high-throughput gene technology has greatly advanced the understanding of gastric cancer at the molecular level. However, single-omics data carry limited information. Omics data are also multivariate and high-dimensional, which reduces classification efficiency; dimensionality reduction is therefore an effective way to overcome the curse of dimensionality in omics data. Neural network learning algorithms, however, are seldom used to improve classification accuracy when feature selection is carried out on multi-omics data. In this study, a random forest deep feature selection (RDFS) algorithm was therefore proposed. By integrating gene expression (Exp) data and copy number variation (CNV) data, the dimensionality of the multi-omics data was reduced and the classification accuracy was improved using a random forest and a deep neural network. The results showed that the accuracy and area under the curve (AUC) obtained with multi-omics data were better than those obtained with single-omics data under the RDFS algorithm. Compared with other feature selection algorithms, RDFS also had a higher prediction accuracy and AUC. We also validated the effect of feature selection on RDFS. Finally, survival analysis was used to evaluate the important genes identified during feature selection, and enriched gene ontology (GO) terms and biological pathways were obtained for these genes.
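The general idea in the abstract (integrate two omics layers, reduce dimensionality with a random forest, then classify with a neural network) can be illustrated with a minimal sketch. This is not the authors' RDFS implementation, whose details the abstract does not give. It assumes scikit-learn, synthetic data in place of real Exp and CNV matrices, simple column-wise concatenation as the integration step, and hypothetical values for the number of selected features (`top_k`) and the network layout.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins for the two omics layers (shapes are assumptions).
n_samples, n_exp, n_cnv = 200, 300, 300
y = rng.integers(0, 2, size=n_samples)
exp = rng.normal(size=(n_samples, n_exp))  # placeholder for gene expression
cnv = rng.normal(size=(n_samples, n_cnv))  # placeholder for copy number variation
# Plant a class signal in the first five features of each layer.
exp[:, :5] += 1.5 * y[:, None]
cnv[:, :5] += 1.5 * y[:, None]

# Step 1 (assumed integration scheme): concatenate the layers column-wise.
X = np.hstack([exp, cnv])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Step 2: rank features by random forest importance on the training split
# only, and keep the top_k highest-ranked columns.
top_k = 20  # hypothetical cut-off
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
selected = np.argsort(forest.feature_importances_)[::-1][:top_k]

# Step 3: train a small feed-forward network on the reduced feature set.
scaler = StandardScaler().fit(X_train[:, selected])
net = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
net.fit(scaler.transform(X_train[:, selected]), y_train)

# Evaluate with the two metrics named in the abstract.
proba = net.predict_proba(scaler.transform(X_test[:, selected]))[:, 1]
accuracy = accuracy_score(y_test, (proba >= 0.5).astype(int))
auc = roc_auc_score(y_test, proba)
print(f"accuracy={accuracy:.3f}  AUC={auc:.3f}")
```

Fitting the forest and the scaler on the training split only keeps the test split out of feature selection, so the reported accuracy and AUC are not inflated by selection leakage.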
Pages: 9