Learning influential genes on cancer gene expression data with stacked denoising autoencoders

Cited by: 0
Authors
Teixeira, Vitor [1]
Camacho, Rui [2]
Ferreira, Pedro G. [3]
Affiliations
[1] Univ Porto, Fac Engn, MIEIC, Porto, Portugal
[2] Univ Porto, Fac Engn, DEI, Porto, Portugal
[3] Univ Porto, Inst Invest & Inovacao Saude I3S, Ipatimup Inst Mol Pathol & Immunol, Porto, Portugal
Keywords
Deep Learning; Gene Expression Analysis; Knowledge Extraction; RNA-Seq; Cancer; NETWORK; BREAST
DOI
Not available
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Cancer genome projects are characterizing the genome, epigenome and transcriptome of large numbers of samples using the latest high-throughput sequencing assays. The resulting data sets pose several challenges for traditional statistical and machine learning methods. In this work we address the task of deriving the most informative genes from a cancer gene expression data set. To that end, we built denoising autoencoders (DAE) and stacked denoising autoencoders (SDAE), studied the influence of the input nodes on the final representation learned by the DAE, and compared these deep learning approaches with other existing approaches. Our study is divided into two main tasks. First, we built several feature extraction and data sampling methods and compared their performance using classifiers that distinguish samples of thyroid cancer patients from samples of healthy individuals. Second, we investigated the possibility of building comprehensible descriptions of gene expression data by using DAE and SDAE as feature extraction methods. After extracting information about the description built by the network, namely the connection weights, we devised post-processing techniques to derive comprehensible and biologically meaningful descriptions from the constructed models. We were able to build high-accuracy models that discriminate thyroid cancer samples from healthy samples, but the extraction of comprehensible models is still very limited.
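As a rough illustration of the idea in the abstract (train a denoising autoencoder, then inspect connection weights to judge how much each input gene influences the learned representation), the following minimal sketch may help. It is not the authors' implementation: the single hidden layer, the masking noise, the synthetic data, the function names `train_dae` and `rank_genes`, and the scoring rule (sum of absolute encoder weights per input node) are all assumptions chosen for brevity.

```python
import numpy as np


def train_dae(X, n_hidden=4, noise=0.3, lr=0.05, epochs=200, seed=0):
    """Train a one-hidden-layer denoising autoencoder with plain gradient descent.

    Hypothetical helper: returns the encoder and decoder weight matrices.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_genes = X.shape
    W_enc = rng.normal(0.0, 0.1, size=(n_genes, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(0.0, 0.1, size=(n_hidden, n_genes))
    b_dec = np.zeros(n_genes)

    for _ in range(epochs):
        # Masking noise: zero out each input value with probability `noise`.
        mask = rng.random(X.shape) > noise
        X_noisy = X * mask

        # Forward pass: tanh encoder, linear decoder.
        H = np.tanh(X_noisy @ W_enc + b_enc)
        X_hat = H @ W_dec + b_dec

        # Gradient of the squared reconstruction error against the CLEAN input.
        err = (X_hat - X) / n_samples
        dH = (err @ W_dec.T) * (1.0 - H ** 2)

        W_dec -= lr * (H.T @ err)
        b_dec -= lr * err.sum(axis=0)
        W_enc -= lr * (X_noisy.T @ dH)
        b_enc -= lr * dH.sum(axis=0)

    return W_enc, W_dec


def rank_genes(W_enc, gene_names):
    """Rank input genes by the sum of absolute encoder weights (assumed score)."""
    scores = np.abs(W_enc).sum(axis=1)
    order = np.argsort(scores)[::-1]  # highest score first
    return [(gene_names[i], float(scores[i])) for i in order]


# Synthetic stand-in for an expression matrix: 60 samples x 8 genes.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 8))
names = [f"gene_{i}" for i in range(8)]

W_enc, W_dec = train_dae(X)
ranking = rank_genes(W_enc, names)
for name, score in ranking[:3]:
    print(f"{name}: {score:.3f}")
```

A stacked variant would train a second autoencoder on the hidden activations of the first; how to propagate influence scores through several layers is exactly the kind of post-processing question the abstract describes as still limited.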
Pages: 1201-1205
Page count: 5