A genome-scale deep learning model to predict gene expression changes of genetic perturbations from multiplex biological networks

Authors
Zhan, Lingmin [1]
Wang, Yingdong [1]
Wang, Aoyi [1]
Zhang, Yuanyuan [1]
Cheng, Caiping [1]
Zhao, Jinzhong [1]
Zhang, Wuxia [1]
Chen, Jianxin [2]
Li, Peng [1]
Affiliations
[1] Shanxi Agr Univ, Coll Basic Sci, 1 Mingxian South Rd, Jinzhong 030801, Peoples R China
[2] Beijing Univ Chinese Med, Sch Tradit Chinese Med, 11 North Third Ring Rd East, Beijing 100029, Peoples R China
Keywords
transcriptome; biological network; genetic perturbation; gene function; deep learning; CONNECTIVITY MAP; DISEASE
DOI
10.1093/bib/bbae433
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Systematic characterization of the biological effects of genetic perturbations is essential to applications in molecular biology and biomedicine. However, experimentally testing every genetic perturbation on a genome-wide scale is challenging. Here, we present TranscriptionNet, a deep learning model that integrates multiple biological networks to systematically predict transcriptional profiles for three types of genetic perturbations: RNA interference, clustered regularly interspaced short palindromic repeats (CRISPR), and overexpression. The model is based on transcriptional profiles induced by genetic perturbations in the L1000 project. TranscriptionNet performs better than existing approaches in predicting inducible gene expression changes for all three types of genetic perturbations. It can predict transcriptional profiles for all genes in existing biological networks, expanding the coverage of perturbational gene expression changes for each type of genetic perturbation from a few thousand genes to 26 945 genes. TranscriptionNet demonstrates strong generalization ability when predicted and true gene expression changes are compared on different external tasks. Overall, TranscriptionNet can systematically predict the transcriptional consequences of perturbing genes on a genome-wide scale and thus holds promise for systematically detecting gene function and enhancing drug development and target discovery.
Pages: 9