Transfer learning enables predictions in network biology

Cited by: 223
Authors
Theodoris, Christina V. [1 ,2 ,3 ,4 ]
Xiao, Ling [2 ,5 ]
Chopra, Anant [6 ]
Chaffin, Mark D. [2 ]
Al Sayed, Zeina R. [2 ]
Hill, Matthew C. [2 ,5 ]
Mantineo, Helene [2 ,5 ]
Brydon, Elizabeth M. [6 ]
Zeng, Zexian [1 ,7 ]
Liu, X. Shirley [1 ,7 ,8 ]
Ellinor, Patrick T. [2 ,5 ]
Affiliations
[1] Dana Farber Canc Inst, Dept Data Sci, Boston, MA 02215 USA
[2] Broad Inst MIT & Harvard, Cardiovasc Dis Initiat & Precis Cardiol Lab, Cambridge, MA 02142 USA
[3] Boston Childrens Hosp, Div Genet & Genom, Boston, MA 02115 USA
[4] Harvard Med Sch, Genet Training Program, Boston, MA 02115 USA
[5] Massachusetts Gen Hosp, Cardiovasc Res Ctr, Boston, MA 02114 USA
[6] Bayer US LLC, Precis Cardiol Lab, Cambridge, MA USA
[7] Harvard TH Chan Sch Publ Hlth, Dept Biostat, Boston, MA USA
[8] Dana Farber Canc Inst, Ctr Funct Canc Epigenet, Boston, MA USA
Funding
National Institutes of Health (NIH), USA
Keywords
SINGLE-CELL TRANSCRIPTOME; IN-VITRO; DIFFERENTIATION; MUTATIONS; GENES; HETEROGENEITY; TRAJECTORIES; LANDSCAPE; ORGANOIDS; SUBSETS;
DOI
10.1038/s41586-023-06139-9
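Chinese Library Classification codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes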
07 ; 0710 ; 09 ;
Abstract
Mapping gene networks requires large amounts of transcriptomic data to learn the connections between genes, which impedes discoveries in settings with limited data, including rare diseases and diseases affecting clinically inaccessible tissues. Recently, transfer learning has revolutionized fields such as natural language understanding¹,² and computer vision³ by leveraging deep learning models pretrained on large-scale general datasets that can then be fine-tuned towards a vast array of downstream tasks with limited task-specific data. Here, we developed a context-aware, attention-based deep learning model, Geneformer, pretrained on a large-scale corpus of about 30 million single-cell transcriptomes to enable context-specific predictions in settings with limited data in network biology. During pretraining, Geneformer gained a fundamental understanding of network dynamics, encoding network hierarchy in the attention weights of the model in a completely self-supervised manner. Fine-tuning towards a diverse panel of downstream tasks relevant to chromatin and network dynamics using limited task-specific data demonstrated that Geneformer consistently boosted predictive accuracy. Applied to disease modelling with limited patient data, Geneformer identified candidate therapeutic targets for cardiomyopathy. Overall, Geneformer represents a pretrained deep learning model from which fine-tuning towards a broad range of downstream applications can be pursued to accelerate discovery of key network regulators and candidate therapeutic targets.
Pages: 616-624
Page count: 32