Transfer learning enables predictions in network biology

Cited by: 223
Authors
Theodoris, Christina V. [1, 2, 3, 4]
Xiao, Ling [2, 5]
Chopra, Anant [6]
Chaffin, Mark D. [2]
Al Sayed, Zeina R. [2]
Hill, Matthew C. [2, 5]
Mantineo, Helene [2, 5]
Brydon, Elizabeth M. [6]
Zeng, Zexian [1, 7]
Liu, X. Shirley [1, 7, 8]
Ellinor, Patrick T. [2, 5]
Affiliations
[1] Dana Farber Canc Inst, Dept Data Sci, Boston, MA 02215 USA
[2] Broad Inst MIT & Harvard, Cardiovasc Dis Initiat & Precis Cardiol Lab, Cambridge, MA 02142 USA
[3] Boston Childrens Hosp, Div Genet & Genom, Boston, MA 02115 USA
[4] Harvard Med Sch, Genet Training Program, Boston, MA 02115 USA
[5] Massachusetts Gen Hosp, Cardiovasc Res Ctr, Boston, MA 02114 USA
[6] Bayer US LLC, Precis Cardiol Lab, Cambridge, MA USA
[7] Harvard TH Chan Sch Publ Hlth, Dept Biostat, Boston, MA USA
[8] Dana Farber Canc Inst, Ctr Funct Canc Epigenet, Boston, MA USA
Funding
National Institutes of Health (USA);
Keywords
SINGLE-CELL TRANSCRIPTOME; IN-VITRO; DIFFERENTIATION; MUTATIONS; GENES; HETEROGENEITY; TRAJECTORIES; LANDSCAPE; ORGANOIDS; SUBSETS;
DOI
10.1038/s41586-023-06139-9
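Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract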
Mapping gene networks requires large amounts of transcriptomic data to learn the connections between genes, which impedes discoveries in settings with limited data, including rare diseases and diseases affecting clinically inaccessible tissues. Recently, transfer learning has revolutionized fields such as natural language understanding (refs. 1,2) and computer vision (ref. 3) by leveraging deep learning models pretrained on large-scale general datasets that can then be fine-tuned towards a vast array of downstream tasks with limited task-specific data. Here, we developed a context-aware, attention-based deep learning model, Geneformer, pretrained on a large-scale corpus of about 30 million single-cell transcriptomes to enable context-specific predictions in settings with limited data in network biology. During pretraining, Geneformer gained a fundamental understanding of network dynamics, encoding network hierarchy in the attention weights of the model in a completely self-supervised manner. Fine-tuning towards a diverse panel of downstream tasks relevant to chromatin and network dynamics using limited task-specific data demonstrated that Geneformer consistently boosted predictive accuracy. Applied to disease modelling with limited patient data, Geneformer identified candidate therapeutic targets for cardiomyopathy. Overall, Geneformer represents a pretrained deep learning model from which fine-tuning towards a broad range of downstream applications can be pursued to accelerate discovery of key network regulators and candidate therapeutic targets.
Pages: 616-624
Page count: 32