Transfer learning enables predictions in network biology

Cited by: 223
Authors
Theodoris, Christina V. [1 ,2 ,3 ,4 ]
Xiao, Ling [2 ,5 ]
Chopra, Anant [6 ]
Chaffin, Mark D. [2 ]
Al Sayed, Zeina R. [2 ]
Hill, Matthew C. [2 ,5 ]
Mantineo, Helene [2 ,5 ]
Brydon, Elizabeth M. [6 ]
Zeng, Zexian [1 ,7 ]
Liu, X. Shirley [1 ,7 ,8 ]
Ellinor, Patrick T. [2 ,5 ]
Affiliations
[1] Dana Farber Canc Inst, Dept Data Sci, Boston, MA 02215 USA
[2] Broad Inst MIT & Harvard, Cardiovasc Dis Initiat & Precis Cardiol Lab, Cambridge, MA 02142 USA
[3] Boston Childrens Hosp, Div Genet & Genom, Boston, MA 02115 USA
[4] Harvard Med Sch, Genet Training Program, Boston, MA 02115 USA
[5] Massachusetts Gen Hosp, Cardiovasc Res Ctr, Boston, MA 02114 USA
[6] Bayer US LLC, Precis Cardiol Lab, Cambridge, MA USA
[7] Harvard TH Chan Sch Publ Hlth, Dept Biostat, Boston, MA USA
[8] Dana Farber Canc Inst, Ctr Funct Canc Epigenet, Boston, MA USA
Funding
National Institutes of Health (USA)
Keywords
SINGLE-CELL TRANSCRIPTOME; IN-VITRO; DIFFERENTIATION; MUTATIONS; GENES; HETEROGENEITY; TRAJECTORIES; LANDSCAPE; ORGANOIDS; SUBSETS;
DOI
10.1038/s41586-023-06139-9
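Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract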
Mapping gene networks requires large amounts of transcriptomic data to learn the connections between genes, which impedes discoveries in settings with limited data, including rare diseases and diseases affecting clinically inaccessible tissues. Recently, transfer learning has revolutionized fields such as natural language understanding¹,² and computer vision³ by leveraging deep learning models pretrained on large-scale general datasets that can then be fine-tuned towards a vast array of downstream tasks with limited task-specific data. Here, we developed a context-aware, attention-based deep learning model, Geneformer, pretrained on a large-scale corpus of about 30 million single-cell transcriptomes to enable context-specific predictions in settings with limited data in network biology. During pretraining, Geneformer gained a fundamental understanding of network dynamics, encoding network hierarchy in the attention weights of the model in a completely self-supervised manner. Fine-tuning towards a diverse panel of downstream tasks relevant to chromatin and network dynamics using limited task-specific data demonstrated that Geneformer consistently boosted predictive accuracy. Applied to disease modelling with limited patient data, Geneformer identified candidate therapeutic targets for cardiomyopathy. Overall, Geneformer represents a pretrained deep learning model from which fine-tuning towards a broad range of downstream applications can be pursued to accelerate discovery of key network regulators and candidate therapeutic targets.
Pages: 616-624
Page count: 32
Related papers
50 in total
  • [1] Transfer learning enables predictions in network biology
    Christina V. Theodoris
    Ling Xiao
    Anant Chopra
    Mark D. Chaffin
    Zeina R. Al Sayed
    Matthew C. Hill
    Helene Mantineo
    Elizabeth M. Brydon
    Zexian Zeng
    X. Shirley Liu
    Patrick T. Ellinor
    Nature, 2023, 618 : 616 - 624
  • [2] Transfer learning enables predictions in soil-borne diseases
    Xin, Lei
    Xie, Penghao
    Wen, Tao
    Niu, Guoqing
    Yuan, Jun
    SOIL ECOLOGY LETTERS, 2024, 6 (04)
  • [3] Machine-learning model makes predictions about network biology
    Theodoris, Christina V.
    Ellinor, Patrick T.
    NATURE, 2023,
  • [4] Transfer learning of deep material network for seamless structure–property predictions
    Zeliang Liu
    C. T. Wu
    M. Koishi
    Computational Mechanics, 2019, 64 : 451 - 465
  • [5] Transfer learning of deep material network for seamless structure-property predictions
    Liu, Zeliang
    Wu, C. T.
    Koishi, M.
    COMPUTATIONAL MECHANICS, 2019, 64 (02) : 451 - 465
  • [6] Machine learning for molecular property predictions, and the software ecosystem that enables it
    Hachmann, Johannes
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2019, 257
  • [7] Biomind - A new biology curriculum that enables authentic inquiry learning
    Zion, M
    Shapira, D
    Slezak, M
    Link, E
    Bashan, N
    Brumer, M
    Orian, T
    Nussinovitch, R
    Agrest, B
    Mendelovici, R
    JOURNAL OF BIOLOGICAL EDUCATION, 2004, 38 (02) : 59 - 67
  • [8] Monthly extended ocean predictions based on a convolutional neural network via the transfer learning method
    Miao, Yonglan
    Zhang, Xuefeng
    Li, Yunbo
    Zhang, Lianxin
    Zhang, Dianjun
    FRONTIERS IN MARINE SCIENCE, 2023, 9
  • [9] Transfer learning for small molecule retention predictions
    Osipenko, Sergey
    Botashev, Kazii
    Nikolaev, Eugene
    Kostyukevich, Yury
    JOURNAL OF CHROMATOGRAPHY A, 2021, 1644
  • [10] PREDICTIONS IN BIOLOGY
    CRICK, DF
    CHEMTECH, 1979, 9 (05) : 298 - 305