Sequence-Based Machine Learning Reveals 3D Genome Differences between Bonobos and Chimpanzees

Cited by: 1
Authors
Brand, Colin M. [1, 2]
Kuang, Shuzhen [3]
Gilbertson, Erin N. [1, 4]
McArthur, Evonne [5, 6]
Pollard, Katherine S. [1, 2, 3, 7]
Webster, Timothy H. [8]
Capra, John A. [1, 2, 4]
Affiliations
[1] Univ Calif San Francisco, Bakar Computat Hlth Sci Inst, San Francisco, CA 94143 USA
[2] Univ Calif San Francisco, Dept Epidemiol & Biostat, San Francisco, CA 94143 USA
[3] Gladstone Inst Data Sci & Biotechnol, San Francisco, CA USA
[4] Univ Calif San Francisco, Biomed Informat Grad Program, San Francisco, CA 94143 USA
[5] Vanderbilt Univ, Vanderbilt Genet Inst, Nashville, TN USA
[6] Univ Washington, Dept Med, Seattle, WA USA
[7] Chan Zuckerberg Biohub, San Francisco, CA USA
[8] Univ Utah, Dept Anthropol, Salt Lake City, UT USA
Source
GENOME BIOLOGY AND EVOLUTION | 2024 / Vol. 16 / Issue 11
Funding
National Institutes of Health (USA);
Keywords
bonobo; chimpanzee; gene regulation; 3D genome folding; machine learning; CHROMATIN DOMAINS; ORGANIZATION; DIVERSITY; EVOLUTION; DATABASE; TOOL;
DOI
10.1093/gbe/evae210
Chinese Library Classification number
Q [Biological Sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
The 3D structure of the genome is an important mediator of gene expression. As phenotypic divergence is largely driven by gene regulatory variation, comparing genome 3D contacts across species can further understanding of the molecular basis of species differences. However, while experimental data on genome 3D contacts in humans are increasingly abundant, only a handful of 3D genome contact maps exist for other species. Here, we demonstrate that human experimental data can be used to close this data gap. We apply a machine learning model that predicts 3D genome contacts from DNA sequence to the genomes from 56 bonobos and chimpanzees and identify species-specific patterns of genome folding. We estimated 3D divergence between individuals from the resulting contact maps in 4,420 1-Mb genomic windows, of which ~17% were substantially divergent in predicted genome contacts. Bonobos and chimpanzees diverged at 89 windows, overlapping genes associated with multiple traits implicated in Pan phenotypic divergence. We discovered 51 bonobo-specific variants that individually produce the observed bonobo contact pattern in bonobo-chimpanzee divergent windows. Our results demonstrate that machine learning methods can leverage human data to fill in data gaps across species, offering the first look at population-level 3D genome variation in nonhuman primates. We also identify loci where changes in 3D folding may contribute to phenotypic differences in our closest living relatives.
Pages: 18