eSMC: a statistical model to infer admixture events from individual genomics data

Cited by: 1
Authors
Wang, Yonghui [1 ,2 ]
Zhao, Zicheng [2 ,3 ]
Miao, Xinyao [3 ,4 ]
Wang, Yinan [4 ,5 ]
Qian, Xiaobo [6 ]
Chen, Lingxi [3 ]
Wang, Changfa [1 ]
Li, Shuaicheng [3 ]
Affiliations
[1] Liaocheng Univ, Liaocheng Res Inst Donkey High Efficiency Breeding, Liaocheng 252059, Peoples R China
[2] Byoryn Technol Co Ltd, Shenzhen 518122, Peoples R China
[3] City Univ Hong Kong, Dept Comp Sci, Kowloon, 83 Tat Chee Ave, Hong Kong, Peoples R China
[4] Xi An Jiao Tong Univ, Sch Forens & Med, Xian 710004, Shaanxi, Peoples R China
[5] Peking Univ, Shenzhen Hosp, Dept Obstet & Gynecol, Shenzhen 518036, Peoples R China
[6] Univ Chinese Acad Sci, BGI Educ Ctr, Shenzhen 518083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
PSMC; Population Admixture; TMRCA; Domestication; Demographic History; ESTIMATING DEMOGRAPHIC HISTORY; POPULATION HISTORY; SEPARATION HISTORY; ANCESTRY; TIME;
DOI
10.1186/s12864-022-09033-2
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Inferring historical population admixture events yields essential insights into a species' demographic history. Methods are available to infer admixture events from extant genetic data from multiple sources. Because ancient population genetic data are scarce, no method exists for admixture inference from a single source. The Pairwise Sequentially Markovian Coalescent (PSMC) estimates historical effective population size from the genome of a single individual, based on the distribution of the time to the most recent common ancestor (TMRCA) between the diploid's alleles. However, PSMC does not infer admixture events.
Results: Here, we propose eSMC, an extended PSMC model for admixture inference from a single source. We evaluated the model's performance on both in silico and real data. We simulated population admixture events with admixture times ranging from 5 kya to 100 kya (5 years/generation) and population admixture ratios of 1:1, 2:1, 3:1, and 4:1. The root mean square error is ±7.61 kya across all experiments. We then applied our method to infer historical admixture events in human, donkey, and goat populations. The estimated admixture times for both Han and Tibetan individuals range from 60 kya to 80 kya (25 years/generation), while those for the domesticated donkeys and goats range from 40 kya to 60 kya (8 years/generation) and 40 kya to 100 kya (6 years/generation), respectively. The estimated admixture times were concordant with the times at which domestication occurred in human history.
Conclusion: eSMC effectively infers the time of the most recent admixture event from a single individual's genomics data.
Pages: 11
Source
BMC Genomics, 23