A nearly linear-time general algorithm for genome-wide bi-allele haplotype phasing

Cited by: 0
Authors
Casey, W
Mishra, B
Affiliations
[1] NYU, Courant Inst Math Sci, New York, NY 10003 USA
[2] Cold Spring Harbor Lab, Cold Spring Harbor, NY 11724 USA
[3] Tata Inst Fundamental Res, Bombay 400005, Maharashtra, India
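Source
Keywords
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract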
The determination of feature maps, such as STS (sequence-tagged site), SNP (single nucleotide polymorphism) or RFLP (restriction fragment length polymorphism) maps, for each chromosome copy or haplotype in an individual has important potential applications in genetics, clinical biology and association studies. We consider the problem of reconstructing the two haplotypes of a diploid individual from genotype data generated by mapping experiments, and present an algorithm to recover the haplotypes. The problem of optimizing existing methods of SNP phasing with a population of diploid genotypes was investigated in [7] and found to be NP-hard. In contrast, using single-molecule methods, we show that although the haplotypes are not known and the data are further confounded by the mapping error model, reasonable assumptions on the mapping process allow us to recover the co-associations of allele types across consecutive loci and to estimate the haplotypes with an efficient algorithm. The haplotype reconstruction algorithm has two stages. Stage I detects polymorphic marker types by modifying an EM algorithm for Gaussian mixture models; an example is given for RFLP sizing. Stage II addresses phasing and presents a local maximum likelihood method for inferring the haplotypes of an individual. The running time of the algorithm is nearly linear in the number of polymorphic loci. Results on simulated RFLP sizing data are encouraging and suggest that the method will prove practical for haplotype phasing.
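To make Stage I more concrete, the following is a minimal sketch of plain EM for a two-component, one-dimensional Gaussian mixture, applied to simulated fragment sizes at a single locus. It is not the authors' modified EM algorithm: the abstract does not specify the modifications or the mapping error model, so none are reproduced here. The function names `em_two_gaussians` and `simulate_sizes`, the initialization scheme, and the size values are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_gaussians(sizes, iterations=200, min_sd=1e-3):
    """Fit a two-component 1-D Gaussian mixture with standard EM.

    Hypothetical helper: generic EM, without the paper's modifications.
    Expects at least two observations.
    Returns (weights, means, sds), each a list of length 2.
    """
    ordered = sorted(sizes)
    half = len(ordered) // 2
    # Assumed initialization: means of the lower and upper halves of the data.
    means = [sum(ordered[:half]) / half,
             sum(ordered[half:]) / (len(ordered) - half)]
    spread = max((ordered[-1] - ordered[0]) / 4.0, min_sd)
    sds = [spread, spread]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: posterior probability that each size belongs to each component.
        resp = []
        for x in sizes:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total] if total > 0 else [0.5, 0.5])

        # M-step: re-estimate weight, mean and standard deviation per component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / len(sizes)
            means[k] = sum(r[k] * x for r, x in zip(resp, sizes)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, sizes)) / nk
            sds[k] = max(math.sqrt(var), min_sd)

    return weights, means, sds


def simulate_sizes(mean_a, mean_b, sd, n_per_allele, seed=0):
    """Simulate fragment sizes for a heterozygous locus (illustrative values only)."""
    rng = random.Random(seed)
    return ([rng.gauss(mean_a, sd) for _ in range(n_per_allele)]
            + [rng.gauss(mean_b, sd) for _ in range(n_per_allele)])
```

Calling `em_two_gaussians(simulate_sizes(5.0, 7.0, 0.2, 200))` should return two means close to the simulated allele sizes. When the two fitted means are well separated relative to their standard deviations, the locus can be treated as polymorphic. A real pipeline would also need a test against the single-component (homozygous) alternative, which this sketch omits.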
Pages: 204-215
Page count: 12