A nearly linear-time general algorithm for genome-wide bi-allele haplotype phasing

Cited by: 0
Authors
Casey, W
Mishra, B
Affiliations
[1] NYU, Courant Inst Math Sci, New York, NY 10003 USA
[2] Cold Spring Harbor Lab, Cold Spring Harbor, NY 11724 USA
[3] Tata Inst Fundamental Res, Bombay 400005, Maharashtra, India
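Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract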
The determination of feature maps, such as maps of STSs (sequence-tagged sites), SNPs (single nucleotide polymorphisms) or RFLPs (restriction fragment length polymorphisms), for each chromosome copy or haplotype in an individual has important potential applications to genetics, clinical biology and association studies. We consider the problem of reconstructing the two haplotypes of a diploid individual from genotype data generated by mapping experiments, and present an algorithm to recover the haplotypes. The problem of optimizing existing methods of SNP phasing with a population of diploid genotypes has been investigated in [7] and found to be NP-hard. In contrast, using single-molecule methods, we show that although the haplotypes are not known and the data are further confounded by the mapping error model, reasonable assumptions on the mapping process allow us to recover the co-associations of allele types across consecutive loci and estimate the haplotypes with an efficient algorithm. The haplotype reconstruction algorithm has two stages. Stage I detects polymorphic marker types by modifying an EM algorithm for Gaussian mixture models; an example is given for RFLP sizing. Stage II addresses phasing and presents a local maximum likelihood method for inferring the haplotypes of an individual. The running time of the algorithm is nearly linear in the number of polymorphic loci. Results on simulated RFLP sizing data are encouraging and suggest that the method will prove practical for haplotype phasing.
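As an illustration of the kind of computation Stage I builds on, the sketch below fits a two-component, one-dimensional Gaussian mixture to simulated fragment sizes with plain textbook EM. It is not the authors' modified EM algorithm, which the abstract does not specify. The function names, the kilobase units, the initialization rule and the simulated allele sizes are all assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_two_component_em(sizes, iterations=200, min_std=1e-3):
    """Fit a two-component 1-D Gaussian mixture with standard EM.

    Hypothetical helper for illustration only. Returns
    (weights, means, stds) with components ordered by increasing mean.
    """
    lo, hi = min(sizes), max(sizes)
    # Assumed initialization: means at the quarter points of the data range.
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    spread = max((hi - lo) / 4.0, min_std)
    stds = [spread, spread]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: responsibility of component 0 for each observation.
        resp0 = []
        for x in sizes:
            p0 = weights[0] * normal_pdf(x, means[0], stds[0])
            p1 = weights[1] * normal_pdf(x, means[1], stds[1])
            total = p0 + p1
            resp0.append(p0 / total if total > 0.0 else 0.5)

        # M-step: re-estimate weight, mean and std of each component.
        for k in (0, 1):
            resp = resp0 if k == 0 else [1.0 - r for r in resp0]
            mass = sum(resp)
            if mass < 1e-12:
                continue  # component received no data; keep old parameters
            weights[k] = mass / len(sizes)
            means[k] = sum(r * x for r, x in zip(resp, sizes)) / mass
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(resp, sizes)) / mass
            stds[k] = max(math.sqrt(var), min_std)  # floor avoids collapse

    order = sorted((0, 1), key=lambda k: means[k])
    return (
        [weights[k] for k in order],
        [means[k] for k in order],
        [stds[k] for k in order],
    )


def simulate_sizes(seed=0, n_per_allele=200):
    """Simulated sizing data: two alleles at 10 kb and 12 kb (assumed values)."""
    rng = random.Random(seed)
    short = [rng.gauss(10.0, 0.2) for _ in range(n_per_allele)]
    long_ = [rng.gauss(12.0, 0.2) for _ in range(n_per_allele)]
    return short + long_


if __name__ == "__main__":
    weights, means, stds = fit_two_component_em(simulate_sizes())
    print("weights:", weights)
    print("means:", means)
    print("stds:", stds)
```

Two well-separated components with comparable weights would suggest a heterozygous (polymorphic) marker, while a single dominant component would suggest a homozygous one. How the paper makes that decision is not stated in the abstract.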
Pages: 204-215
Number of pages: 12