Efficient Genome-Wide TagSNP Selection Across Populations via the Linkage Disequilibrium Criterion

Cited by: 7
Authors
Liu, Lan [1, 2]
Wu, Yonghui [1, 2]
Lonardi, Stefano [1 ]
Jiang, Tao [1 ]
Affiliations
[1] Univ Calif Riverside, Dept Comp Sci & Engn, Riverside, CA 92507 USA
[2] Google Inc, Mountain View, CA USA
Keywords
genome-wide tagSNP selection; greedy algorithm; HapMap; Lagrangian relaxation; linkage disequilibrium; multiple populations; SINGLE-NUCLEOTIDE POLYMORPHISMS; HAPLOTYPE-TAGGING SNPS; SET; BLOCKS; ASSOCIATION; ALGORITHM; PATTERNS; MAP;
DOI
10.1089/cmb.2007.0228
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
In this article, we studied the tag single-nucleotide polymorphism (tagSNP) selection problem on multiple populations using the pairwise r^2 linkage disequilibrium criterion. We proposed a novel combinatorial optimization model for the tagSNP selection problem, called the minimum common tagSNP selection (MCTS) problem, and presented efficient solutions for MCTS. Our approach consists of three main steps: (i) partitioning the SNP markers into small disjoint components, (ii) applying data reduction rules to simplify the problem, and (iii) applying either a fast greedy algorithm or a Lagrangian relaxation algorithm to solve the remaining (general) MCTS. These algorithms also provide lower bounds on tagging (i.e., on the minimum number of tagSNPs needed). The lower bounds allow us to evaluate how far our solution is from the optimum. To the best of our knowledge, this is the first time that tagging lower bounds have been discussed in the literature. We assessed the performance of our algorithms on real HapMap data for genome-wide tagging. The experiments demonstrated that our algorithms run 3-4 orders of magnitude faster than existing single-population tagging programs such as FESTA and LD-Select, and than the multiple-population tagging method MultiPop-TagSelect. Our method also greatly reduced the number of tagSNPs required compared with LD-Select on a single population and MultiPop-TagSelect on multiple populations. Moreover, the numbers of tagSNPs selected by our algorithms are almost optimal, because they are very close to the corresponding lower bounds obtained by our method.
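To make steps (i) and (iii) of the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the pairwise r^2 values have already been thresholded, so that each population is given as a set of SNP index pairs in high linkage disequilibrium. The function names `components` and `greedy_common_tags`, the input layout, and the tie-breaking rule are all hypothetical choices made for illustration. The data reduction rules, the Lagrangian relaxation solver, and the lower bounds are not shown.

```python
from collections import defaultdict


def components(n_snps, edges):
    """Step (i), sketched: split SNPs into disjoint components.

    Two SNPs fall in the same component if a chain of high-LD pairs
    (taken from any population) connects them. Uses union-find.
    """
    parent = list(range(n_snps))

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for s in range(n_snps):
        groups[find(s)].append(s)
    return sorted(groups.values())


def greedy_common_tags(n_snps, ld_pairs_by_pop):
    """Step (iii), sketched: plain greedy set cover over all populations.

    ld_pairs_by_pop holds one set of (i, j) pairs per population, each
    pair meaning r^2 between SNP i and SNP j meets the threshold there
    (assumed precomputed). Returns one common tag set, in pick order.
    """
    n_pops = len(ld_pairs_by_pop)
    # covers[s]: the (population, snp) items tagged if SNP s is selected.
    covers = {s: set() for s in range(n_snps)}
    for p, pairs in enumerate(ld_pairs_by_pop):
        for s in range(n_snps):
            covers[s].add((p, s))  # a SNP always tags itself
        for i, j in pairs:
            covers[i].add((p, j))
            covers[j].add((p, i))

    uncovered = {(p, s) for p in range(n_pops) for s in range(n_snps)}
    tags = []
    while uncovered:
        # Pick the SNP covering the most untagged items; ties go to the
        # lowest index (an arbitrary, illustrative rule).
        best = max(range(n_snps),
                   key=lambda s: (len(covers[s] & uncovered), -s))
        tags.append(best)
        uncovered -= covers[best]
    return tags


# Toy data: 5 SNPs, 2 populations with slightly different LD patterns.
pop_a = {(0, 1), (1, 2), (3, 4)}
pop_b = {(0, 1), (3, 4)}
print(components(5, pop_a | pop_b))
print(greedy_common_tags(5, [pop_a, pop_b]))
```

In the toy data, SNP 2 is tagged by SNP 1 only in the first population, so a tag set shared by both populations has to include SNP 2 as well. This is the kind of cross-population constraint that a common tagSNP formulation has to respect.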
Pages: 21-37
Page count: 17