A system for exact and approximate genetic linkage analysis of SNP data in large pedigrees

Cited by: 43
Authors
Silberstein, Mark [1 ,2 ]
Weissbrod, Omer [2 ]
Otten, Lars [3 ]
Tzemach, Anna [2 ]
Anisenia, Andrei [4 ]
Shtark, Oren
Tuberg, Dvir [2 ]
Galfrin, Eddie [2 ]
Gannon, Irena [2 ]
Shalata, Adel [5 ,6 ,7 ]
Borochowitz, Zvi U. [5 ,8 ,9 ]
Dechter, Rina [3 ]
Thompson, Elizabeth [10 ]
Geiger, Dan [2 ]
Affiliations
[1] Technion Israel Inst Technol, Dept Comp Sci, IL-32000 Haifa, Israel
[2] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[3] UC Irvine, Donald Bren Sch Informat & Comp Sci, Irvine, CA 92697 USA
[4] Univ Ottawa, Dept Comp Sci, Ottawa, ON K1S 0S1, Canada
[5] Bnai Zion Med Ctr, Simon Winter Inst Human Genet, IL-31048 Haifa, Israel
[6] Galilee Soc, Ctr Res & Dev, IL-20200 Shefa Amr, Israel
[7] Holy Family Hosp, IL-16100 Nazareth, Israel
[8] Technion Israel Inst Technol, Rappaport Fac Med, IL-32000 Haifa, Israel
[9] Technion Israel Inst Technol, Res Inst, IL-32000 Haifa, Israel
[10] Univ Washington, Dept Stat, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA);
Keywords
MULTIPOINT LINKAGE; CUTTING LARGE; DISEQUILIBRIUM; MAPS; TOOL; GENERATION; MODEL; COMPUTATION; LIKELIHOOD; SELECTION;
DOI
10.1093/bioinformatics/bts658
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: The use of dense single nucleotide polymorphism (SNP) data in genetic linkage analysis of large pedigrees is impeded by significant technical, methodological and computational challenges. Here we describe Superlink-Online SNP, a powerful new online system that streamlines the linkage analysis of SNP data. It features a fully integrated, flexible processing workflow comprising both well-known and novel data analysis tools, including SNP clustering, erroneous data filtering, exact and approximate LOD calculations and maximum-likelihood haplotyping. The system draws its power from thousands of CPUs, performing data analysis tasks orders of magnitude faster than a single computer. By providing an intuitive interface to sophisticated state-of-the-art analysis tools coupled with high computing capacity, Superlink-Online SNP helps geneticists unleash the potential of SNP data for detecting disease genes.
Results: Computations performed by Superlink-Online SNP are automatically parallelized using novel paradigms and executed on an unlimited number of private or public CPUs. One novel service is large-scale approximate Markov chain Monte Carlo (MCMC) analysis. The accuracy of the results is reliably estimated by running the same computation on multiple CPUs and evaluating the Gelman-Rubin score to set aside unreliable results. Another service within the workflow is a novel parallelized exact algorithm for maximum-likelihood haplotyping. The reported system enables genetic analyses that were previously infeasible. We demonstrate the system's capabilities through a study of a large complex pedigree affected with metabolic syndrome.
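The abstract says that MCMC results are checked by repeating the same computation on multiple CPUs and evaluating the Gelman-Rubin score. The sketch below illustrates the textbook form of that diagnostic (the potential scale reduction factor) for several equal-length chains of a scalar estimate such as a LOD score. It is not the Superlink-Online SNP implementation: the function names `gelman_rubin` and `is_reliable` and the cutoff of 1.1 are assumptions chosen for illustration, and the abstract does not state which threshold the system uses.

```python
from math import sqrt
from statistics import mean, variance


def gelman_rubin(chains):
    """Return the Gelman-Rubin potential scale reduction factor.

    `chains` is a list of m >= 2 independent MCMC runs, each a list of
    n >= 2 draws of the same scalar quantity (hypothetical input layout).
    """
    m = len(chains)
    if m < 2:
        raise ValueError("need at least two chains")
    n = len(chains[0])
    if n < 2 or any(len(c) != n for c in chains):
        raise ValueError("chains must have equal length >= 2")

    chain_means = [mean(c) for c in chains]
    # W: average of the within-chain sample variances.
    w = mean([variance(c) for c in chains])
    if w == 0:
        raise ValueError("within-chain variance is zero")
    # B: between-chain variance, scaled by the chain length.
    b = n * variance(chain_means)
    # Pooled estimate of the variance of the target distribution.
    var_hat = (n - 1) / n * w + b / n
    return sqrt(var_hat / w)


def is_reliable(chains, threshold=1.1):
    """Flag a result as reliable when the score is below `threshold`.

    The default of 1.1 is a commonly quoted rule of thumb, assumed here.
    """
    return gelman_rubin(chains) < threshold
```

Values close to 1 suggest that the independent runs agree with one another, while clearly larger values suggest that the chains have not mixed and the estimate should be set aside.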
Pages: 197-205
Number of pages: 9
Related papers
50 in total
  • [41] LINKAGE ANALYSIS OF CARDIOVASCULAR-DISEASE RISK-FACTORS IN 3 LARGE PEDIGREES
    WEISSBECKER, KA
    BERENSON, GS
    WILSON, AF
    ELSTON, RC
    AMERICAN JOURNAL OF HUMAN GENETICS, 1993, 53 (03) : 878 - 878
  • [42] A distributed system for genetic linkage analysis
    Silberstein, Mark
    Geiger, Dan
    Schuster, Assaf
    DISTRIBUTED, HIGH-PERFORMANCE AND GRID COMPUTING IN COMPUTATIONAL BIOLOGY, PROCEEDINGS, 2007, 4360 : 110 - +
  • [43] Methods for dense SNP marker analysis for genetic linkage analysis with application to rheumatoid arthritis
    Amos, C. I.
    Chen, W. V.
    Peng, B.
    Liu, X.
    Zhu, D.
    Shete, S.
    Siminovitch, K.
    GENETIC EPIDEMIOLOGY, 2007, 31 (06) : 605 - 605
  • [44] Pairwise shared genomic segment analysis in high-risk pedigrees: application to Genetic Analysis Workshop 17 exome-sequencing SNP data
    Zheng Cai
    Stacey Knight
    Alun Thomas
    Nicola J Camp
    BMC Proceedings, 5 (Suppl 9)
  • [45] Finding starting points for Markov chain Monte Carlo analysis of genetic data from large and complex pedigrees
    Luo, YQ
    Lin, SL
    GENETIC EPIDEMIOLOGY, 2003, 25 (01) : 14 - 24
  • [46] MANAGING DATA FOR GENETIC-LINKAGE ANALYSIS
    FARRER, LA
    AMERICAN JOURNAL OF HUMAN GENETICS, 1986, 39 (01) : 146 - 147
  • [47] Population genetic analysis of ascertained SNP data.
    Nielsen R.
    Human Genomics, 1 (3) : 218 - 224
  • [48] RAGWEED SENSITIVITY - AN ANALYSIS OF 5 LARGE PEDIGREES - LINKAGE WITH HLAB, GLO AND PGM3
    MENDELL, NR
    BLUMENTHAL, MN
    AMOS, DB
    YUNIS, EJ
    DICKSTEIN, R
    ELSTON, RC
    CYTOGENETICS AND CELL GENETICS, 1979, 25 (1-4) : 184 - 184
  • [49] Genetic linkage analysis of a large family with photoparoxysmal response
    Lo, Chiening
    Shorvon, Simon
    Davis, Mary
    Houlden, Henry
    Gibbons, Vaneesha
    Wood, Nick
    EPILEPSY RESEARCH, 2012, 99 (1-2) : 38 - 45
  • [50] An exploration of heterogeneity in genetic analysis of complex pedigrees: linkage and association using whole genome sequencing data in the MAP4 region
    Shelley B Bull
    Zhijian Chen
    Kuan-Rui Tan
    Julia Poirier
    BMC Proceedings, 8 (Suppl 1)