STRScan: targeted profiling of short tandem repeats in whole-genome sequencing data

Cited by: 10
Authors
Tang, Haixu [1]
Nzabarushimana, Etienne [1]
Affiliations
[1] Indiana Univ, Sch Informat & Comp, 150 S Woodlawn Ave, Bloomington, IN 47405 USA
Source
BMC Bioinformatics | 2017 / Vol. 18
Funding
National Science Foundation (USA)
Keywords
Short tandem repeats; Whole-genome sequencing; Algorithm; DNA forensics; Personal genomes; Loci; Microsatellites
DOI
10.1186/s12859-017-1800-z
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Short tandem repeats (STRs) are found in many prokaryotic and eukaryotic genomes and are commonly used as genetic markers, in particular for identity and parental testing in DNA forensics. The unstable expansion of some STRs has been associated with various genetic disorders (e.g., Huntington's disease), and such STRs are therefore used in genetic testing to screen individuals at high risk. Traditional STR analyses were based on PCR amplification of STR loci followed by gel electrophoresis. With the availability of massive whole-genome sequencing data, it has become practical to mine STR profiles in silico from genome sequences. Software tools such as lobSTR and STR-FM have been developed to address this demand; however, they are built upon whole-genome read-mapping tools and thus may not be sensitive enough.
Results: In this paper, we present STRScan, a standalone software tool that uses a greedy algorithm for targeted STR profiling in next-generation sequencing (NGS) data. STRScan was tested on whole-genome sequencing data from Venter genome sequencing and the 1000 Genomes Project. The results showed that STRScan can profile 20% more STRs in the target set than lobSTR, recovering STRs that lobSTR misses.
Conclusion: STRScan is particularly useful for NGS-based targeted STR profiling, e.g., in genetic and human identity testing. STRScan is available as open-source software at http://darwin.informatics.indiana.edu/str/.
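To make the idea of targeted STR profiling concrete, the sketch below counts repeat copies at one known locus directly in raw reads, without whole-genome mapping. It is not STRScan's algorithm, which the abstract does not describe beyond calling it greedy; it is a simplified illustration that assumes exact matches to the flanking sequences and to the motif. The function names, the example flanks, and the motif are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def count_repeats(read: str, left_flank: str, right_flank: str,
                  motif: str) -> Optional[int]:
    """Count motif copies between two flanks in a read.

    Illustrative only: exact matching, no sequencing-error tolerance.
    Returns None if the read does not span the whole locus.
    """
    start = read.find(left_flank)
    if start == -1:
        return None
    pos = start + len(left_flank)
    copies = 0
    # Greedily consume consecutive exact copies of the motif.
    while read.startswith(motif, pos):
        copies += 1
        pos += len(motif)
    # The right flank must follow the repeat tract immediately.
    if not read.startswith(right_flank, pos):
        return None
    return copies


def profile_locus(reads: Iterable[str], left_flank: str, right_flank: str,
                  motif: str) -> Counter:
    """Tally the repeat counts observed across all spanning reads."""
    tally: Counter = Counter()
    for read in reads:
        n = count_repeats(read, left_flank, right_flank, motif)
        if n is not None:
            tally[n] += 1
    return tally


if __name__ == "__main__":
    # Hypothetical locus: flanks ACGT / TTCC, motif GATA.
    reads = [
        "ACGT" + "GATA" * 5 + "TTCC",
        "ACGT" + "GATA" * 5 + "TTCC",
        "ACGT" + "GATA" * 7 + "TTCC",
        "GGGGGGGG",  # does not cover the locus
    ]
    print(profile_locus(reads, "ACGT", "TTCC", "GATA"))
```

A real profiler would additionally need to tolerate sequencing errors and interrupted repeats, which this exact-match sketch deliberately ignores.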
Pages: 6