STRScan: targeted profiling of short tandem repeats in whole-genome sequencing data

Cited by: 10
Authors
Tang, Haixu [1]
Nzabarushimana, Etienne [1]
Affiliations
[1] Indiana Univ, Sch Informat & Comp, 150 S Woodlawn Ave, Bloomington, IN 47405 USA
Source
BMC BIOINFORMATICS | 2017 / Vol. 18
Funding
National Science Foundation (USA);
Keywords
Short tandem repeats; Whole-genome sequencing; Algorithm; DNA forensics; PERSONAL GENOMES; LOCI; MICROSATELLITES;
DOI
10.1186/s12859-017-1800-z
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Short tandem repeats (STRs) are found in many prokaryotic and eukaryotic genomes and are commonly used as genetic markers, in particular for identity and parental testing in DNA forensics. The unstable expansion of some STRs has been associated with various genetic disorders (e.g., Huntington's disease), and is therefore used in genetic testing to screen individuals at high risk. Traditional STR analyses were based on PCR amplification of STR loci followed by gel electrophoresis. With the availability of massive whole-genome sequencing data, it has become practical to mine STR profiles in silico from genome sequences. Software tools such as lobSTR and STR-FM have been developed to address this demand; however, they are built upon whole-genome read mapping tools and thus may not be sensitive enough. Results: In this paper, we present STRScan, a standalone software tool that uses a greedy algorithm for targeted STR profiling in next-generation sequencing (NGS) data. STRScan was tested on whole-genome sequencing data from the Venter genome and the 1000 Genomes Project. The results showed that STRScan can profile 20% more STRs in the target set than lobSTR, i.e., STRs that lobSTR misses. Conclusion: STRScan is particularly useful for NGS-based targeted STR profiling, e.g., in genetic and human identity testing. STRScan is available as open-source software at http://darwin.informatics.indiana.edu/str/.
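To make the idea of targeted STR profiling concrete, the following is a minimal sketch of counting repeat copies at one known locus directly from reads. It is not STRScan's algorithm, which the abstract describes only as "greedy". The function names `count_repeats` and `profile_locus`, the exact-match flank requirement, and the toy sequences are all assumptions made for illustration; a real tool would have to tolerate sequencing errors and reverse-complement reads.

```python
from collections import Counter


def count_repeats(read, left_flank, motif, right_flank):
    """Count motif copies between the two flanks in a single read.

    Hypothetical helper: returns None if the read does not span the
    whole locus (either flank missing), so partial reads are ignored.
    """
    start = read.find(left_flank)
    if start == -1:
        return None
    pos = start + len(left_flank)
    copies = 0
    # Greedily consume exact copies of the motif, left to right.
    while read.startswith(motif, pos):
        copies += 1
        pos += len(motif)
    # Require the right flank immediately after the last copy.
    if not read.startswith(right_flank, pos):
        return None
    return copies


def profile_locus(reads, left_flank, motif, right_flank):
    """Tally the repeat counts observed across all spanning reads."""
    counts = Counter()
    for read in reads:
        n = count_repeats(read, left_flank, motif, right_flank)
        if n is not None:
            counts[n] += 1
    return counts


# Toy reads (assumed data): two support 5 copies, one supports 3,
# and one does not span the locus.
reads = [
    "TTACG" + "GATA" * 5 + "CCTGA",
    "ACG" + "GATA" * 5 + "CCT",
    "AACG" + "GATA" * 3 + "CCTT",
    "GATAGATAGATA",
]
profile = profile_locus(reads, "ACG", "GATA", "CCT")
```

Only reads that contain both flanks contribute to the tally, which is why a targeted approach of this kind needs the list of loci and their flanking sequences up front rather than a whole-genome alignment.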
Pages: 6