Efficient detection of viral transmissions with Next-Generation Sequencing data

Cited by: 9
Authors
Rytsareva, Inna [1]
Campo, David S. [1]
Zheng, Yueli [1]
Sims, Seth [1]
Thankachan, Sharma V. [2, 3]
Tetik, Cansu [2]
Chirag, Jain [2]
Chockalingam, Sriram P. [4]
Sue, Amanda [1]
Aluru, Srinivas [2, 4]
Khudyakov, Yury [1]
Affiliations
[1] Ctr Dis Control & Prevent, Div Viral Hepatitis, Mol Epidemiol & Bioinformat, Atlanta, GA 30333 USA
[2] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA
[3] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
[4] Georgia Inst Technol, Inst Data Engn & Sci, Atlanta, GA 30332 USA
Source
BMC GENOMICS | 2017 / Volume 18
Keywords
C VIRUS OUTBREAK; MOLECULAR EPIDEMIOLOGY; HEPATITIS
DOI
10.1186/s12864-017-3732-4
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Hepatitis C is a major public health problem in the United States and worldwide. Outbreaks of hepatitis C virus (HCV) infections associated with unsafe injection practices, drug diversion, and other exposures to blood are difficult to detect and investigate. Molecular analysis is frequently used to study HCV outbreaks and transmission chains: a cluster of sequences is identified as linked by transmission if their genetic distances fall below a previously defined threshold. However, HCV exists as a population of numerous variants in each infected individual, and minority variants in the source are often the ones responsible for transmission. This precludes the use of a single sequence per individual, because many such transmissions would be missed. Next-Generation Sequencing greatly increases the sensitivity of transmission detection but brings a considerable computational challenge, because all sequences need to be compared among all pairs of samples.
Methods: We developed a three-step strategy that filters pairs of samples according to different criteria: (i) a k-mer Bloom filter, (ii) a Levenshtein filter, and (iii) a filter of identical sequences. We applied these three filters to a set of samples that cover the spectrum of genetic relationships among HCV cases, from being part of the same transmission cluster to belonging to different subtypes.
Results: Our three-step filtering strategy rapidly removes 85.1% of all pairwise sample comparisons and 91.0% of all pairwise sequence comparisons, accurately establishing which pairs of HCV samples are below the relatedness threshold.
Conclusions: We present a fast and efficient three-step filtering strategy that removes most sequence comparisons and accurately establishes transmission links for any threshold-based method. This highly efficient workflow will allow a faster response and greater molecular detection capacity, improving the rate of detection of viral transmissions with molecular data.
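The abstract names the three filters but does not specify their criteria, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes a pair of samples counts as linked when at least one pair of sequences lies within an edit-distance limit. The names `BloomFilter`, `may_share_kmer`, `levenshtein_within` and `samples_linked`, the filter sizes, and the order in which the checks run are all hypothetical choices made for this example.

```python
import hashlib
from itertools import product


class BloomFilter:
    """Tiny Bloom filter over strings (sizes are arbitrary, for illustration)."""

    def __init__(self, n_bits=1 << 16, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8 + 1)

    def _positions(self, item):
        # Derive n_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(seq, k):
    """Yield every substring of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def may_share_kmer(sample_a, sample_b, k):
    """False means the samples certainly share no k-mer (no false negatives)."""
    bloom = BloomFilter()
    for seq in sample_a:
        for kmer in kmers(seq, k):
            bloom.add(kmer)
    return any(kmer in bloom for seq in sample_b for kmer in kmers(seq, k))


def levenshtein_within(a, b, limit):
    """True if the edit distance between a and b is at most limit."""
    if abs(len(a) - len(b)) > limit:
        return False
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        if min(cur) > limit:  # row minima never decrease, so stop early
            return False
        prev = cur
    return prev[-1] <= limit


def samples_linked(sample_a, sample_b, k, limit):
    """Assumed rule: linked if any sequence pair is within `limit` edits."""
    if set(sample_a) & set(sample_b):  # identical sequences: accept at once
        return True
    if not may_share_kmer(sample_a, sample_b, k):  # cheap rejection
        return False
    return any(levenshtein_within(x, y, limit)
               for x, y in product(sample_a, sample_b))
```

The k-mer rejection is only safe when k is small enough: two sequences within `limit` edits must share an unedited block of length at least `len(seq) // (limit + 1)`, so k should not exceed that value for the shortest sequence. For example, with 12-base sequences and `limit=1`, `samples_linked(a, b, k=6, limit=1)` cannot discard a truly linked pair.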
Pages: 7