Reconstruction of viral population structure from next-generation sequencing data using multicommodity flows

Cited by: 18
Authors
Skums, Pavel [1]
Mancuso, Nicholas [2]
Artyomenko, Alexander [2]
Tork, Bassam [2]
Mandoiu, Ion [3]
Khudyakov, Yury [1]
Zelikovsky, Alex [2]
Affiliations
[1] Ctr Dis Control & Prevent, Lab Mol Epidemiol & Bioinformat, Div Viral Hepatitis, Atlanta, GA 30333 USA
[2] Georgia State Univ, Dept Comp Sci, Atlanta, GA 30303 USA
[3] Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06269 USA
Source
BMC Bioinformatics | 2013 / Vol. 14
Funding
USDA National Institute of Food and Agriculture; US National Science Foundation
Acknowledgments
PS and YK were supported intramurally by the Centers for Disease Control and Prevention. NM, BT, IM and AZ were supported in part by Agriculture and Food Research Initiative Competitive Grant no. 2011-67016-30331 from the USDA National Institute of Food and Agriculture and by the Life Technology Grant "Viral Metagenome Reconstruction Software for Ion Torrent PGM Sequencer". NM, AA, BT and AZ were supported in part by NSF award IIS-0916401. IM was supported in part by NSF award IIS-0916948. NM and BT were supported in part by a Molecular Basis of Disease Fellowship, Georgia State University. The authors thank the referees for valuable comments, which helped to significantly improve the paper.
DOI
10.1186/1471-2105-14-S9-S2
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Highly mutable RNA viruses exist in infected hosts as heterogeneous populations of genetically close variants known as quasispecies. Next-generation sequencing (NGS) allows a large number of viral sequences from infected patients to be analysed, presenting a novel opportunity to study the structure of a viral population and to understand virus evolution, drug resistance and immune escape. Accurate reconstruction of the genetic composition of intra-host viral populations involves assembling NGS short reads into whole-genome sequences and estimating the frequencies of individual viral variants. Although a few approaches have been developed for this task, accurate reconstruction of quasispecies populations remains largely unresolved.
Results: Two new methods, AmpMCF and ShotMCF, were developed for reconstructing whole-genome intra-host viral variants and estimating their frequencies, both based on Multicommodity Flows (MCFs). AmpMCF was designed for NGS reads obtained from individual PCR amplicons, and ShotMCF for NGS shotgun reads. AmpMCF, based on a covering formulation, identifies a minimal set of quasispecies explaining all observed reads, whereas ShotMCF, based on a packing formulation, engages the maximal number of reads to generate the most probable set of quasispecies. Both methods were evaluated on simulated data against Maximum Bandwidth and ViSpA, previously developed state-of-the-art algorithms for estimating quasispecies spectra from NGS amplicon and shotgun reads, respectively. Both algorithms were accurate in estimating quasispecies frequencies, especially from large datasets.
Conclusions: The problem of viral population reconstruction from amplicon or shotgun NGS reads was solved using the MCF formulation. The two methods developed here, ShotMCF and AmpMCF, afford accurate reconstruction of the structure of intra-host viral populations from NGS reads.
The implementations of the algorithms are available at http://alan.cs.gsu.edu/vira.html (AmpMCF) and http://alan.cs.gsu.edu/NGS/?q=content/shotmcf (ShotMCF).
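The covering idea mentioned in the abstract (a small set of variants that explains every observed read) can be illustrated with a toy sketch. This is not the authors' multicommodity-flow formulation, which this record does not detail; it swaps in a plain greedy set-cover heuristic, which approximates but does not guarantee a minimum set. The names `explains` and `greedy_cover`, the exact-match rule, and the sample sequences are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical read representation: (0-based start position, read sequence).
Read = Tuple[int, str]


def explains(variant: str, read: Read) -> bool:
    """Return True if the read matches the variant exactly at its start position."""
    start, seq = read
    return variant[start:start + len(seq)] == seq


def greedy_cover(candidates: List[str], reads: List[Read]) -> List[str]:
    """Greedy set cover: repeatedly pick the candidate explaining the most uncovered reads.

    Reads that no candidate explains are ignored, so the loop always terminates.
    """
    uncovered: Set[int] = {
        i for i, r in enumerate(reads) if any(explains(v, r) for v in candidates)
    }
    chosen: List[str] = []
    while uncovered:
        best = max(
            candidates,
            key=lambda v: sum(explains(v, reads[i]) for i in uncovered),
        )
        chosen.append(best)
        uncovered -= {i for i in uncovered if explains(best, reads[i])}
    return chosen


# Made-up toy data: three candidate variants and four short reads.
candidates = ["ACGTAC", "ACTTAC", "ACGTTC"]
reads = [(0, "ACG"), (2, "GTA"), (1, "CTT"), (3, "TAC")]
chosen = greedy_cover(candidates, reads)
```

In this toy input the third candidate is never selected, because every read it explains is already explained by the first one.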
Pages: 13