Reconstruction of viral population structure from next-generation sequencing data using multicommodity flows

Cited by: 18
Authors
Skums, Pavel [1]
Mancuso, Nicholas [2]
Artyomenko, Alexander [2]
Tork, Bassam [2]
Mandoiu, Ion [3]
Khudyakov, Yury [1]
Zelikovsky, Alex [2]
Affiliations
[1] Ctr Dis Control & Prevent, Lab Mol Epidemiol & Bioinformat, Div Viral Hepatitis, Atlanta, GA 30333 USA
[2] Georgia State Univ, Dept Comp Sci, Atlanta, GA 30303 USA
[3] Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06269 USA
Source
BMC Bioinformatics | 2013 / Volume 14
Funding
USDA National Institute of Food and Agriculture; U.S. National Science Foundation
Acknowledgments
PS and YK were supported intramurally by the Centers for Disease Control and Prevention. NM, BT, IM and AZ were supported in part by Agriculture and Food Research Initiative Competitive Grant no. 201167016-30331 from the USDA National Institute of Food and Agriculture and by the Life Technology Grant "Viral Metagenome Reconstruction Software for Ion Torrent PGM Sequencer". NM, AA, BT and AZ were supported in part by NSF award IIS-0916401. IM was supported in part by NSF award IIS-0916948. NM and BT were supported in part by the Molecular Basis of Disease Fellowship, Georgia State University. The authors thank the referees for valuable comments, which helped to significantly improve the paper.
DOI
10.1186/1471-2105-14-S9-S2
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Highly mutable RNA viruses exist in infected hosts as heterogeneous populations of genetically close variants known as quasispecies. Next-generation sequencing (NGS) allows a large number of viral sequences from infected patients to be analysed, presenting a novel opportunity for studying the structure of a viral population and understanding virus evolution, drug resistance and immune escape. Accurate reconstruction of the genetic composition of intra-host viral populations involves assembling the NGS short reads into whole-genome sequences and estimating the frequencies of individual viral variants. Although a few approaches have been developed for this task, accurate reconstruction of quasispecies populations remains largely unresolved.
Results: Two new methods, AmpMCF and ShotMCF, were developed for reconstruction of whole-genome intra-host viral variants and estimation of their frequencies, both based on Multicommodity Flows (MCFs). AmpMCF was designed for NGS reads obtained from individual PCR amplicons and ShotMCF for NGS shotgun reads. AmpMCF, based on a covering formulation, identifies a minimal set of quasispecies explaining all observed reads, whereas ShotMCF, based on a packing formulation, engages the maximal number of reads to generate the most probable set of quasispecies. Both methods were evaluated on simulated data in comparison to Maximum Bandwidth and ViSpA, previously developed state-of-the-art algorithms for estimating quasispecies spectra from NGS amplicon and shotgun reads, respectively. Both algorithms were accurate in estimating quasispecies frequencies, especially from large datasets.
Conclusions: The problem of viral population reconstruction from amplicon or shotgun NGS reads was solved using the MCF formulation. The two methods developed here, ShotMCF and AmpMCF, afford accurate reconstruction of the structure of intra-host viral populations from NGS reads.
The implementations of the algorithms are available at http://alan.cs.gsu.edu/vira.html (AmpMCF) and http://alan.cs.gsu.edu/NGS/?q=content/shotmcf (ShotMCF).
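To make the covering idea in the abstract concrete, the following is a minimal sketch of choosing a small set of candidate variants that together explain every observed read. It uses a plain greedy set-cover heuristic, not the multicommodity-flow formulation of AmpMCF, and greedy selection is not guaranteed to find a truly minimal set. The function name `greedy_cover`, the read identifiers, and the candidate variants are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def greedy_cover(reads: Set[str], candidates: Dict[str, Set[str]]) -> List[str]:
    """Greedy set cover: pick candidate variants until every read is explained.

    Illustrative stand-in only; this is NOT the MCF-based method of the paper.
    """
    uncovered = set(reads)
    chosen: List[str] = []
    while uncovered:
        # Take the candidate that explains the most still-uncovered reads.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("some reads are not explained by any candidate variant")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical toy input: which reads are consistent with which candidate variant.
reads = {"r1", "r2", "r3", "r4", "r5"}
candidates = {
    "variant_A": {"r1", "r2", "r3"},
    "variant_B": {"r3", "r4"},
    "variant_C": {"r4", "r5"},
    "variant_D": {"r2"},
}

selected = greedy_cover(reads, candidates)
print(selected)
```

On this toy input two candidates suffice to explain all five reads. A packing-style formulation, as described for ShotMCF, would instead ask how many reads can be assigned to a set of candidates, rather than requiring that every read be covered.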
Pages: 13