Optimized quantification of intra-host viral diversity in SARS-CoV-2 and influenza virus sequence data

Cited by: 9
Authors
Roder, A. E. [1]
Johnson, K. E. E. [1,2]
Knoll, M. [2]
Khalfan, M. [2]
Wang, B. [2]
Schultz-Cherry, S. [3]
Banakis, S. [1]
Kreitman, A. [1]
Mederos, C. [1]
Youn, J.-H. [4]
Mercado, R. [4]
Wang, W. [1]
Chung, M. [1]
Ruchnewitz, D. [5]
Samanovic, M. I. [6]
Mulligan, M. J. [6]
Laessig, M. [5]
Luksza, M. [7]
Das, S. [4]
Gresham, D. [2]
Ghedin, E. [1,2]
Affiliations
[1] NIAID, Syst Genom Sect, Lab Parasit Dis, DIR,NIH, Bethesda, MD 20892 USA
[2] NYU, Ctr Genom & Syst Biol, Dept Biol, New York, NY 10012 USA
[3] St Jude Childrens Res Hosp, Dept Infect Dis, Memphis, TN USA
[4] NIH, Dept Lab Med, Bethesda, MD USA
[5] Univ Cologne, Inst Biol Phys, Cologne, Germany
[6] NYU, Langone Vaccine Ctr, Dept Med, New York, NY USA
[7] Icahn Sch Med Mt Sinai, Dept Oncol Sci, New York, NY USA
Source
MBIO | 2023 / Volume 14 / Issue 04
Keywords
SARS-CoV-2; influenza; genomics; bioinformatics; RNA; SELECTION; EVOLUTION; MUTATION; CANCER;
DOI
10.1128/mbio.01046-23
Chinese Library Classification number
Q93 [Microbiology];
Discipline classification code
071005; 100705;
Abstract
High error rates of viral RNA-dependent RNA polymerases lead to diverse intra-host viral populations during infection. Errors made during replication that are not strongly deleterious to the virus can lead to the generation of minority variants. However, accurate detection of minority variants in viral sequence data is complicated by errors introduced during sample preparation and data analysis. We used synthetic RNA controls and simulated data to test seven variant-calling tools across a range of allele frequencies and simulated coverages. We show that choice of variant caller and use of replicate sequencing have the most significant impact on single-nucleotide variant (SNV) discovery and demonstrate how both allele frequency and coverage thresholds impact both false discovery and false-negative rates. When replicates are not available, using a combination of multiple callers with more stringent cutoffs is recommended. We use these parameters to find minority variants in sequencing data from SARS-CoV-2 clinical specimens and provide guidance for studies of intra-host viral diversity using either single replicate data or data from technical replicates. Our study provides a framework for rigorous assessment of technical factors that impact SNV identification in viral samples and establishes heuristics that will inform and improve future studies of intra-host variation, viral diversity, and viral evolution.
IMPORTANCE: When viruses replicate inside a host cell, the virus replication machinery makes mistakes. Over time, these mistakes create mutations that result in a diverse population of viruses inside the host. Mutations that are neither lethal to the virus nor strongly beneficial can lead to minority variants that are minor members of the virus population. However, preparing samples for sequencing can also introduce errors that resemble minority variants, resulting in the inclusion of false-positive data if not filtered correctly. In this study, we aimed to determine the best methods for identification and quantification of these minority variants by testing the performance of seven commonly used variant-calling tools. We used simulated and synthetic data to test their performance against a true set of variants and then used these studies to inform variant identification in data from SARS-CoV-2 clinical specimens. Together, analyses of our data provide extensive guidance for future studies of viral diversity and evolution.
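The filtering strategy summarized in the abstract (allele frequency and coverage thresholds, combined with agreement across technical replicates or across multiple callers) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the `SNVCall` record, the function names, and the default cutoffs (`min_af=0.03`, `min_depth=200`, `min_support=2`) are hypothetical placeholders and are not values taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class SNVCall:
    """One minority-variant call (hypothetical record layout)."""
    position: int       # 1-based genome coordinate
    ref: str            # reference base
    alt: str            # alternate base
    allele_freq: float  # fraction of reads supporting alt
    depth: int          # read coverage at the site


def passes_thresholds(call: SNVCall, min_af: float, min_depth: int) -> bool:
    """Keep a call only if both allele frequency and coverage are high enough."""
    return call.allele_freq >= min_af and call.depth >= min_depth


def consensus_snvs(
    call_sets: Iterable[Iterable[SNVCall]],
    min_af: float = 0.03,    # placeholder cutoff, not from the paper
    min_depth: int = 200,    # placeholder cutoff, not from the paper
    min_support: int = 2,    # number of replicates/callers that must agree
) -> List[Tuple[int, str, str]]:
    """Return (position, ref, alt) keys that pass the thresholds in at
    least `min_support` call sets. Each call set may represent one
    technical replicate or the output of one variant caller."""
    support = {}
    for calls in call_sets:
        # Use a set so a site counts at most once per call set.
        keys = {
            (c.position, c.ref, c.alt)
            for c in calls
            if passes_thresholds(c, min_af, min_depth)
        }
        for key in keys:
            support[key] = support.get(key, 0) + 1
    return sorted(k for k, n in support.items() if n >= min_support)


replicate_1 = [
    SNVCall(100, "A", "G", 0.05, 500),
    SNVCall(200, "C", "T", 0.01, 800),   # fails the frequency cutoff
    SNVCall(300, "G", "A", 0.10, 50),    # fails the coverage cutoff
]
replicate_2 = [
    SNVCall(100, "A", "G", 0.06, 450),
    SNVCall(400, "T", "C", 0.20, 1000),  # seen in one replicate only
]
shared = consensus_snvs([replicate_1, replicate_2])
```

Requiring agreement between call sets trades sensitivity for a lower false discovery rate; with a single replicate, the same function could be given the outputs of several callers together with stricter cutoffs, in line with the recommendation in the abstract.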
Pages: 17