Optimized quantification of intra-host viral diversity in SARS-CoV-2 and influenza virus sequence data

Cited by: 9
Authors
Roder, A. E. [1]
Johnson, K. E. E. [1,2]
Knoll, M. [2]
Khalfan, M. [2]
Wang, B. [2]
Schultz-Cherry, S. [3]
Banakis, S. [1]
Kreitman, A. [1]
Mederos, C. [1]
Youn, J.-H. [4]
Mercado, R. [4]
Wang, W. [1]
Chung, M. [1]
Ruchnewitz, D. [5]
Samanovic, M. I. [6]
Mulligan, M. J. [6]
Laessig, M. [5]
Luksza, M. [7]
Das, S. [4]
Gresham, D. [2]
Ghedin, E. [1,2]
Affiliations
[1] NIAID, Syst Genom Sect, Lab Parasit Dis, DIR,NIH, Bethesda, MD 20892 USA
[2] NYU, Ctr Genom & Syst Biol, Dept Biol, New York, NY 10012 USA
[3] St Jude Childrens Res Hosp, Dept Infect Dis, Memphis, TN USA
[4] NIH, Dept Lab Med, Bethesda, MD USA
[5] Univ Cologne, Inst Biol Phys, Cologne, Germany
[6] NYU, Langone Vaccine Ctr, Dept Med, New York, NY USA
[7] Icahn Sch Med Mt Sinai, Dept Oncol Sci, New York, NY USA
Source
MBIO | 2023 / Volume 14 / Issue 04
Keywords
SARS-CoV-2; influenza; genomics; bioinformatics; RNA; SELECTION; EVOLUTION; MUTATION; CANCER;
DOI
10.1128/mbio.01046-23
Chinese Library Classification number
Q93 [Microbiology];
Discipline classification code
071005 ; 100705 ;
Abstract
High error rates of viral RNA-dependent RNA polymerases lead to diverse intra-host viral populations during infection. Errors made during replication that are not strongly deleterious to the virus can lead to the generation of minority variants. However, accurate detection of minority variants in viral sequence data is complicated by errors introduced during sample preparation and data analysis. We used synthetic RNA controls and simulated data to test seven variant-calling tools across a range of allele frequencies and simulated coverages. We show that choice of variant caller and use of replicate sequencing have the most significant impact on single-nucleotide variant (SNV) discovery and demonstrate how both allele frequency and coverage thresholds impact both false discovery and false-negative rates. When replicates are not available, using a combination of multiple callers with more stringent cutoffs is recommended. We use these parameters to find minority variants in sequencing data from SARS-CoV-2 clinical specimens and provide guidance for studies of intra-host viral diversity using either single replicate data or data from technical replicates. Our study provides a framework for rigorous assessment of technical factors that impact SNV identification in viral samples and establishes heuristics that will inform and improve future studies of intra-host variation, viral diversity, and viral evolution.
IMPORTANCE: When viruses replicate inside a host cell, the virus replication machinery makes mistakes. Over time, these mistakes create mutations that result in a diverse population of viruses inside the host. Mutations that are neither lethal to the virus nor strongly beneficial can lead to minority variants that are minor members of the virus population. However, preparing samples for sequencing can also introduce errors that resemble minority variants, resulting in the inclusion of false-positive data if not filtered correctly. In this study, we aimed to determine the best methods for identification and quantification of these minority variants by testing the performance of seven commonly used variant-calling tools. We used simulated and synthetic data to test their performance against a true set of variants and then used these studies to inform variant identification in data from SARS-CoV-2 clinical specimens. Together, analyses of our data provide extensive guidance for future studies of viral diversity and evolution.
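The abstract describes filtering SNV calls by allele frequency and coverage, combining several callers when replicates are unavailable, and scoring the result against a known truth set by false discovery and false-negative rates. The following is a minimal Python sketch of that general idea, not the authors' pipeline. The `Call` record, the function names, the toy data, and the cutoffs (1% allele frequency, 200x coverage) are all hypothetical values chosen for illustration; the abstract does not state specific thresholds.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One SNV call from one caller or replicate (hypothetical record layout)."""
    pos: int
    ref: str
    alt: str
    depth: int      # total reads covering the position
    alt_reads: int  # reads supporting the alternate allele

    @property
    def key(self):
        return (self.pos, self.ref, self.alt)

    @property
    def freq(self):
        return self.alt_reads / self.depth if self.depth else 0.0


def passing(calls, min_freq, min_depth):
    """Keys of calls that meet both the allele-frequency and coverage cutoffs."""
    return {c.key for c in calls if c.depth >= min_depth and c.freq >= min_freq}


def consensus(call_sets, min_freq, min_depth, min_support):
    """Variants that pass the cutoffs in at least `min_support` call sets.

    A call set can be the output of one caller or of one technical replicate.
    """
    counts = Counter()
    for calls in call_sets:
        counts.update(passing(calls, min_freq, min_depth))
    return {key for key, n in counts.items() if n >= min_support}


def error_rates(called, truth):
    """Return (false discovery rate, false-negative rate) against a truth set."""
    tp = len(called & truth)
    fp = len(called - truth)
    fn = len(truth - called)
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    fnr = fn / (tp + fn) if (tp + fn) else 0.0
    return fdr, fnr


# Toy data (made up): three true variants, two callers with different errors.
truth = {(100, "A", "G"), (200, "C", "T"), (300, "G", "A")}
caller_a = [
    Call(100, "A", "G", 1000, 50),
    Call(200, "C", "T", 1000, 20),
    Call(300, "G", "A", 150, 9),    # true variant, but coverage is too low
    Call(400, "T", "C", 1000, 12),  # false positive unique to caller A
]
caller_b = [
    Call(100, "A", "G", 1000, 48),
    Call(200, "C", "T", 1000, 22),
    Call(500, "A", "T", 1000, 15),  # false positive unique to caller B
]

# Illustrative cutoffs only; they are assumptions, not values from the paper.
MIN_FREQ, MIN_DEPTH = 0.01, 200

union = consensus([caller_a, caller_b], MIN_FREQ, MIN_DEPTH, min_support=1)
strict = consensus([caller_a, caller_b], MIN_FREQ, MIN_DEPTH, min_support=2)

print("any caller:  ", error_rates(union, truth))
print("both callers:", error_rates(strict, truth))
```

In this toy data, requiring agreement between both callers removes the two caller-specific false positives, while the true variant at the low-coverage site is missed either way. That is the trade-off between false discovery and false negatives that stricter cutoffs control.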
Pages: 17
Related papers
50 in total
  • [11] Multi-Organ Spread and Intra-Host Diversity of SARS-CoV-2 Support Viral Persistence, Adaptation, and a Mechanism That Increases Evolvability
    Manrique, Julieta M.
    Maffia-Bizzozero, Santiago
    Delpino, M. Victoria
    Quarleri, Jorge
    Jones, Leandro R.
    JOURNAL OF MEDICAL VIROLOGY, 2024, 96 (12)
  • [12] Atypical Prolonged Viral Shedding With Intra-Host SARS-CoV-2 Evolution in a Mildly Affected Symptomatic Patient
    Cunha, Marielton dos Passos
    Vilela, Ana Paula Pessoa
    Molina, Camila Vieira
    Acuna, Stephanie Maia
    Muxel, Sandra Marcia
    Barroso, Vinicius de Morais
    Baroni, Sabrina
    Gomes de Oliveira, Lilian
    Angelo, Yan de Souza
    Peron, Jean Pierre Schatzmann
    Goes, Luiz Gustavo Bentim
    Campos, Angelica Cristine de Almeida
    Minoprio, Paola
    FRONTIERS IN MEDICINE, 2021, 8
  • [13] Refining SARS-CoV-2 intra-host variation by leveraging large-scale sequencing data
    Mostefai, Fatima
    Grenier, Jean-Christophe
    Poujol, Raphael
    Hussin, Julie
    NAR GENOMICS AND BIOINFORMATICS, 2024, 6 (04)
  • [14] Host Viral Load During Triple Coinfection of SARS-CoV-2, Influenza Virus, and Syncytial Virus
    Taye, Mesfin Asfaw
    CONTEMPORARY MATHEMATICS, 2023, 4 (03) : 392 - 410
  • [15] Two-step fitness selection for intra-host variations in SARS-CoV-2
    Li, Jiarui
    Du, Pengcheng
    Yang, Lijiang
    Zhang, Ju
    Song, Chuan
    Chen, Danying
    Song, Yangzi
    Ding, Nan
    Hua, Mingxi
    Han, Kai
    Song, Rui
    Xie, Wen
    Chen, Zhihai
    Wang, Xianbo
    Liu, Jingyuan
    Xu, Yanli
    Gao, Guiju
    Wang, Qi
    Pu, Lin
    Di, Lin
    Li, Jie
    Yue, Jinglin
    Han, Junyan
    Zhao, Xuesen
    Yan, Yonghong
    Yu, Fengting
    Wu, Angela R.
    Zhang, Fujie
    Gao, Yi Qin
    Huang, Yanyi
    Wang, Jianbin
    Zeng, Hui
    Chen, Chen
    CELL REPORTS, 2022, 38 (02)
  • [16] Spatio-temporal dynamics of intra-host variability in SARS-CoV-2 genomes
    Pathak, Ankit K.
    Mishra, Gyan Prakash
    Uppili, Bharathram
    Walia, Safal
    Fatihi, Saman
    Abbas, Tahseen
    Banu, Sofia
    Ghosh, Arup
    Kanampalliwar, Amol
    Jha, Atimukta
    Fatma, Sana
    Aggarwal, Shifu
    Dhar, Mahesh Shanker
    Marwal, Robin
    Radhakrishnan, Venkatraman Srinivasan
    Ponnusamy, Kalaiarasan
    Kabra, Sandhya
    Rakshit, Partha
    Bhoyar, Rahul C.
    Jain, Abhinav
    Divakar, Mohit Kumar
    Imran, Mohamed
    Faruq, Mohammed
    Sowpati, Divya Tej
    Thukral, Lipi
    Raghav, Sunil K.
    Mukerji, Mitali
    NUCLEIC ACIDS RESEARCH, 2022, 50 (03) : 1551 - 1561
  • [17] SARS-CoV-2 intra-host evolution during prolonged infection in an immunocompromised patient
    Quaranta, Erika Giorgia
    Fusaro, Alice
    Giussani, Edoardo
    D'Amico, Valeria
    Varotto, Maria
    Pagliari, Matteo
    Giordani, Maria Teresa
    Zoppelletto, Maira
    Merola, Francesca
    Antico, Antonio
    Stefanelli, Paola
    Terregino, Calogero
    Monne, Isabella
    INTERNATIONAL JOURNAL OF INFECTIOUS DISEASES, 2022, 122 : 444 - 448
  • [18] Rapid intra-host diversification and evolution of SARS-CoV-2 in advanced HIV infection
    Ko, Sung Hee
    Radecki, Pierce
    Belinky, Frida
    Bhiman, Jinal N.
    Meiring, Susan
    Kleynhans, Jackie
    Amoako, Daniel
    Canedo, Vanessa Guerra
    Lucas, Margaret
    Kekana, Dikeledi
    Martinson, Neil
    Lebina, Limakatso
    Everatt, Josie
    Tempia, Stefano
    Bylund, Tatsiana
    Rawi, Reda
    Kwong, Peter D.
    Wolter, Nicole
    von Gottberg, Anne
    Cohen, Cheryl
    Boritz, Eli A.
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [19] ACoRE: Accurate SARS-CoV-2 genome reconstruction for the characterization of intra-host and inter-host viral diversity in clinical samples and for the evaluation of re-infections
    Marcolungo, Luca
    Beltrami, Cristina
    Degli Esposti, Chiara
    Lopatriello, Giulia
    Piubelli, Chiara
    Mori, Antonio
    Pomari, Elena
    Deiana, Michela
    Scarso, Salvatore
    Bisoffi, Zeno
    Grosso, Valentina
    Cosentino, Emanuela
    Maestri, Simone
    Lavezzari, Denise
    Iadarola, Barbara
    Paterno, Marta
    Segala, Elena
    Giovannone, Barbara
    Gallinaro, Martina
    Rossato, Marzia
    Delledonne, Massimo
    GENOMICS, 2021, 113 (04) : 1628 - 1638
  • [20] Persistent SARS-CoV-2 Infection in a Patient With Non-Hodgkin Lymphoma: Intra-Host Genomic Diversity Analysis
    Bianco, Angelica
    Capozzi, Loredana
    Del Sambro, Laura
    Simone, Domenico
    Pace, Lorenzo
    Rondinone, Valeria
    Difato, Laura M.
    Miccolupo, Angela
    Manzari, Caterina
    Fedele, Alberto
    Parisi, Antonio
    FRONTIERS IN VIROLOGY, 2022, 2