Detecting inherited and novel structural variants in low-coverage parent-child sequencing data

Cited by: 3
Authors
Spence, Melissa [1]
Banuelos, Mario [2]
Marcia, Roummel F. [1]
Sindi, Suzanne [1]
Affiliations
[1] Univ Calif Merced, Dept Appl Math, Merced, CA 95343 USA
[2] Calif State Univ Fresno, Dept Math, Fresno, CA 93740 USA
Keywords
Sparse signal recovery; Convex optimization; Next-generation sequencing data; Structural variants; Computational genomics; HUMAN GENOME; PAIRED-END; CANCER; IMPACT; BREAST
DOI
10.1016/j.ymeth.2019.06.025
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Structural variants (SVs) are a class of genomic variation shared by members of the same species. Though relatively rare, they represent an increasingly important class of variation, as SVs have been associated with diseases and with susceptibility to some types of cancer. Common approaches to SV detection require sequencing fragments from a test genome and mapping them to a high-quality reference genome. Candidate SVs correspond to fragments with discordant mapped configurations. However, because errors in sequencing and mapping also create discordant arrangements, many of these predictions will be spurious. When sequencing coverage is low, distinguishing true SVs from errors is even more challenging. In recent work, we developed SV detection methods that exploit genome information from closely related individuals, specifically parents and children. Our previous approaches assumed that any SV present in a child's genome must have come from one of the parents. However, this strict restriction may cause rare but novel variants present only in the child to be missed. In this work, we generalize our previous approaches to allow the child to carry novel variants. We consider a constrained optimization approach in which each variant in the child is one of two types: inherited, and therefore necessarily present in a parent, or novel. For simplicity, we consider only a single parent and a single child, each of which has a haploid genome. However, even in this restricted case, our approach has the power to improve variant prediction. We present results on simulated candidate variant regions, parent-child trios from the 1000 Genomes Project, and a subset of the 17 Platinum Genomes.
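The inherited-versus-novel constraint described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract describes a constrained optimization approach, whereas the sketch below simply enumerates the feasible indicator combinations at a single candidate site. The Poisson observation model, the function name `call_site`, and the parameters `coverage`, `error`, `tau`, and `gamma` are all assumptions introduced here for illustration.

```python
import math

# Feasible (parent, child_inherited, child_novel) indicator triples for one
# candidate site under the haploid, one-parent, one-child setting:
#   child_inherited <= parent             (an inherited SV must be in the parent)
#   child_inherited + child_novel <= 1    (a child SV is inherited or novel, not both)
FEASIBLE = [(p, i, n)
            for p in (0, 1) for i in (0, 1) for n in (0, 1)
            if i <= p and i + n <= 1]


def poisson_nll(count, mean):
    """Negative Poisson log-likelihood, dropping the constant log(count!)."""
    return mean - count * math.log(mean)


def call_site(y_parent, y_child, coverage=4.0, error=0.1, tau=1.0, gamma=3.0):
    """Pick the feasible triple with the lowest penalized cost.

    y_parent, y_child: counts of discordant fragments supporting the site.
    coverage: assumed mean count when the SV is present (hypothetical value).
    error: assumed mean count from sequencing/mapping errors (hypothetical).
    tau, gamma: sparsity penalties; gamma > tau encodes the assumption that
    novel variants are rarer than inherited ones.
    """
    best, best_cost = None, math.inf
    for p, i, n in FEASIBLE:
        mean_parent = coverage * p + error
        mean_child = coverage * (i + n) + error
        cost = (poisson_nll(y_parent, mean_parent)
                + poisson_nll(y_child, mean_child)
                + tau * (p + i) + gamma * n)
        if cost < best_cost:
            best, best_cost = (p, i, n), cost
    return best


print(call_site(5, 4))  # support in both genomes -> (1, 1, 0), inherited
print(call_site(0, 5))  # strong support in the child only -> (0, 0, 1), novel
print(call_site(0, 1))  # one stray child fragment -> (0, 0, 0), treated as error
```

Under these assumed parameters, a single discordant fragment seen only in the child is attributed to error, while strong child-only support is called novel despite the heavier penalty. The enumeration works only because each site has five feasible configurations; a continuous relaxation over many sites, as the abstract implies, would need an actual optimization solver.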
Pages: 61-68
Page count: 8