Detecting inherited and novel structural variants in low-coverage parent-child sequencing data

Cited by: 3
Authors
Spence, Melissa [1]
Banuelos, Mario [2]
Marcia, Roummel F. [1]
Sindi, Suzanne [1]
Affiliations
[1] Univ Calif Merced, Dept Appl Math, Merced, CA 95343 USA
[2] Calif State Univ Fresno, Dept Math, Fresno, CA 93740 USA
Keywords
Sparse signal recovery; Convex optimization; Next-generation sequencing data; Structural variants; Computational genomics; HUMAN GENOME; PAIRED-END; CANCER; IMPACT; BREAST;
DOI
10.1016/j.ymeth.2019.06.025
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Structural variants (SVs) are a class of genomic variation shared by members of the same species. Though relatively rare, they represent an increasingly important class of variation, as SVs have been associated with diseases and susceptibility to some types of cancer. Common approaches to SV detection require the sequencing and mapping of fragments from a test genome to a high-quality reference genome. Candidate SVs correspond to fragments with discordant mapped configurations. However, because errors in sequencing and mapping also create discordant arrangements, many of these predictions will be spurious. When sequencing coverage is low, distinguishing true SVs from errors is even more challenging. In recent work, we developed SV detection methods that exploit genome information from closely related individuals, namely parents and children. Our previous approaches assumed that any SV present in a child's genome must have come from one of the child's parents. However, this strict restriction may have caused rare but novel variants present only in the child to go unpredicted. In this work, we generalize our previous approaches to allow the child to carry novel variants. We consider a constrained optimization approach in which variants in the child are of two types: inherited, and therefore necessarily present in a parent, or novel. For simplicity, we consider only a single parent and a single child, each of which has a haploid genome. However, even in this restricted case, our approach has the power to improve variant prediction. We present results on simulated candidate variant regions, parent-child trios from the 1000 Genomes Project, and a subset of the 17 Platinum Genomes.
Pages: 61-68
Number of pages: 8