Haplotype-aware Variant Selection for Genome Graphs

Cited by: 1
Authors
Tavakoli, Neda [1]
Gibney, Daniel [1]
Aluru, Srinivas [1]
Affiliations
[1] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA
Funding
National Science Foundation (USA)
Keywords
Variation graphs; variant selection; haplotype-aware; SNPs; ILP-based optimization; FRAMEWORK; ALGORITHM; ALIGNMENT;
DOI
10.1145/3535508.3545556
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Graph-based genome representations have proven to be a powerful tool in genomic analysis due to their ability to encode variations found in multiple haplotypes and capture population genetic diversity. Such graphs also unavoidably contain paths that switch between haplotypes (i.e., recombinant paths) and thus do not fully match any of the constituent haplotypes. The number of such recombinant paths increases combinatorially with path length and causes inefficiencies and false positives when mapping reads. In this paper, we study the problem of finding reduced haplotype-aware genome graphs that incorporate only a selected subset of variants, yet contain paths corresponding to all alpha-long substrings of the input haplotypes (i.e., non-recombinant paths) with at most delta mismatches. Solving this problem optimally, i.e., minimizing the number of variants selected, is known to be NP-hard [14]. Here, we first establish several inapproximability results regarding finding haplotype-aware reduced variation graphs of optimal size. We then present an integer linear programming (ILP) formulation for solving the problem, and experimentally demonstrate that this is a computationally feasible approach for real-world problems and provides far superior reduction compared to prior approaches.
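The selection problem stated in the abstract can be illustrated on a toy instance. The sketch below is not the paper's ILP formulation. It is a brute-force search under simplifying assumptions that are not taken from the source: all variants are SNPs, every haplotype has the same length as the reference, and a mismatch is counted each time a haplotype carries a variant that was left out of the graph. The function names and the example strings are hypothetical.

```python
from itertools import combinations


def find_variants(reference, haplotypes):
    """Collect SNPs as (position, allele) pairs that differ from the reference."""
    return sorted({(i, h[i])
                   for h in haplotypes
                   for i in range(len(reference))
                   if h[i] != reference[i]})


def is_feasible(reference, haplotypes, kept, alpha, delta):
    """Check that every alpha-long window of every haplotype loses at most delta variants."""
    n = len(reference)
    for h in haplotypes:
        # 1 where the haplotype carries a variant that is NOT kept in the graph
        dropped = [1 if h[i] != reference[i] and (i, h[i]) not in kept else 0
                   for i in range(n)]
        for start in range(n - alpha + 1):
            if sum(dropped[start:start + alpha]) > delta:
                return False
    return True


def min_variant_selection(reference, haplotypes, alpha, delta):
    """Return a smallest set of variants to keep (exhaustive search, toy sizes only)."""
    variants = find_variants(reference, haplotypes)
    for k in range(len(variants) + 1):
        for subset in combinations(variants, k):
            if is_feasible(reference, haplotypes, set(subset), alpha, delta):
                return set(subset)


# Hypothetical toy instance: three SNPs across two haplotypes.
reference = "AAAAAA"
haplotypes = ["CAAAAC", "CCAAAA"]
print(min_variant_selection(reference, haplotypes, alpha=3, delta=1))
```

The search tries subsets in order of increasing size, so the first feasible subset is a minimum one. Its running time is exponential in the number of variants, which is consistent with the NP-hardness noted in the abstract and is why the authors turn to an ILP solver for real data.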
Pages: 9