Repeat- and error-aware comparison of deletions

Cited by: 9
Authors
Wittler, Roland [1, 2]
Marschall, Tobias [3, 4]
Schonhuth, Alexander [5 ]
Makinen, Veli [6 ]
Affiliations
[1] Univ Bielefeld, Fac Technol, Genome Informat, Bielefeld, Germany
[2] Univ Bielefeld, Ctr Biotechnol CeBiTec, Bielefeld, Germany
[3] Univ Saarland, Ctr Bioinformat, D-66123 Saarbrucken, Germany
[4] Max Planck Inst Informat, Dept Computat Biol & Appl Algorithm, D-66123 Saarbrucken, Germany
[5] Ctr Wiskunde & Informat, Life Sci Grp, Amsterdam, Netherlands
[6] Univ Helsinki, Dept Comp Sci, Helsinki Inst Informat Technol, FIN-00014 Helsinki, Finland
Keywords
VARIANTS; PARALLEL; DATABASE;
DOI
10.1093/bioinformatics/btv304
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: The number of reported genetic variants is rapidly growing, empowered by ever faster accumulation of next-generation sequencing data. A major issue is comparability. Standards that address the combined problem of inaccurately predicted breakpoints and repeat-induced ambiguities are missing. This decisively lowers the quality of 'consensus' callsets and hampers the removal of duplicate entries in variant databases, which can have deleterious effects in downstream analyses.
Results: We introduce a sound framework for comparison of deletions that captures both tool-induced inaccuracies and repeat-induced ambiguities. We present a maximum matching algorithm that outputs virtual duplicates among two sets of predictions/annotations. We demonstrate that our approach is clearly superior to ad hoc criteria, like overlap, and that it can reduce the redundancy among callsets substantially. We also identify large numbers of duplicate entries in the Database of Genomic Variants, which points out the immediate relevance of our approach.
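The abstract does not spell out the matching procedure, so the following is only a minimal sketch of the general idea of pairing up likely duplicates between two callsets with a maximum bipartite matching. The `similar` predicate (both breakpoints within a tolerance `tol`) and all names here are illustrative assumptions, not the paper's criterion; in particular, repeat-induced ambiguity is not modelled.

```python
from typing import List, Tuple

# A deletion as (start, end) on a single chromosome; hypothetical representation.
Deletion = Tuple[int, int]


def similar(a: Deletion, b: Deletion, tol: int) -> bool:
    """Assumed stand-in criterion: both breakpoints differ by at most tol bp."""
    return abs(a[0] - b[0]) <= tol and abs(a[1] - b[1]) <= tol


def max_matching(calls_a: List[Deletion], calls_b: List[Deletion],
                 tol: int = 50) -> List[Tuple[int, int]]:
    """Maximum bipartite matching via augmenting paths (Kuhn's algorithm).

    Returns sorted index pairs (i, j): calls_a[i] is paired with calls_b[j].
    """
    match_b = [-1] * len(calls_b)  # match_b[j] = index in calls_a, or -1 if free

    def try_augment(i: int, seen: set) -> bool:
        for j, b in enumerate(calls_b):
            if j in seen or not similar(calls_a[i], b, tol):
                continue
            seen.add(j)
            # Take j if it is free, or if its current partner can move elsewhere.
            if match_b[j] == -1 or try_augment(match_b[j], seen):
                match_b[j] = i
                return True
        return False

    for i in range(len(calls_a)):
        try_augment(i, set())
    return sorted((i, j) for j, i in enumerate(match_b) if i != -1)


if __name__ == "__main__":
    a = [(100, 200), (130, 230)]
    b = [(120, 220), (90, 190)]
    print(max_matching(a, b, tol=30))
```

In this toy input, a greedy first-come pairing would match `a[0]` with `b[0]` and leave `a[1]` unmatched, whereas the augmenting-path step reassigns `a[0]` to `b[1]` so that both calls find a partner. That is the practical reason to prefer a maximum matching over pairwise ad hoc checks when each call may be used at most once.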
Pages: 2947-2954
Number of pages: 8