Konnector: Connecting Paired-end Reads Using a Bloom Filter de Bruijn Graph

Cited by: 0
Authors
Vandervalk, Benjamin P. [1]
Jackman, Shaun D. [1]
Raymond, Anthony [1]
Mohamadi, Hamid [1]
Yang, Chen [1]
Attali, Dean A. [1]
Chu, Justin [1]
Warren, Rene L. [1]
Birol, Inanc [1]
Affiliations
[1] BC Canc Agcy, Genome Sci Ctr, Vancouver, BC, Canada
Keywords
Bloom filter; de Bruijn graph; paired-end sequencing; de novo genome assembly; DNA-SEQUENCES; ALIGNMENT; GENOMES; TOOL
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835
Abstract
Paired-end sequencing yields a read from each end of a DNA fragment, typically leaving a gap of unsequenced nucleotides in the middle. Closing this gap using information from other reads in the same sequencing experiment offers the potential to generate longer "pseudo-reads" using short read sequencing platforms. Such long reads may benefit downstream applications such as de novo sequence assembly, gap filling, and variant detection. With these possible applications in mind, we have developed Konnector, a software tool to fill in the nucleotides of the sequence gap between read pairs by navigating a de Bruijn graph. Konnector represents the de Bruijn graph using a Bloom filter, a probabilistic and memory-efficient data structure. Our implementation is able to store the de Bruijn graph using a mean of 1.5 bytes of memory per k-mer, which represents a marked improvement over the typical hash table data structure. The memory usage per k-mer is independent of the k-mer length, enabling application of the tool to large genomes. We report the performance of the tool on simulated and experimental datasets, and discuss its utility for downstream analysis.
Availability: Konnector is open-source software, free for academic use, released under the British Columbia Cancer Agency's academic license. The tool is included with ABySS version 1.5.2 and later, and is available for download from http://www.bcgsc.ca/platform/bioinfo/software/abyss.
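The idea described in the abstract, storing k-mers in a Bloom filter and walking the implied de Bruijn graph to bridge a read pair, can be illustrated with a minimal Python sketch. It is not Konnector's implementation: the names `BloomFilter`, `load_kmers`, `successors`, and `connect` are hypothetical, the hash scheme (salted SHA-256) and the plain breadth-first search are assumptions chosen for brevity, and both reads are assumed to lie on the same strand (no reverse-complement handling).

```python
import hashlib
from collections import deque


class BloomFilter:
    """Bit array probed by several hash functions (illustrative, not Konnector's)."""

    def __init__(self, num_bits, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def load_kmers(bloom, reads, k):
    """Insert every k-mer of every read into the Bloom filter."""
    for read in reads:
        for i in range(len(read) - k + 1):
            bloom.add(read[i:i + k])


def successors(bloom, kmer):
    """Edges are implicit: query the four possible one-base extensions."""
    suffix = kmer[1:]
    return [suffix + base for base in "ACGT" if suffix + base in bloom]


def connect(bloom, read1, read2, k, max_steps=1000):
    """Breadth-first search from the last k-mer of read1 to the first k-mer
    of read2; returns the merged pseudo-read, or None if no path is found."""
    start, goal = read1[-k:], read2[:k]
    parent = {start: None}
    queue = deque([start])
    steps = 0
    while queue and steps < max_steps:
        kmer = queue.popleft()
        steps += 1
        if kmer == goal:
            path = []
            while kmer is not None:
                path.append(kmer)
                kmer = parent[kmer]
            path.reverse()
            # Spell the path: first k-mer plus the last base of each later one.
            bridge = path[0] + "".join(p[-1] for p in path[1:])
            return read1[:-k] + bridge + read2[k:]
        for nxt in successors(bloom, kmer):
            if nxt not in parent:
                parent[nxt] = kmer
                queue.append(nxt)
    return None
```

In this sketch the filter's size is set by `num_bits` alone, so memory per stored k-mer does not depend on k, which mirrors the property the abstract highlights. A hypothetical usage: build the filter from overlapping reads of a fragment with `load_kmers(bloom, reads, 5)`, then call `connect(bloom, read1, read2, 5)` on the two end reads to obtain the gap-filled sequence.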
Pages: 8