streammd: fast low-memory duplicate marking using a Bloom filter

Cited by: 1
Authors
Leonard, Conrad [1, 2]
Affiliations
[1] QIMR Berghofer Med Res Inst, Dept Genome Informat, Herston, Qld 4006, Australia
[2] QIMR Berghofer Med Res Inst, Dept Genome Informat, 300 Herston Rd, Herston, Qld 4006, Australia
DOI
10.1093/bioinformatics/btad181
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Identification of duplicate templates is a common preprocessing step in bulk sequence analysis; for large libraries, this can be resource intensive. Here, we present streammd: a fast, memory-efficient, single-pass duplicate marker operating on the principle of a Bloom filter. streammd closely reproduces outputs from Picard MarkDuplicates while being substantially faster, and requires much less memory than SAMBLASTER.
Availability and implementation: streammd is a C++ program available from GitHub under the MIT license.
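To illustrate the principle named in the abstract, the following is a minimal Python sketch of single-pass duplicate marking with a Bloom filter. It is not streammd's implementation: the `BloomFilter` class, the `mark_duplicates` function, the `(reference, position, strand)` template key, and the default filter sizes are all assumptions made for this example. A template is flagged as a duplicate when every bit its key hashes to is already set, so memory stays fixed regardless of library size, at the cost of a small false-positive rate.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array probed by k hash positions (illustrative only)."""

    def __init__(self, n_bits: int, n_hashes: int) -> None:
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive k bit positions from a single digest via double hashing.
        digest = hashlib.blake2b(key.encode(), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1  # force an odd step
        for i in range(self.n_hashes):
            yield (h1 + i * h2) % self.n_bits

    def add(self, key: str) -> bool:
        """Insert key; return True if all of its bits were already set."""
        seen = True
        for pos in self._positions(key):
            byte, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen


def mark_duplicates(templates, n_bits=1 << 20, n_hashes=7):
    """Yield (template, is_duplicate) pairs in one pass over the input.

    Each template is a hypothetical (reference, position, strand) tuple
    standing in for the alignment coordinates of a sequenced template.
    """
    bloom = BloomFilter(n_bits, n_hashes)
    for template in templates:
        key = "|".join(map(str, template))
        yield template, bloom.add(key)


if __name__ == "__main__":
    reads = [("chr1", 100, "+"), ("chr1", 100, "+"), ("chr2", 5, "-")]
    for template, is_dup in mark_duplicates(reads):
        print(template, "duplicate" if is_dup else "unique")
```

Because a Bloom filter can report an unseen key as present but never the reverse, a sketch like this may occasionally mark a non-duplicate template as a duplicate; enlarging `n_bits` lowers that rate.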
Pages: 3