WindowMasker: window-based masker for sequenced genomes

Cited by: 190
Authors
Morgulis, A [1]
Gertz, EM [1]
Schäffer, AA [1]
Agarwala, R [1]
Affiliations
[1] Natl Ctr Biotechnol Informat, Natl Inst Hlth, Dept Hlth & Human Serv, Bethesda, MD 20894 USA
Funding
National Institutes of Health (USA)
DOI
10.1093/bioinformatics/bti774
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Matches to repetitive sequences are usually undesirable in the output of DNA database searches. Repetitive sequences need not be matched to a query, if they can be masked in the database. RepeatMasker/Maskeraid (RM), currently the most widely used software for DNA sequence masking, is slow and requires a library of repetitive template sequences, such as a manually curated RepBase library, that may not exist for newly sequenced genomes.

Results: We have developed a software tool called WindowMasker (WM) that identifies and masks highly repetitive DNA sequences in a genome, using only the sequence of the genome itself. WM is orders of magnitude faster than RM because WM uses a few linear-time scans of the genome sequence, rather than local alignment methods that compare each library sequence with each piece of the genome. We validate WM by comparing BLAST outputs from large sets of queries applied to two versions of the same genome, one masked by WM, and the other masked by RM. Even for genomes such as the human genome, where a good RepBase library is available, searching the database as masked with WM yields more matches that are apparently non-repetitive and fewer matches to repetitive sequences. We show that these results hold for transcribed regions as well. WM also performs well on genomes for which much of the sequence was in draft form at the time of the analysis.
Pages: 134-141
Page count: 8