WindowMasker: window-based masker for sequenced genomes

Cited by: 190
Authors
Morgulis, A [1]
Gertz, EM [1]
Schäffer, AA [1]
Agarwala, R [1]
Affiliations
[1] Natl Ctr Biotechnol Informat, Natl Inst Hlth, Dept Hlth & Human Serv, Bethesda, MD 20894 USA
Funding
National Institutes of Health (USA)
DOI
10.1093/bioinformatics/bti774
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Matches to repetitive sequences are usually undesirable in the output of DNA database searches. Repetitive sequences need not be matched to a query, if they can be masked in the database. RepeatMasker/Maskeraid (RM), currently the most widely used software for DNA sequence masking, is slow and requires a library of repetitive template sequences, such as a manually curated RepBase library, that may not exist for newly sequenced genomes. Results: We have developed a software tool called WindowMasker (WM) that identifies and masks highly repetitive DNA sequences in a genome, using only the sequence of the genome itself. WM is orders of magnitude faster than RM because WM uses a few linear-time scans of the genome sequence, rather than local alignment methods that compare each library sequence with each piece of the genome. We validate WM by comparing BLAST outputs from large sets of queries applied to two versions of the same genome, one masked by WM, and the other masked by RM. Even for genomes such as the human genome, where a good RepBase library is available, searching the database as masked with WM yields more matches that are apparently non-repetitive and fewer matches to repetitive sequences. We show that these results hold for transcribed regions as well. WM also performs well on genomes for which much of the sequence was in draft form at the time of the analysis.
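The abstract states only that WM relies on a few linear-time scans of the genome itself; it does not give the scoring details. The following is a toy sketch of that general idea, not the published WindowMasker algorithm: one pass counts k-mers, a second pass slides a window over the per-position counts and soft-masks (lowercases) windows whose mean count is high. The function names and the parameters `k`, `window`, and `threshold` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def count_kmers(seq, k):
    """First scan: count every k-mer occurring in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def mask_repeats(seq, k=4, window=3, threshold=3.0):
    """Second scan: lowercase bases in windows with a high mean k-mer count.

    A window spans `window` consecutive k-mers. These defaults are
    illustrative assumptions, not WindowMasker's actual settings.
    """
    seq = seq.upper()
    n_kmers = len(seq) - k + 1
    if n_kmers < window:
        return seq  # too short to form a single window

    counts = count_kmers(seq, k)
    per_pos = [counts[seq[i:i + k]] for i in range(n_kmers)]

    masked = [False] * len(seq)
    masked_upto = 0                   # bases before this index are settled
    running = sum(per_pos[:window])   # sum of counts in the current window

    for start in range(n_kmers - window + 1):
        if start > 0:
            # Slide the window by one k-mer in constant time.
            running += per_pos[start + window - 1] - per_pos[start - 1]
        if running / window >= threshold:
            end = start + window - 1 + k  # exclusive end of covered bases
            for j in range(max(start, masked_upto), end):
                masked[j] = True
            masked_upto = end

    return "".join(c.lower() if m else c for c, m in zip(seq, masked))
```

For example, `mask_repeats("ACACACACACAC")` lowercases the whole string because every 4-mer recurs, while a sequence whose 4-mers are all distinct is returned unchanged. No template library is consulted at any point, which is the property the abstract contrasts with RM.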
Pages: 134-141
Number of pages: 8
Related papers
50 in total
  • [1] WINDOW-BASED SURVEILLANCE STRATEGIES
    KRISHNA, CM
    GANZ, A
    WANG, X
    IEE PROCEEDINGS-COMPUTERS AND DIGITAL TECHNIQUES, 1995, 142 (03): : 233 - 236
  • [2] WINDOW-BASED TOPIC MODEL FOR HDP
    Liu, Di
    Zeng, Ye
    Luo, Yu
    Pang, Hong
    Wu, Xiao-Hua
    2019 16TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICWAMTIP), 2019, : 70 - 75
  • [3] A window-based inverse Hough transform
    Kesidis, AL
    Papamarkos, N
    PATTERN RECOGNITION, 2000, 33 (06) : 1105 - 1117
  • [4] Window-based, discontinuity preserving stereo
    Agrawal, M
    Davis, LS
    PROCEEDINGS OF THE 2004 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, 2004, : 66 - 73
  • [5] A window-based algorithm for skyline queries
    Yu, J
    Liu, X
    Liu, GH
    PDCAT 2005: Sixth International Conference on Parallel and Distributed Computing, Applications and Technologies, Proceedings, 2005, : 907 - 909
  • [6] Window-Based Constant Beamwidth Beamformer
    Long, Tao
    Cohen, Israel
    Berdugo, Baruch
    Yang, Yan
    Chen, Jingdong
    SENSORS, 2019, 19 (09)
  • [7] Window-based method for information retrieval
    Jin, QL
    Zhao, J
    Xu, B
    NATURAL LANGUAGE PROCESSING - IJCNLP 2004, 2005, 3248 : 120 - 129
  • [8] Window-based capillary flow porometer
    Unknown
    CANADIAN CERAMICS QUARTERLY-JOURNAL OF THE CANADIAN CERAMIC SOCIETY, 1996, 65 (02): : 94 - 94
  • [9] Window-based image registration using variable window sizes
    Krutz, Andreas
    Frater, Michael
    Sikora, Thomas
    2007 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-7, 2007, : 2621 - +
  • [10] Window-based Streaming Graph Partitioning Algorithm
    Patwary, Md Anwarul Kaium
    Garg, Saurabh
    Kang, Byeong
    PROCEEDINGS OF THE AUSTRALASIAN COMPUTER SCIENCE WEEK MULTICONFERENCE (ACSW 2019), 2019,