A tool for aligning very similar DNA sequences

Authors
Chao, KM
Zhang, JH
Ostell, J
Miller, W
Affiliations
[1] PENN STATE UNIV, DEPT COMP SCI & ENGN, UNIVERSITY PK, PA 16802 USA
[2] PROVIDENCE UNIV, DEPT COMP SCI & INFORMAT MANAGEMENT, TAICHUNG 43309, TAIWAN
[3] NIH, NATL CTR BIOTECHNOL INFORMAT, NATL LIB MED, BETHESDA, MD 20892 USA
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Results: We have produced a computer program, named sim3, that solves the following computational problem. Two DNA sequences are given, where the shorter sequence is very similar to some contiguous region of the longer sequence. Sim3 determines such a similar region of the longer sequence, and then computes an optimal set of single-nucleotide changes (i.e., insertions, deletions or substitutions) that will convert the shorter sequence to that region. Thus, the alignment scoring scheme is designed to model sequencing errors, rather than evolutionary processes. The program can align a 100 kb sequence to a 1 megabase sequence in a few seconds on a workstation, provided that there are very few differences between the shorter sequence and some region in the longer sequence. The program has been used to assemble sequence data for the Genomes Division at the National Center for Biotechnology Information.
Availability: A version of sim3 for UNIX machines can be obtained by anonymous ftp from ncbi.nlm.nih.gov, in the pub/sim3 directory.
Contact: For portable versions for Macs and PCs, contact zjing@sunset.nlm.nih.gov.
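The computational problem stated in the abstract can be illustrated with a small sketch. The abstract does not describe sim3's internal algorithm, so the code below is not sim3's method: it is the textbook quadratic dynamic program for approximate matching, in which the region of the longer sequence may start and end anywhere at no cost and every insertion, deletion or substitution costs one. The function name `align_to_region` and the operation tuples are hypothetical, chosen only for illustration, and a table of this size would not reach the speeds quoted for 100 kb against 1 megabase inputs.

```python
def align_to_region(short, long):
    """Find a region of `long` that `short` converts to with the fewest edits.

    Illustrative only (not sim3's algorithm). Returns
    (edit_count, start, end, ops), where long[start:end] is the chosen region
    and ops lists the single-nucleotide changes applied to `short`.
    """
    m, n = len(short), len(long)
    # dist[i][j]: fewest edits turning short[:i] into a region of long ending at j.
    # Row 0 stays 0, so the region may begin anywhere in `long` at no cost.
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i  # only deletions can match an empty region
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            substitute = dist[i - 1][j - 1] + (short[i - 1] != long[j - 1])
            delete = dist[i - 1][j] + 1   # drop short[i-1]
            insert = dist[i][j - 1] + 1   # add long[j-1]
            dist[i][j] = min(substitute, delete, insert)

    # The region may end anywhere: take the first column with the lowest cost.
    end = min(range(n + 1), key=lambda j: dist[m][j])

    # Trace back from (m, end) to row 0 to recover the edits and the start.
    ops = []
    i, j = m, end
    while i > 0:
        if j > 0 and dist[i][j] == dist[i - 1][j - 1] + (short[i - 1] != long[j - 1]):
            if short[i - 1] != long[j - 1]:
                ops.append(("substitute", i - 1, short[i - 1], long[j - 1]))
            i, j = i - 1, j - 1
        elif dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", i - 1, short[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", i, None, long[j - 1]))
            j -= 1
    ops.reverse()
    return dist[m][end], j, end, ops


if __name__ == "__main__":
    count, start, end, ops = align_to_region("ACGGT", "TTACGTTT")
    print(count, start, end, ops)
```

For example, `align_to_region("ACGT", "TTTACGTTTT")` returns `(0, 3, 7, [])`, since the shorter sequence occurs exactly at positions 3 to 7 of the longer one. Unit costs are used here because the abstract says the scoring models sequencing errors rather than evolutionary processes; the actual weights used by sim3 are not given in this record.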
Pages: 75-80
Page count: 6