A tool for aligning very similar DNA sequences

Authors
Chao, KM
Zhang, JH
Ostell, J
Miller, W
Affiliations
[1] PENN STATE UNIV, DEPT COMP SCI & ENGN, UNIVERSITY PK, PA 16802 USA
[2] PROVIDENCE UNIV, DEPT COMP SCI & INFORMAT MANAGEMENT, TAICHUNG 43309, TAIWAN
[3] NIH, NATL CTR BIOTECHNOL INFORMAT, NATL LIB MED, BETHESDA, MD 20892 USA
DOI
Not available
Chinese Library Classification
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Results: We have produced a computer program, named sim3, that solves the following computational problem. Two DNA sequences are given, where the shorter sequence is very similar to some contiguous region of the longer sequence. Sim3 determines such a similar region of the longer sequence, and then computes an optimal set of single-nucleotide changes (i.e., insertions, deletions or substitutions) that will convert the shorter sequence to that region. Thus, the alignment scoring scheme is designed to model sequencing errors, rather than evolutionary processes. The program can align a 100 kb sequence to a 1 megabase sequence in a few seconds on a workstation, provided that there are very few differences between the shorter sequence and some region in the longer sequence. The program has been used to assemble sequence data for the Genomes Division at the National Center for Biotechnology Information.
Availability: A version of sim3 for UNIX machines can be obtained by anonymous ftp from ncbi.nlm.nih.gov, in the pub/sim3 directory.
Contact: For portable versions for Macs and PCs, contact zjing@sunset.nlm.nih.gov.
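The computational problem stated in the abstract can be illustrated with a small sketch. The abstract does not describe sim3's algorithm, so the code below is not sim3: it is a textbook quadratic-time dynamic program for approximate substring matching, assuming unit cost for each insertion, deletion, or substitution. The function name `best_region` and its return convention are hypothetical, chosen only for this illustration.

```python
def best_region(short, long_seq):
    """Find a region of long_seq closest to short under unit-cost edits.

    Returns (edits, start, end), where long_seq[start:end] is the region and
    edits is the number of single-nucleotide changes needed to convert short
    into it. Illustrative only: this is NOT the sim3 algorithm, and the
    unit-cost scoring is an assumption.
    """
    m, n = len(short), len(long_seq)
    # prev[j] = (cost, start) for aligning the current prefix of short to a
    # region of long_seq that ends at position j. An empty prefix matches an
    # empty region anywhere at zero cost, so the region may start at any j.
    prev = [(0, j) for j in range(n + 1)]
    for i in range(1, m + 1):
        # Aligning short[:i] to an empty region at position 0 costs i edits.
        cur = [(i, 0)]
        for j in range(1, n + 1):
            mismatch = short[i - 1] != long_seq[j - 1]
            candidates = (
                (prev[j - 1][0] + mismatch, prev[j - 1][1]),  # match or substitution
                (prev[j][0] + 1, prev[j][1]),                 # base of short left unpaired
                (cur[j - 1][0] + 1, cur[j - 1][1]),           # base of long_seq left unpaired
            )
            cur.append(min(candidates))
        prev = cur
    # The region may end anywhere in long_seq; pick the cheapest end point.
    end = min(range(n + 1), key=lambda j: prev[j][0])
    edits, start = prev[end]
    return edits, start, end


if __name__ == "__main__":
    edits, start, end = best_region("ACGT", "TTTACGTTTT")
    print(edits, start, end)
```

This sketch uses time proportional to the product of the two sequence lengths, so it would not scale to the 100 kb against 1 megabase case mentioned in the abstract; that performance depends on exploiting the assumption that very few differences exist, which this example does not attempt.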
Pages: 75-80
Number of pages: 6