Aligning Sequences by Minimum Description Length

Cited by: 4
Author
Conery, John S. [1]
Affiliation
[1] Univ Oregon, Dept Comp & Informat Sci, Eugene, OR 97403 USA
Funding
Science Foundation Ireland; U.S. National Science Foundation; U.S. National Institutes of Health
DOI
10.1155/2007/72936
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
This paper presents a new information-theoretic framework for aligning sequences in bioinformatics. A transmitter compresses a set of sequences by constructing a regular expression that describes the regions of similarity in the sequences. To retrieve the original set of sequences, a receiver generates all strings that match the expression. An alignment algorithm uses minimum description length to encode and explore alternative expressions; the expression with the shortest encoding provides the best overall alignment. When two substrings contain letters that are similar according to a substitution matrix, a code length function based on conditional probabilities defined by the matrix will encode the substrings with fewer bits. In one experiment, alignments produced with this new method were found to be comparable to alignments from CLUSTALW. A second experiment measured the accuracy of the new method on pairwise alignments of sequences from the BAliBASE alignment benchmark. Copyright (C) 2007 John S. Conery.
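The code-length idea in the abstract can be illustrated with a minimal sketch. It is not the paper's actual code length function: the uniform background distribution, the `P_MATCH` value, and the function names `independent_bits` and `aligned_bits` are all assumptions chosen for illustration, and the cost of encoding the regular expression itself is ignored. The sketch only shows why conditioning one substring on the other needs fewer bits when the letters are similar and more bits when they are not.

```python
import math

ALPHABET = "ACGT"

# Assumption: uniform background probabilities (2 bits per letter).
BACKGROUND = {c: 1.0 / len(ALPHABET) for c in ALPHABET}

# Assumption: toy conditional model standing in for a substitution matrix.
# A letter is copied unchanged with probability P_MATCH; the remaining
# probability mass is split evenly over the other letters.
P_MATCH = 0.85


def conditional(y: str, x: str) -> float:
    """Hypothetical P(y | x) for one aligned pair of letters."""
    if x == y:
        return P_MATCH
    return (1.0 - P_MATCH) / (len(ALPHABET) - 1)


def independent_bits(a: str, b: str) -> float:
    """Bits needed to send both substrings letter by letter, unaligned."""
    return sum(-math.log2(BACKGROUND[c]) for c in a + b)


def aligned_bits(a: str, b: str) -> float:
    """Bits needed to send a, then each letter of b conditioned on a."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles gap-free alignments")
    return sum(
        -math.log2(BACKGROUND[x]) - math.log2(conditional(y, x))
        for x, y in zip(a, b)
    )


if __name__ == "__main__":
    for a, b in [("ACGT", "ACGT"), ("ACGT", "TGCA")]:
        print(a, b, independent_bits(a, b), round(aligned_bits(a, b), 2))
```

Under these toy numbers, an identical pair costs fewer bits when aligned than when sent independently, while a fully mismatched pair costs more, so a minimum description length criterion would align the first pair and leave the second unaligned.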
Pages: 14