A measure of discrepancy of multiple sequences

Cited by: 27
Authors
Fang, WW
Roberts, FS
Ma, ZR
Affiliations
[1] Chinese Acad Sci, Inst Appl Math, Acad Math & Syst Sci, Beijing 100080, Peoples R China
[2] Rutgers State Univ, Ctr Discrete Math, Piscataway, NJ 08855 USA
[3] Rutgers State Univ, Theoret Comp Sci Ctr, DIMACS, Piscataway, NJ 08855 USA
[4] Rutgers State Univ, Waksman Inst Microbiol, Piscataway, NJ 08855 USA
Keywords
multiple sequence comparison; entropy; DNA; information discrepancy
DOI
10.1016/S0020-0255(01)00108-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Multiple sequence comparison is a basic problem in molecular biology and other sciences. In this paper, we introduce the concept of a complete information set and some principles for measuring discrepancy among multiple sequences. Based on them, we present a new measurement method that satisfies these principles for comparing multiple sequences. We show that the method can effectively distinguish different random sequences or DNA sequences of length 8000 by comparing strings of 6-8 symbols (bases), and protein sequences of length 8000 by comparing strings of 3-4 symbols (amino acids). It can also measure slight changes in a sequence, e.g., the insertion or deletion of a single symbol (a base or an amino acid). The method is applied to the study of molecular evolution, and a preliminary result shows a hierarchical relationship among the cytochrome C protein sequences of different species, much like that in taxonomy. (C) 2001 Elsevier Science Inc. All rights reserved.
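The abstract does not give the paper's formula, so the sketch below is only an illustration of the general idea of comparing sequences through their short-substring statistics. It assumes a stand-in measure: the sum of Kullback-Leibler divergences from each sequence's k-symbol substring distribution to the average distribution (a generalized Jensen-Shannon style quantity). The function names `kmer_distribution` and `discrepancy` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter

def kmer_distribution(seq, k):
    """Relative frequencies of all overlapping k-symbol substrings of seq.

    Assumes len(seq) >= k, so that at least one substring exists.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}

def discrepancy(seqs, k):
    """Illustrative stand-in measure, not necessarily the paper's:
    sum over sequences of the KL divergence (in bits) from that
    sequence's k-mer distribution to the mean distribution."""
    dists = [kmer_distribution(s, k) for s in seqs]
    support = set().union(*dists)
    mean = {w: sum(d.get(w, 0.0) for d in dists) / len(dists) for w in support}
    total = 0.0
    for d in dists:
        for w, p in d.items():
            # p > 0 here, and mean[w] >= p / len(dists) > 0
            total += p * math.log2(p / mean[w])
    return total

# Identical sequences give zero; a single inserted base gives a small positive value.
same = discrepancy(["ACGTACGT", "ACGTACGT"], 2)
one_insertion = discrepancy(["ACGTACGT", "ACGTTACGT"], 2)
```

Under this stand-in, identical sequences score zero and sequences that share no k-symbol substrings score the maximum, which mirrors the abstract's claim that even a one-symbol insertion or deletion is measurable.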
Pages: 75-102
Number of pages: 28