Analysis and comparision of information theory-based distances for genomic strings

Cited by: 0
Authors
Balzano, Walter [1]
Cicalese, Ferdinando [2]
Del Sorbo, Maria Rosaria [3]
Vaccaro, Ugo [4]
Affiliations
[1] Univ Naples Federico II, Dipartimento Sci Fis, Complesso Univ Monte St Angelo, Via Cintia, I-80126 Naples, Italy
[2] Univ Bielefeld, Tech Fakultaet, AG Genominformat, Bielefeld, Germany
[3] Univ Naples Federico II, Dipartimento Matemat & Applicaz, I-80126 Naples, Italy
[4] Univ Salerno, Dipartimento Informat & Applicaz, I-84084 Fisciano, Italy
Keywords
alignment-free genomic string distance; information; entropy;
DOI
Not available
Chinese Library Classification (CLC)
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Genomic string comparison via alignment is widely applied for mining and retrieving information in biological databases. In some situations, the effectiveness of such alignment-based comparison is still unclear, e.g., for sequences of non-uniform length and with significant shuffling of identical substrings. An alternative approach is based on information-theoretic distances. The information content of biological data is stored in very long strings over only four characters. In the last ten years, several entropic measures have been proposed for genomic string analysis. Notwithstanding their individual merit and experimental validation, to the best of our knowledge there is no direct comparison of these different metrics. We present four of the most representative alignment-free distance measures based on mutual information. Each one has a different origin and expression. Our comparison involves a rearrangement that reduces the different concepts to a single formalism, which made it possible to construct a phylogenetic tree for each of them. The trees produced via these metrics are compared to those widely accepted as biologically validated. In general, the results provide further evidence of the reliability of the alignment-free distance models. We also observe that one of the metrics appears to be more robust than the other three. We believe that this result merits further research and observation. Many of the experimental results, the graphics, and the table are available at the following URL: http://people.na.infn.it/~wbalzano/BIO.
Pages: 292 / +
Page count: 3