Effective ambiguity checking in biosequence analysis

Cited by: 16
Authors
Reeder, J
Steffen, P
Giegerich, R
Affiliations
[1] Univ Bielefeld, Fac Technol, D-33501 Bielefeld, Germany
[2] Univ Bielefeld, Ctr Biotechnol, Int NRW Grad Sch Bioinformat & Genome Res, D-33501 Bielefeld, Germany
DOI
10.1186/1471-2105-6-153
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Ambiguity is a problem in biosequence analysis that arises in various analysis tasks solved via dynamic programming, and in particular in the modeling of families of RNA secondary structures with stochastic context-free grammars. Several types of analysis are invalidated by the presence of ambiguity. Because this problem inherits undecidability (as we show here) from the corresponding problem for context-free languages, there is no complete algorithmic solution to the problem of ambiguity checking.
Results: We explain frequently observed sources of ambiguity and show how to avoid them. We suggest four testing procedures that may help to detect ambiguity when present, including a just-in-time test that makes it possible to work safely with a potentially ambiguous grammar. For the special case of stochastic context-free grammars and RNA structure modeling, we introduce an automated partial procedure for proving non-ambiguity and use it to demonstrate non-ambiguity for several relevant grammars.
Conclusion: Together, our mechanical proof procedure and our testing methods provide a powerful arsenal for ensuring non-ambiguity.
Pages: 12
Related papers
50 in total
  • [1] Effective ambiguity checking in biosequence analysis
    Janina Reeder
    Peter Steffen
    Robert Giegerich
    [J]. BMC Bioinformatics, 6
  • [2] Biosequence Analysis in PRISM
    Lassen, Ole Torp
    [J]. LOGIC PROGRAMMING, PROCEEDINGS, 2008, 5366 : 809 - 810
  • [3] Nanoarrays for Systolic Biosequence Analysis
    Mehdy, Malik Ashter
    Antidormi, Aleandro
    Graziano, Mariagrazia
    Piccinini, Gianluca
    [J]. JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2018, 27 (12)
  • [4] FOURIER METHODS FOR BIOSEQUENCE ANALYSIS
    BENSON, DC
    [J]. NUCLEIC ACIDS RESEARCH, 1990, 18 (21) : 6305 - 6310
  • [5] Geometric Approach to Biosequence Analysis
    Brimkov, Boris
    Brimkov, Valentin E.
    [J]. 8TH INTERNATIONAL CONFERENCE ON PRACTICAL APPLICATIONS OF COMPUTATIONAL BIOLOGY & BIOINFORMATICS (PACBB 2014), 2014, 294 : 97 - 104
  • [6] Effective indexing and filtering for similarity search in large biosequence databases
    Ozturk, O
    Ferhatosmanoglu, H
    [J]. THIRD IEEE SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING - BIBE 2003, PROCEEDINGS, 2003 : 359 - 366
  • [7] A Machine Learning approach on Latent Semantic Analysis for Ambiguity Checking on Bengali Literature
    Nipu, Ayesha Siddika
    Pal, Urmee
    [J]. 2017 20TH INTERNATIONAL CONFERENCE OF COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2017
  • [8] Geometric approach to string analysis for biosequence classification
    Brimkov, Boris
    [J]. JOURNAL OF INTEGRATIVE BIOINFORMATICS, 2014, 11 (03) : 252
  • [9] Biosequence Analysis using Intel® Xeon Phi
    Sinha, Pradeep
    Misra, Goldi
    Vikraman, Deepu
    Das, Abhishek
    Desai, Shraddha
    Pawar, Sucheta
    Shewale, Kalyani
    [J]. UKSIM-AMSS SEVENTH EUROPEAN MODELLING SYMPOSIUM ON COMPUTER MODELLING AND SIMULATION (EMS 2013), 2013, : 497 - 499
  • [10] Evolving Turing machines for biosequence recognition and analysis
    Vallejo, EE
    Ramos, F
    [J]. GENETIC PROGRAMMING, PROCEEDINGS, 2001, 2038 : 192 - 203