Accurate Classification of RNA Structures Using Topological Fingerprints

Cited by: 11
Authors
Huang, Jiajie [1 ,2 ]
Li, Kejie [1 ,3 ]
Gribskov, Michael [1 ,4 ]
Affiliations
[1] Purdue Univ, Dept Biol Sci, W Lafayette, IN 47907 USA
[2] Thermo Fisher Sci, Life Sci Solut Grp, San Francisco, CA USA
[3] Biogen Idec Inc, Computat Biol Dept, Cambridge, MA USA
[4] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Source
PLOS ONE | 2016 / Vol. 11 / Issue 10
Funding
National Science Foundation (USA);
Keywords
SECONDARY STRUCTURE PREDICTION; DYNAMIC-PROGRAMMING ALGORITHM; PROTEIN DATA-BANK; INCLUDING PSEUDOKNOTS; NONCODING RNAS; ABSTRACT SHAPES; RIBONUCLEASE-P; RIBOSOMAL-RNA; SEQUENCE; GRAPHS;
DOI
10.1371/journal.pone.0164726
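Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code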
07 ; 0710 ; 09 ;
Abstract
While RNAs are well known to possess complex structures, functionally similar RNAs often have little sequence similarity. While the exact size and spacing of base-paired regions vary, functionally similar RNAs have pronounced similarity in the arrangement, or topology, of base-paired stems. Furthermore, predicted RNA structures often lack pseudoknots (a crucial aspect of biological activity), and are only partially correct, or incomplete. A topological approach addresses all of these difficulties. In this work we describe each RNA structure as a graph that can be converted to a topological spectrum (RNA fingerprint). The set of subgraphs in an RNA structure, its RNA fingerprint, can be compared with the fingerprints of other RNA structures to identify and correctly classify functionally related RNAs. Topologically similar RNAs can be identified even when a large fraction, up to 30%, of the stems are omitted, indicating that highly accurate structures are not necessary. We investigate the performance of the RNA fingerprint approach on a set of eight highly curated RNA families, with diverse sizes and functions, containing pseudoknots, and with little sequence similarity, an especially difficult test set. In spite of the difficult test set, the RNA fingerprint approach is very successful (ROC AUC > 0.95). Because it includes pseudoknots, the RNA fingerprint approach covers a wider range of possible structures than methods based only on secondary structure, and its tolerance for incomplete structures suggests that it can be applied even to predicted structures.
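The idea of comparing structures by their sets of stem subgraphs can be illustrated with a toy sketch. This is not the authors' algorithm: the stem encoding as `(left, right)` position pairs, the three pairwise relation labels, the subset size limit, and the Jaccard similarity are all assumptions chosen for illustration, and the function names `relation`, `fingerprint`, and `similarity` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def relation(a, b):
    """Classify how two stems are arranged (illustrative labels only).

    Each stem is a (left, right) pair of positions with left < right,
    and all positions are assumed to be distinct.
    """
    (a1, a2), (b1, b2) = sorted([a, b])
    if a2 < b1:
        return "serial"    # a closes before b opens
    if b2 < a2:
        return "nested"    # b lies entirely inside a
    return "crossing"      # a1 < b1 < a2 < b2, a pseudoknot-like pair


def fingerprint(stems, max_size=3):
    """Count the arrangement patterns of all stem subsets up to max_size."""
    ordered = sorted(stems)
    counts = Counter()
    for k in range(2, max_size + 1):
        for subset in combinations(ordered, k):
            # The tuple of pairwise relations labels this small subgraph.
            key = tuple(relation(a, b) for a, b in combinations(subset, 2))
            counts[key] += 1
    return counts


def similarity(fp1, fp2):
    """Jaccard similarity over the pattern sets (an assumed metric)."""
    keys1, keys2 = set(fp1), set(fp2)
    union = keys1 | keys2
    return len(keys1 & keys2) / len(union) if union else 1.0


# Hypothetical structures: a crossing pair plus a downstream stem,
# and the same structure with the downstream stem omitted.
full = fingerprint([(1, 10), (5, 15), (20, 30)])
partial = fingerprint([(1, 10), (5, 15)])
score = similarity(full, partial)
```

Because crossing stems receive their own label, pseudoknots contribute patterns to the fingerprint, and a structure with some stems missing still shares the patterns formed by the stems that remain.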
Pages: 19