Sequence-based heuristics for faster annotation of non-coding RNA families

Cited by: 60
Authors
Weinberg, Z [1]
Ruzzo, WL
Affiliations
[1] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
[2] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
DOI
10.1093/bioinformatics/bti743
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool for finding new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are extremely slow. Previously, we created rigorous filters, which provably sacrifice none of a CM's accuracy while making searches significantly faster for virtually all ncRNA families. However, these rigorous filters make searches slower than heuristics could be.

Results: In this paper we introduce profile HMM-based heuristic filters. We show that their accuracy is usually superior to that of heuristics based on BLAST. Moreover, we compared our heuristics with those used in tRNAscan-SE, whose heuristics incorporate a significant amount of work specific to tRNAs, whereas our heuristics are generic to any ncRNA. Performance was roughly comparable, so we expect that our heuristics provide a high-quality solution that, unlike family-specific solutions, can scale to hundreds of ncRNA families.
Pages: 35-39
Number of pages: 5