NCYPred: A Bidirectional LSTM Network With Attention for Y RNA and Short Non-Coding RNA Classification

Cited by: 4
Authors
Lima, Diego de S. [1]
Amichi, Luiz J. A. [2]
Fernandez, Maria A. [3]
Constantino, Ademir A. [2]
Seixas, Flavio A. V. [1]
Affiliations
[1] Univ Estadual Maringa, Dept Technol, BR-87506370 Umuarama, Parana, Brazil
[2] Univ Estadual Maringa, Dept Informat, BR-87020900 Maringa, Parana, Brazil
[3] Univ Estadual Maringa, Dept Biotechnol Genet & Cell Biol, BR-87020900 Maringa, Parana, Brazil
Keywords
Non-coding RNA; Y RNA; recurrent neural network; sequence classification; web server; EXPRESSION; MECHANISM
DOI
10.1109/TCBB.2021.3131136
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Short non-coding RNAs (sncRNAs) are involved in multiple cellular processes and can be divided into dozens of classes. Among these classes, Y RNAs have been gaining attention as essential factors for the initiation of DNA replication in vertebrates and as potential tumor biomarkers. Homologs have also been described in nematodes and insects, and related sequences in bacteria. Methods capable of accurately predicting Y RNA transcripts are lacking. In this work, we developed an attention-based LSTM network and built a model that classifies sncRNAs (including Y RNA) directly from nucleotide sequences. We built a dataset of 45,447 sncRNA sequences from a wide range of organisms, obtained from Rfam 14.3. Performance evaluation demonstrated that our proposed method, NCYPred (Non-Coding/Y RNA Prediction), can accurately predict Y RNA sequences and their homologs, as well as 11 additional classes, achieving results comparable with state-of-the-art methods. We also demonstrate that applying t-SNE to learned sequence representations could be useful for sequence analysis. Our model is freely available as a web server (https://www.gpea.uem.br/ncypred/).
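The abstract gives no implementation details, so the following is only a minimal sketch of two generic building blocks that an attention-based sequence classifier of this kind typically relies on: turning a nucleotide string into fixed-length integer tokens, and pooling per-position hidden states with softmax attention weights. The vocabulary, the padding scheme, and the function names `encode` and `attention_pool` are assumptions made for illustration and are not taken from NCYPred.

```python
import math

# Hypothetical vocabulary (an assumption, not from the paper):
# 0 is reserved for padding, 5 stands for any other symbol such as N.
VOCAB = {"A": 1, "C": 2, "G": 3, "U": 4}
PAD, UNKNOWN = 0, 5


def encode(seq, max_len):
    """Map a nucleotide string to a fixed-length list of integer tokens."""
    seq = seq.upper().replace("T", "U")  # treat DNA-style input as RNA
    tokens = [VOCAB.get(base, UNKNOWN) for base in seq[:max_len]]
    return tokens + [PAD] * (max_len - len(tokens))


def attention_pool(states, scores):
    """Softmax the per-position scores and return (weights, weighted sum).

    states: one hidden vector per sequence position, e.g. the concatenated
            forward and backward LSTM outputs.
    scores: one unnormalized attention score per position.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(states[0])
    pooled = [sum(w * h[i] for w, h in zip(weights, states)) for i in range(dim)]
    return weights, pooled
```

In a full model, the tokens would feed an embedding layer and a bidirectional LSTM, the scores would come from a small learned layer, and the pooled vector would go to a softmax classifier over the sncRNA classes. A pooled vector of this kind is also the sort of learned sequence representation that could be projected with t-SNE.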
Pages: 557-565
Number of pages: 9