RNAlight: a machine learning model to identify nucleotide features determining RNA subcellular localization

Cited by: 21
Authors
Yuan, Guo-Hua [1]
Wang, Ying [1]
Wang, Guang-Zhong [1]
Yang, Li
Affiliations
[1] Chinese Acad Sci, Shanghai Inst Nutr & Hlth, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
RNA localization; machine learning; nucleotide feature; motif; RNA binding protein; circular RNA; long noncoding RNAs; messenger RNA; transcription; mechanisms; repeats
DOI
10.1093/bib/bbac509
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
Different RNAs have distinct subcellular localizations. However, nucleotide features that determine these distinct distributions of lncRNAs and mRNAs have yet to be fully addressed. Here, we develop RNAlight, a machine learning model based on LightGBM, to identify nucleotide k-mers contributing to the subcellular localizations of mRNAs and lncRNAs. With the Tree SHAP algorithm, RNAlight extracts nucleotide features for cytoplasmic or nuclear localization of RNAs, indicating the sequence basis for distinct RNA subcellular localizations. By assembling k-mers to sequence features and subsequently mapping to known RBP-associated motifs, different types of sequence features and their associated RBPs were additionally uncovered for lncRNAs and mRNAs with distinct subcellular localizations. Finally, we extended RNAlight to precisely predict the subcellular localizations of other types of RNAs, including snRNAs, snoRNAs and different circular RNA transcripts, suggesting the generality of using RNAlight for RNA subcellular localization prediction.
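The abstract describes using nucleotide k-mers as model features. As a rough illustration only, the sketch below turns a sequence into a normalized k-mer frequency vector of the kind that could be passed to a gradient-boosted tree classifier such as LightGBM and then examined with Tree SHAP. It is not the authors' implementation: the function name `kmer_frequencies`, the choice of k values, and the U-to-T normalization are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k_values=(1, 2, 3)):
    """Return normalized k-mer frequencies for a nucleotide sequence.

    Hypothetical helper: the k range and normalization are illustrative
    assumptions, not parameters taken from the RNAlight paper.
    """
    # Treat RNA and DNA alphabets uniformly by mapping U to T.
    seq = seq.upper().replace("U", "T")
    features = {}
    for k in k_values:
        # Slide a window of width k across the sequence.
        windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        # Ignore windows containing ambiguous bases such as N.
        counts = Counter(w for w in windows if set(w) <= set("ACGT"))
        total = sum(counts.values())
        # Emit every possible k-mer so all sequences share one feature space.
        for kmer in map("".join, product("ACGT", repeat=k)):
            features[kmer] = counts[kmer] / total if total else 0.0
    return features


feats = kmer_frequencies("AUGGCUAUGG", k_values=(1, 2))
```

Emitting every possible k-mer, including those with zero counts, keeps the feature vector the same length for every transcript, which is what a tabular classifier expects.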
Pages: 13