RNAlight: a machine learning model to identify nucleotide features determining RNA subcellular localization

Cited by: 21
Authors
Yuan, Guo-Hua [1]
Wang, Ying [1]
Wang, Guang-Zhong [1]
Yang, Li
Affiliations
[1] Chinese Acad Sci, Shanghai Inst Nutr & Hlth, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
RNA localization; machine learning; nucleotide feature; motif; RNA binding protein; circular RNA; LONG NONCODING RNAS; MESSENGER-RNA; TRANSCRIPTION; MECHANISMS; REPEATS;
DOI
10.1093/bib/bbac509
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Different RNAs have distinct subcellular localizations. However, nucleotide features that determine these distinct distributions of lncRNAs and mRNAs have yet to be fully addressed. Here, we develop RNAlight, a machine learning model based on LightGBM, to identify nucleotide k-mers contributing to the subcellular localizations of mRNAs and lncRNAs. With the Tree SHAP algorithm, RNAlight extracts nucleotide features for cytoplasmic or nuclear localization of RNAs, indicating the sequence basis for distinct RNA subcellular localizations. By assembling k-mers to sequence features and subsequently mapping to known RBP-associated motifs, different types of sequence features and their associated RBPs were additionally uncovered for lncRNAs and mRNAs with distinct subcellular localizations. Finally, we extended RNAlight to precisely predict the subcellular localizations of other types of RNAs, including snRNAs, snoRNAs and different circular RNA transcripts, suggesting the generality of using RNAlight for RNA subcellular localization prediction.
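The abstract describes nucleotide k-mers as the model's input features. A minimal sketch of how such a feature vector could be computed from a sequence follows. The function names, the choice of k = 1 to 3, and the frequency normalization are assumptions for illustration and are not taken from the paper. In a pipeline like the one described, these vectors would then be passed to a gradient-boosted tree classifier such as LightGBM and attributed with Tree SHAP; that step is not shown here.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k):
    """Return normalized frequencies of all 4**k k-mers over the ACGT alphabet.

    Hypothetical helper: RNA 'U' is mapped to 'T', and windows containing
    other symbols (e.g. 'N') are ignored in both counts and the total.
    """
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def featurize(seq, ks=(1, 2, 3)):
    """Concatenate k-mer frequency features for each k in ks (assumed range)."""
    features = {}
    for k in ks:
        features.update(kmer_frequencies(seq, k))
    return features


if __name__ == "__main__":
    vec = featurize("ACGUACGGAUCC")
    print(len(vec), "features")
```

Keys of different lengths never collide, so the resulting dictionary can be converted directly into a fixed-order feature row for any tabular classifier.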
Pages: 13