MSLP: mRNA subcellular localization predictor based on machine learning techniques

Cited by: 9
Authors
Musleh, Saleh [1]
Islam, Mohammad Tariqul [2]
Qureshi, Rizwan [1]
Alajez, Nihad [3, 4]
Alam, Tanvir [1]
Affiliations
[1] Hamad Bin Khalifa Univ, Coll Sci & Engn, Doha, Qatar
[2] Southern Connecticut State Univ, Comp Sci Dept, New Haven, CT USA
[3] Hamad Bin Khalifa Univ, Qatar Biomed Res Inst QBRI, Translat Canc & Immun Ctr TC, Doha, Qatar
[4] Hamad Bin Khalifa Univ, Coll Hlth & Life Sci, Doha, Qatar
Keywords
RNA; mRNA; Machine learning; Sequence analysis; Localization prediction; Subcellular localization; NERVOUS-SYSTEM; RNALOCATE; SEQUENCES; RESOURCE
DOI
10.1186/s12859-023-05232-0
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Subcellular localization of messenger RNAs (mRNAs) plays a pivotal role in the regulation of gene expression, in cell migration, and in cellular adaptation. Experimental techniques for pinpointing the subcellular localization of mRNAs are laborious, time-consuming, and expensive. In silico approaches for this purpose are therefore attracting considerable attention in the RNA community.
Methods: In this article, we propose MSLP, a machine learning-based method to predict the subcellular localization of mRNAs. We propose a novel combination of four types of features: k-mer composition, pseudo k-tuple nucleotide composition (PseKNC), physicochemical properties of nucleotides, and a 3D representation of sequences based on the Z-curve transformation. These features are fed into machine learning algorithms to predict the subcellular localization of mRNAs.
Results: Using the combination of the above-mentioned features, ensemble-based models achieved state-of-the-art results in mRNA subcellular localization prediction tasks on multiple benchmark datasets. We evaluated the performance of our method on ten subcellular locations: cytoplasm, nucleus, endoplasmic reticulum (ER), extracellular region (ExR), mitochondria, cytosol, pseudopodium, posterior, exosome, and ribosome. An ablation study showed that k-mer and PseKNC features were more dominant than the other features for predicting cytoplasm, nucleus, and ER localizations. In contrast, physicochemical properties and Z-curve-based features contributed the most to ExR and mitochondria detection. SHAP-based analysis revealed the relative importance of the features and provided better insight into the proposed approach.
Availability: We have implemented a Docker container and an API for end users to run their sequences on our model. The datasets, the API code, and the Docker container are shared with the community on GitHub at: https://github.com/smusleh/MSLP.
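Two of the feature types named in the abstract, k-mer composition and the Z-curve transformation, can be illustrated with a minimal Python sketch. This is not the MSLP implementation: the function names `kmer_frequencies` and `z_curve` are hypothetical, the choice of k = 3 is an assumption, and the Z-curve shown is the simple whole-sequence, frequency-based variant, which may differ from the formulation used in the paper.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGT"


def kmer_frequencies(seq, k=3):
    """Return normalized k-mer frequencies in lexicographic k-mer order.

    Hypothetical helper for illustration; k=3 is an assumed default.
    RNA input is accepted by mapping U to T. Windows containing
    characters other than A, C, G, T are ignored.
    """
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def z_curve(seq):
    """Return the three frequency-based Z-curve components (x, y, z).

    x: purine (A, G) versus pyrimidine (C, T)
    y: amino (A, C) versus keto (G, T)
    z: weak hydrogen bond (A, T) versus strong hydrogen bond (G, C)
    """
    seq = seq.upper().replace("U", "T")
    n = len(seq)
    if n == 0:
        return (0.0, 0.0, 0.0)
    a, c, g, t = (seq.count(base) / n for base in NUCLEOTIDES)
    x = (a + g) - (c + t)
    y = (a + c) - (g + t)
    z = (a + t) - (g + c)
    return (x, y, z)


if __name__ == "__main__":
    example = "AUGGCCAUUGUAAUGGGCCGC"  # made-up sequence for illustration
    features = kmer_frequencies(example, k=3) + list(z_curve(example))
    print(len(features))  # 64 k-mer frequencies plus 3 Z-curve components
```

Concatenating such vectors gives a fixed-length numeric representation of a variable-length sequence, which is the form a conventional classifier expects as input.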
Pages: 23