MSLP: mRNA subcellular localization predictor based on machine learning techniques

Cited by: 9
Authors
Musleh, Saleh [1]
Islam, Mohammad Tariqul [2]
Qureshi, Rizwan [1]
Alajez, Nihad [3, 4]
Alam, Tanvir [1]
Affiliations
[1] Hamad Bin Khalifa Univ, Coll Sci & Engn, Doha, Qatar
[2] Southern Connecticut State Univ, Comp Sci Dept, New Haven, CT USA
[3] Hamad Bin Khalifa Univ, Qatar Biomed Res Inst QBRI, Translat Canc & Immun Ctr TC, Doha, Qatar
[4] Hamad Bin Khalifa Univ, Coll Hlth & Life Sci, Doha, Qatar
Keywords
RNA; mRNA; Machine learning; Sequence analysis; Localization prediction; Subcellular localization; NERVOUS-SYSTEM; RNALOCATE; SEQUENCES; RESOURCE
DOI
10.1186/s12859-023-05232-0
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Subcellular localization of messenger RNAs (mRNAs) plays a pivotal role in the regulation of gene expression, in cell migration, and in cellular adaptation. Experimental techniques for pinpointing the subcellular localization of mRNAs are laborious, time-consuming, and expensive. In silico approaches for this purpose are therefore attracting considerable attention in the RNA community. Methods: In this article, we propose MSLP, a machine learning-based method to predict the subcellular localization of mRNAs. MSLP uses a novel combination of four types of features as input to machine learning algorithms: k-mer composition, pseudo k-tuple nucleotide composition (PseKNC), physicochemical properties of nucleotides, and a 3D representation of sequences based on the Z-curve transformation. Results: Using the combination of the above-mentioned features, ensemble-based models achieved state-of-the-art results in mRNA subcellular localization prediction tasks on multiple benchmark datasets. We evaluated the performance of our method in ten subcellular locations: cytoplasm, nucleus, endoplasmic reticulum (ER), extracellular region (ExR), mitochondria, cytosol, pseudopodium, posterior, exosome, and ribosome. An ablation study highlighted k-mer and PseKNC features as more dominant than the other features for predicting cytoplasm, nucleus, and ER localizations. On the other hand, physicochemical properties and Z-curve-based features contributed the most to ExR and mitochondria detection. SHAP-based analysis revealed the relative importance of the features, providing better insight into the proposed approach. Availability: We have implemented a Docker container and an API for end users to run their sequences on our model. The datasets, the API code, and the Docker container are shared with the community on GitHub at: https://github.com/smusleh/MSLP.
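Two of the feature types named in the abstract, k-mer composition and the Z-curve transformation, can be illustrated with a minimal Python sketch. This is not the MSLP implementation: the function names `kmer_frequencies` and `z_curve`, the choice of k, and the handling of T as U are assumptions made for illustration only. The sketch uses the standard Z-curve definition, in which each coordinate is a difference of cumulative base counts.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; T is mapped to U below (an assumption)


def kmer_frequencies(seq, k=3):
    """Return the relative frequency of every possible k-mer in seq."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers made of A, C, G, U are counted toward the total.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def z_curve(seq):
    """Return the Z-curve as a list of (x, y, z) points, one per position.

    x: purine (A, G) minus pyrimidine (C, U) cumulative counts
    y: amino (A, C) minus keto (G, U) cumulative counts
    z: weak H-bond (A, U) minus strong H-bond (C, G) cumulative counts
    """
    seq = seq.upper().replace("T", "U")
    a = c = g = u = 0
    points = []
    for base in seq:
        a += base == "A"
        c += base == "C"
        g += base == "G"
        u += base == "U"
        points.append(((a + g) - (c + u), (a + c) - (g + u), (a + u) - (c + g)))
    return points
```

For example, `kmer_frequencies(seq, k=3)` yields a 64-dimensional feature dictionary, and the final point of `z_curve(seq)` summarizes the overall base composition of the sequence. How MSLP itself derives fixed-length features from these representations is not stated in this record.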
Pages: 23
Related papers
  • [1] MSLP: mRNA subcellular localization predictor based on machine learning techniques
    Saleh Musleh
    Mohammad Tariqul Islam
    Rizwan Qureshi
    Nehad M. Alajez
    Tanvir Alam
    BMC Bioinformatics, 24
  • [2] Correction: MSLP: mRNA subcellular localization predictor based on machine learning techniques
    Saleh Musleh
    Mohammad Tariqul Islam
    Rizwan Qureshi
    Nehad M. Alajez
    Tanvir Alam
    BMC Bioinformatics, 24
  • [3] MSLP: mRNA subcellular localization predictor based on machine learning techniques (vol 24, 109, 2023)
    Musleh, Saleh
    Islam, Mohammad Tariqul
    Qureshi, Rizwan
    Alajez, Nehad M. M.
    Alam, Tanvir
    BMC BIOINFORMATICS, 2023, 24 (01)
  • [4] Unified mRNA Subcellular Localization Predictor based on machine learning techniques
    Musleh, Saleh
    Arif, Muhammad
    Alajez, Nehad M.
    Alam, Tanvir
    BMC GENOMICS, 2024, 25 (01)
  • [5] SubLocEP: a novel ensemble predictor of subcellular localization of eukaryotic mRNA based on machine learning
    Li, Jing
    Zhang, Lichao
    He, Shida
    Guo, Fei
    Zou, Quan
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (05)
  • [6] DeepmRNALoc: A Novel Predictor of Eukaryotic mRNA Subcellular Localization Based on Deep Learning
    Wang, Shihang
    Shen, Zhehan
    Liu, Taigang
    Long, Wei
    Jiang, Linhua
    Peng, Sihua
    MOLECULES, 2023, 28 (05)
  • [7] mRNALoc: a novel machine-learning based in-silico tool to predict mRNA subcellular localization
    Garg, Anjali
    Singhal, Neelja
    Kumar, Ravindra
    Kumar, Manish
    NUCLEIC ACIDS RESEARCH, 2020, 48 (W1): W239-W243
  • [8] Prediction of subcellular localization of proteins using machine learning techniques and evolutionary information
    Raghava, G. P. S.
    AMINO ACIDS, 2007, 33 (03): X-XI
  • [9] Extreme Learning Machine Based Bacterial Protein Subcellular Localization Prediction
    Lan, Yuan
    Soh, Yeng Chai
    Huang, Guang-Bin
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008: 1859-1863
  • [10] Prediction of Protein Subcellular Localization using Machine Learning
    Upama, Paramita Basak
    Akhter, Shahin
    Bin Asad, Mohammad Imam Hasan
    2018 4TH INTERNATIONAL CONFERENCE FOR CONVERGENCE IN TECHNOLOGY (I2CT), 2018