NmSEER V2.0: a prediction tool for 2′-O-methylation sites based on random forest and multi-encoding combination

Cited by: 20
Authors
Zhou, Yiran [1]
Cui, Qinghua [1,2]
Zhou, Yuan [1]
Affiliations
[1] Peking Univ, MOE Key Lab Cardiovasc Sci, Ctr Noncoding RNA Med,Sch Basic Med Sci, Dept Physiol & Pathophysiol,Dept Biomed Informat, 38 Xueyuan Rd, Beijing 100191, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Life Sci & Technol, Minist Educ, Ctr Bioinformat,Key Lab Neuroinformat, Chengdu 610054, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
2'-O-methylation; Nm site; Random forest; RNA modification; Functional site prediction; PROTEIN-PROTEIN INTERACTIONS; RNA 2'-O-METHYLATION; MUTATIONS; EVOLUTION; DATABASE;
DOI
10.1186/s12859-019-3265-8
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Background: 2'-O-methylation (2'-O-me or Nm) is a post-transcriptional RNA modification in which the 2'-hydroxyl group of the ribose is methylated; it is common in mRNAs and various non-coding RNAs. Previous studies have revealed the significance of Nm in multiple biological processes. As Nm attracted increasing attention, a technique termed Nm-seq was developed to profile Nm sites, mainly in mRNA, with single-nucleotide resolution and high sensitivity. In a recent work based on Nm-seq data, we reported an in silico method that predicts Nm sites from nucleotide sequence information, and established an online server named NmSEER. More recently, a higher-confidence dataset produced by refined Nm-seq became available. Therefore, in this work, we redesigned the prediction model to achieve more robust performance on the new data.
Results: We redesigned the prediction model from two perspectives: the machine learning algorithm and the combination of multiple encoding schemes. After optimization by 5-fold cross-validation and evaluation on an independent test set, random forest was selected as the most robust algorithm. One-hot encoding, the position-specific dinucleotide sequence profile, and K-nucleotide frequency encoding were combined to build the final predictor.
Conclusions: The updated predictor, named NmSEER V2.0, achieves accurate prediction performance (AUROC = 0.862) and has been deployed on a new server, freely available at http://www.rnanut.net/nmseerv2/.
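Two of the encodings named in the abstract, one-hot encoding and K-nucleotide frequency, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the window length, the value of K, or how the encodings are concatenated, so the function names (`one_hot`, `knf`, `encode`), the example window `GGACU`, and `k=2` are all assumptions chosen for illustration. The position-specific dinucleotide sequence profile is omitted because it depends on statistics from the training set, which the abstract does not describe.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; other symbols are treated as unknown


def one_hot(seq):
    """Return 4 binary values per nucleotide; unknown bases become all zeros."""
    vec = []
    for base in seq.upper():
        vec.extend(1 if base == ref else 0 for ref in ALPHABET)
    return vec


def knf(seq, k=2):
    """Return the frequency of every possible k-mer, ordered lexicographically over ACGU."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1  # number of overlapping k-mers in the window
    for i in range(total):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing unknown bases
            counts[kmer] += 1
    denom = max(total, 1)  # avoid division by zero for very short windows
    return [counts[m] / denom for m in kmers]


def encode(seq, k=2):
    """Concatenate the two encodings into one feature vector (hypothetical layout)."""
    return one_hot(seq) + knf(seq, k)


window = "GGACU"  # illustrative window, not taken from the paper
features = encode(window)
print(len(features))
```

A feature vector built this way could be passed to any classifier that accepts fixed-length numeric input, such as a random forest; the hyperparameters used in NmSEER V2.0 are not given in the abstract.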
Pages: 9
Related papers
4 results
  • [1] NmSEER V2.0: a prediction tool for 2′-O-methylation sites based on random forest and multi-encoding combination
    Yiran Zhou
    Qinghua Cui
    Yuan Zhou
    BMC Bioinformatics, 20
  • [2] NmSEER: A Prediction Tool for 2′-O-Methylation (Nm) Sites Based on Random Forest
    Zhou, Yiran
    Cui, Qinghua
    Zhou, Yuan
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, PT I, 2018, 10954 : 893 - 900
  • [3] Nmix: a hybrid deep learning model for precise prediction of 2'-O-methylation sites based on multi-feature fusion and ensemble learning
    Geng, Yu-Qing
    Lai, Fei-Liao
    Luo, Hao
    Gao, Feng
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (06)
  • [4] Meta-2OM: A multi-classifier meta-model for the accurate prediction of RNA 2′-O-methylation sites in human RNA
    Harun-Or-Roshid, Md.
    Pham, Nhat Truong
    Manavalan, Balachandran
    Kurata, Hiroyuki
    PLOS ONE, 2024, 19 (06):