Nmix: a hybrid deep learning model for precise prediction of 2'-O-methylation sites based on multi-feature fusion and ensemble learning

Authors
Geng, Yu-Qing [1]
Lai, Fei-Liao [1]
Luo, Hao [1]
Gao, Feng [1,2,3,4]
Affiliations
[1] Tianjin Univ, Sch Sci, Dept Phys, 92 Weijin Rd, Tianjin 300072, Peoples R China
[2] Tianjin Univ, Frontiers Sci Ctr Synthet Biol, 92 Weijin Rd, Tianjin 300072, Peoples R China
[3] Tianjin Univ, Key Lab Syst Bioengn, Minist Educ, 92 Weijin Rd, Tianjin 300072, Peoples R China
[4] Collaborat Innovat Ctr Chem Sci & Engn Tianjin, SynBio Res Platform, 92 Weijin Rd, Tianjin 300072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
2'-O-methylation; multi-feature fusion; deep learning; asymmetric loss; ensemble learning; HIGH-THROUGHPUT; MESSENGER-RNA; IDENTIFICATION; LANDSCAPE;
DOI
10.1093/bib/bbae601
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
RNA 2'-O-methylation (Nm) is a crucial post-transcriptional modification with significant biological implications. However, experimental identification of Nm sites is challenging and resource-intensive. While multiple computational tools have been developed to identify Nm sites, their predictive performance, particularly in terms of precision and generalization capability, remains deficient. We introduced Nmix, an advanced computational tool for precise prediction of Nm sites in human RNA. We constructed the largest low-redundancy dataset of experimentally verified Nm sites and employed an innovative multi-feature fusion approach that combines one-hot, Z-curve and RNA secondary structure encoding. Nmix utilizes a meticulously designed hybrid deep learning architecture that integrates 1D/2D convolutional neural networks, a self-attention mechanism and residual connections. We implemented an asymmetric loss function and Bayesian optimization-based ensemble learning, substantially improving predictive performance on imbalanced datasets. Rigorous testing on two benchmark datasets revealed that Nmix significantly outperforms existing state-of-the-art methods across various metrics, particularly in precision, with average improvements of 33.1% and 60.0%, and in Matthews correlation coefficient, with average improvements of 24.7% and 51.1%. Notably, Nmix demonstrated exceptional cross-species generalization capability, accurately predicting 93.8% of experimentally verified Nm sites in rat RNA. We also developed a user-friendly web server (https://tubic.org/Nm) and provided standalone prediction scripts to facilitate widespread adoption. We hope that by providing a more accurate and robust tool for Nm site prediction, we can contribute to advancing our understanding of Nm mechanisms and potentially benefit the prediction of other RNA modification sites.
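To make the multi-feature fusion idea in the abstract more concrete, the following is a minimal sketch of how a one-hot encoding and a cumulative Z-curve encoding of an RNA window could be computed and concatenated per position. It is an illustration only, not the Nmix implementation: the function names (`one_hot`, `z_curve`, `fuse`), the base ordering, the handling of unknown bases, and the choice of the cumulative form of the Z-curve are all assumptions, and the secondary-structure channel is omitted.

```python
from itertools import accumulate

BASES = "ACGU"  # assumed column order for the one-hot channel

# Per-base steps along the three Z-curve axes:
#   x: purine (A, G) vs pyrimidine (C, U)
#   y: amino (A, C) vs keto (G, U)
#   z: weak H-bond (A, U) vs strong H-bond (G, C)
Z_STEP = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "U": (-1, -1, 1),
}


def normalize(seq):
    """Upper-case the sequence and map DNA-style T to U."""
    return seq.upper().replace("T", "U")


def one_hot(seq):
    """One 4-element indicator row per nucleotide; unknown bases give all zeros."""
    return [[1 if base == b else 0 for b in BASES] for base in normalize(seq)]


def z_curve(seq):
    """Cumulative Z-curve coordinates (x_n, y_n, z_n) for each position n."""
    # Unknown bases (e.g. N) contribute no step; this is an assumption.
    steps = [Z_STEP.get(base, (0, 0, 0)) for base in normalize(seq)]
    xs = accumulate(s[0] for s in steps)
    ys = accumulate(s[1] for s in steps)
    zs = accumulate(s[2] for s in steps)
    return list(zip(xs, ys, zs))


def fuse(seq):
    """Concatenate both encodings into one 7-element feature row per position."""
    return [oh + list(zc) for oh, zc in zip(one_hot(seq), z_curve(seq))]


if __name__ == "__main__":
    for row in fuse("ACGU"):
        print(row)
```

Each position yields a 7-element row (4 one-hot values followed by 3 Z-curve coordinates), so a window of length L becomes an L x 7 matrix that a convolutional model could consume. How Nmix actually arranges and fuses its feature channels is described in the paper itself and may differ from this sketch.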
Pages: 14