Nmix: a hybrid deep learning model for precise prediction of 2'-O-methylation sites based on multi-feature fusion and ensemble learning

Cited by: 0
Authors
Geng, Yu-Qing [1]
Lai, Fei-Liao [1]
Luo, Hao [1]
Gao, Feng [1, 2, 3, 4]
Affiliations
[1] Tianjin Univ, Sch Sci, Dept Phys, 92 Weijin Rd, Tianjin 300072, Peoples R China
[2] Tianjin Univ, Frontiers Sci Ctr Synthet Biol, 92 Weijin Rd, Tianjin 300072, Peoples R China
[3] Tianjin Univ, Key Lab Syst Bioengn, Minist Educ, 92 Weijin Rd, Tianjin 300072, Peoples R China
[4] Collaborat Innovat Ctr Chem Sci & Engn Tianjin, SynBio Res Platform, 92 Weijin Rd, Tianjin 300072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
2'-O-methylation; multi-feature fusion; deep learning; asymmetric loss; ensemble learning; HIGH-THROUGHPUT; MESSENGER-RNA; IDENTIFICATION; LANDSCAPE;
DOI
10.1093/bib/bbae601
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
RNA 2'-O-methylation (Nm) is a crucial post-transcriptional modification with significant biological implications. However, experimental identification of Nm sites is challenging and resource-intensive. Although multiple computational tools have been developed to identify Nm sites, their predictive performance, particularly in precision and generalization capability, remains deficient. We introduce Nmix, a computational tool for precise prediction of Nm sites in human RNA. We constructed the largest low-redundancy dataset of experimentally verified Nm sites and employed a multi-feature fusion approach that combines one-hot, Z-curve and RNA secondary structure encodings. Nmix uses a hybrid deep learning architecture that integrates 1D/2D convolutional neural networks, a self-attention mechanism and residual connections. We implemented an asymmetric loss function and Bayesian optimization-based ensemble learning, substantially improving predictive performance on imbalanced datasets. Testing on two benchmark datasets showed that Nmix significantly outperforms existing state-of-the-art methods across various metrics, particularly in precision, with average improvements of 33.1% and 60.0%, and in Matthews correlation coefficient, with average improvements of 24.7% and 51.1%. Notably, Nmix demonstrated strong cross-species generalization, correctly predicting 93.8% of experimentally verified Nm sites in rat RNA. We also developed a user-friendly web server (https://tubic.org/Nm) and provide standalone prediction scripts to facilitate adoption. We hope that a more accurate and robust tool for Nm site prediction will advance the understanding of Nm mechanisms and potentially benefit the prediction of other RNA modification sites.
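To make two of the encodings named in the abstract concrete, the following is a minimal Python sketch of one-hot and Z-curve encoding for an RNA sequence. It is an illustration under stated assumptions, not the Nmix implementation: the function names `one_hot` and `z_curve` are hypothetical, the A, C, G, U column order is an arbitrary choice, and the Z-curve shown is the standard cumulative form, which may differ from the variant used in the paper. The secondary structure encoding is omitted because it requires an external folding tool.

```python
from typing import List, Tuple

BASES = "ACGU"  # assumed column order; the paper's order is not given here


def one_hot(seq: str) -> List[List[int]]:
    """Return one 4-element indicator row per nucleotide (A, C, G, U).

    Unknown symbols such as N map to an all-zero row.
    """
    seq = seq.upper().replace("T", "U")
    return [[1 if base == b else 0 for b in BASES] for base in seq]


def z_curve(seq: str) -> List[Tuple[int, int, int]]:
    """Return the cumulative Z-curve point (x, y, z) after each position.

    x: purine (A, G) minus pyrimidine (C, U) counts
    y: amino (A, C) minus keto (G, U) counts
    z: weak hydrogen bond (A, U) minus strong hydrogen bond (G, C) counts
    """
    seq = seq.upper().replace("T", "U")
    counts = {b: 0 for b in BASES}
    points = []
    for base in seq:
        if base in counts:  # unknown symbols leave the counts unchanged
            counts[base] += 1
        a, c, g, u = (counts[b] for b in BASES)
        points.append(((a + g) - (c + u), (a + c) - (g + u), (a + u) - (g + c)))
    return points
```

For a fixed-length window centred on a candidate site, both functions return one row per nucleotide, so their outputs can be concatenated column-wise into a single feature matrix before being passed to a model.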
Pages: 14