EMDLP: Ensemble multiscale deep learning model for RNA methylation site prediction

Cited by: 15
Authors
Wang, Honglei [1, 2, 3]
Liu, Hui [1, 2]
Huang, Tao [2]
Li, Gangshen [1, 2]
Zhang, Lin [1, 2]
Sun, Yanjing [1, 2]
Affiliations
[1] China Univ Min & Technol, Engn Res Ctr Intelligent Control Underground Spac, Minist Educ, Xuzhou 221116, Jiangsu, Peoples R China
[2] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Jiangsu, Peoples R China
[3] Xuzhou Coll Ind Technol, Sch Informat Engn, Xuzhou 221400, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
RNA modification site; Deep learning; Natural language processing; Predictor; N-1-METHYLADENOSINE; N-6-METHYLADENOSINE; LANDSCAPE; RMBASE;
DOI
10.1186/s12859-022-04756-1
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Recent research suggests that epi-transcriptome regulation through post-transcriptional RNA modifications is essential for all types of RNA. Accurate identification of RNA modifications is vital for understanding their functions and regulatory mechanisms. However, traditional experimental methods for identifying RNA modification sites are relatively complicated, time-consuming, and laborious. Machine learning approaches have been applied to RNA sequence feature extraction and classification, and these computational methods can supplement experimental approaches more efficiently. Recently, convolutional neural networks (CNN) and long short-term memory (LSTM) networks have achieved good results in modification site prediction owing to their strength in representation learning. However, a CNN can learn local responses from spatial data but cannot learn sequential correlations, whereas an LSTM is specialized for sequential modeling and can access contextual representations but lacks the spatial feature extraction of a CNN. These complementary limitations motivate a prediction framework that combines natural language processing (NLP) and deep learning (DL).
Results: This study presents an ensemble multiscale deep learning predictor (EMDLP) that identifies RNA methylation sites using NLP and DL. It combines dilated convolution with bidirectional LSTM (BiLSTM) to make better use of local and global information for site prediction. The first step of EMDLP is to represent the RNA sequences in an NLP manner. Three encodings are adopted to characterize sites from the viewpoints of local and global information: RNA word embedding, one-hot encoding, and RGloVe, an improved word vector representation learning method based on GloVe.
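As a rough illustration of two of the encodings named above, the sketch below one-hot encodes an RNA window and splits it into overlapping k-mers that could serve as "words" for an embedding. The alphabet order, the choice k = 3, the handling of unknown symbols, and the example sequence are assumptions made for illustration and are not taken from the abstract.

```python
# Illustrative sketch only; ALPHABET order, k, and the example window are assumptions.
ALPHABET = "ACGU"


def one_hot(seq):
    """Map each nucleotide to a 4-dimensional indicator vector."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:  # unknown symbols (e.g. N) stay all-zero
            row[index[base]] = 1
        encoded.append(row)
    return encoded


def kmer_tokens(seq, k=3):
    """Split a sequence into overlapping k-mers, the 'words' fed to an embedding."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


window = "GGACU"  # hypothetical sequence window around a candidate site
print(one_hot(window))
print(kmer_tokens(window))
```

One-hot vectors describe each position in isolation, while k-mer tokens let an embedding layer learn representations that reflect neighboring nucleotides.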
Then, a dilated convolutional bidirectional LSTM network (DCB) model is constructed, in which a dilated convolutional neural network (DCNN) is followed by a BiLSTM, to extract potentially contributing features for methylation site prediction. Finally, the predictions obtained from the three encoding methods are integrated by soft voting to obtain better predictive performance. Experimental results on m1A and m6A show that EMDLP achieves an area under the receiver operating characteristic curve (AUROC) of 95.56% and 85.24%, respectively, outperforming state-of-the-art models. For user convenience, a web server for EMDLP is publicly available at http://www.labiip.net/EMDLP/index.php (http://47.104.130.81/EMDLP/index.php).
Conclusions: We developed a predictor for m1A and m6A methylation sites.
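The soft-voting step described in the Results can be sketched as averaging the positive-class probabilities produced by the three encoding-specific models. The equal weights, the 0.5 decision threshold, and the probability values below are hypothetical and are not reported in the abstract.

```python
# Minimal soft-voting sketch; weights, threshold, and probabilities are assumptions.
def soft_vote(prob_lists, weights=None):
    """Combine per-sample positive-class probabilities from several models."""
    n_models = len(prob_lists)
    if weights is None:
        weights = [1.0 / n_models] * n_models  # assumed: equal weighting
    n_samples = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists))
        for i in range(n_samples)
    ]


# Hypothetical outputs of three models, one per encoding, on three candidate sites.
p_embedding = [0.90, 0.40, 0.20]
p_one_hot = [0.80, 0.70, 0.10]
p_rglove = [0.70, 0.60, 0.30]

ensemble = soft_vote([p_embedding, p_one_hot, p_rglove])
labels = [int(p >= 0.5) for p in ensemble]  # assumed threshold of 0.5
print([round(p, 3) for p in ensemble], labels)
```

Unlike hard (majority) voting, soft voting keeps each model's confidence, so one confident model can outweigh two uncertain ones.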
Pages: 22
Source
BMC Bioinformatics, 23