EMDLP: Ensemble multiscale deep learning model for RNA methylation site prediction

Cited by: 15
Authors
Wang, Honglei [1, 2, 3]
Liu, Hui [1, 2]
Huang, Tao [2]
Li, Gangshen [1, 2]
Zhang, Lin [1, 2]
Sun, Yanjing [1, 2]
Affiliations
[1] China Univ Min & Technol, Engn Res Ctr Intelligent Control Underground Spac, Minist Educ, Xuzhou 221116, Jiangsu, Peoples R China
[2] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Jiangsu, Peoples R China
[3] Xuzhou Coll Ind Technol, Sch Informat Engn, Xuzhou 221400, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
RNA modification site; Deep learning; Natural language processing; Predictor; N-1-METHYLADENOSINE; N-6-METHYLADENOSINE; LANDSCAPE; RMBASE;
DOI
10.1186/s12859-022-04756-1
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Recent research suggests that epitranscriptome regulation through post-transcriptional RNA modifications is essential for all kinds of RNA. Accurate identification of RNA modifications is vital for understanding their functions and regulatory mechanisms. However, traditional experimental methods for identifying RNA modification sites are relatively complicated, time-consuming, and laborious. Machine learning approaches have been applied to extract and classify RNA sequence features computationally, and they can supplement experimental approaches efficiently. Recently, convolutional neural networks (CNNs) and long short-term memory (LSTM) networks have achieved good results in modification site prediction owing to their strength in representation learning. However, a CNN can learn local responses from spatial data but cannot learn sequential correlations, whereas an LSTM is specialized for sequential modeling and can access contextual representations but lacks the spatial feature extraction of a CNN. These complementary limitations motivate a prediction framework that combines natural language processing (NLP) and deep learning (DL).
Results: This study presents an ensemble multiscale deep learning predictor (EMDLP) that identifies RNA methylation sites using NLP and DL. It combines dilated convolution with bidirectional LSTM (BiLSTM) to make better use of local and global information for site prediction. The first step of EMDLP is to represent RNA sequences in an NLP manner. Three encodings are adopted, namely RNA word embedding, one-hot encoding, and RGloVe (an improved word-vector representation learning method based on GloVe), to characterize sites from the viewpoints of local and global information. Then a dilated convolutional bidirectional LSTM network (DCB) model is constructed, in which a dilated convolutional neural network (DCNN) is followed by a BiLSTM, to extract features that potentially contribute to methylation site prediction. Finally, the predictions from the three encoding methods are integrated by soft voting to obtain better predictive performance. Experimental results on m(1)A and m(6)A show that EMDLP achieves an area under the receiver operating characteristic curve (AUROC) of 95.56% and 85.24%, respectively, outperforming the state-of-the-art models. A user-friendly web server for EMDLP is publicly available at http://www.labiip.net/EMDLP/index.php (http://47.104.130.81/EMDLP/index.php).
Conclusions: We developed a predictor for m(1)A and m(6)A methylation sites.
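Two of the steps named in the abstract, one-hot encoding of an RNA sequence and soft voting over per-encoding predictions, can be illustrated with a minimal sketch. This is not the authors' implementation: the nucleotide order `ACGU`, the helper names `one_hot_encode` and `soft_vote`, the unweighted average, the 0.5 decision threshold, and all probability values are assumptions made for illustration only.

```python
import numpy as np

ALPHABET = "ACGU"  # assumed nucleotide order; the abstract does not specify one


def one_hot_encode(seq: str) -> np.ndarray:
    """Return a (len(seq), 4) one-hot matrix; unknown bases become all-zero rows."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = np.zeros((len(seq), len(ALPHABET)), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        if base in index:
            encoded[pos, index[base]] = 1.0
    return encoded


def soft_vote(probabilities, weights=None) -> np.ndarray:
    """Average per-model positive-class probabilities (optionally weighted)."""
    stacked = np.stack([np.asarray(p, dtype=np.float64) for p in probabilities])
    return np.average(stacked, axis=0, weights=weights)


# Hypothetical outputs of three models, one per encoding, for three candidate sites.
p_embedding = [0.90, 0.20, 0.55]
p_one_hot = [0.80, 0.30, 0.40]
p_rglove = [0.70, 0.10, 0.60]

ensemble = soft_vote([p_embedding, p_one_hot, p_rglove])
labels = (ensemble >= 0.5).astype(int)  # 0.5 threshold is an assumption
```

Soft voting averages probabilities rather than counting hard labels, so a confident model can outweigh two marginal ones; in the third hypothetical site above, one model alone would have predicted a negative, yet the averaged score still crosses the assumed threshold.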
Pages: 22