SAWRPI: A Stacking Ensemble Framework With Adaptive Weight for Predicting ncRNA-Protein Interactions Using Sequence Information

Cited by: 4
Authors
Ren, Zhong-Hao [1 ]
Yu, Chang-Qing [1 ]
Li, Li-Ping [1 ]
You, Zhu-Hong [2 ]
Guan, Yong-Jian [1 ]
Li, Yue-Chao [1 ]
Pan, Jie [1 ]
Affiliations
[1] Xijing Univ, Sch Informat Engn, Xian, Peoples R China
[2] Northwestern Polytech Univ, Sch Comp Sci, Xian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
ncRNA-protein interactions; ncRNA; ensemble learning; sequence analysis; natural language processing; AMINO-ACID-COMPOSITION; LONG NONCODING RNAS; SPECIFICITIES; LNCRNAS;
DOI
10.3389/fgene.2022.839540
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007; 090102;
Abstract
Non-coding RNAs (ncRNAs) play essential roles in biological processes such as gene regulation. One critical way in which ncRNAs carry out their biological functions is through interactions with RNA-binding proteins (RBPs). Identifying the proteins involved in ncRNA-protein interactions helps clarify the function of ncRNAs. Many high-throughput experiments have been applied to recognize these interactions. Because these approaches are time- and labor-consuming, a large number of computational methods have been developed to advance ncRNA-protein interaction research. However, these methods may not be applicable to all RNAs and proteins, particularly new ones. Additionally, most of them do not handle long sequences well. In this work, a computational method, SAWRPI, is proposed to predict ncRNA-protein interactions from sequence information. More specifically, the raw features of proteins and ncRNAs are first extracted through the k-mer sparse matrix with SVD reduction and through learning nucleic acid symbols by natural language processing with a local fusion strategy, respectively. Then, to make classification easier, the Hilbert transformation is used to map the raw feature data into a new feature space. Finally, a stacking ensemble strategy is adopted to learn high-level abstraction features automatically and generate the final prediction results. To confirm robustness and stability, three different datasets containing two kinds of interactions are used. In comparison with state-of-the-art methods and with alternative classifiers and feature extraction strategies, SAWRPI achieved high performance on the three datasets, which contain two kinds of lncRNA-protein interactions. According to our findings, SAWRPI is a trustworthy, robust, yet simple method that can serve as a useful supplement for the task of predicting ncRNA-protein interactions.
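As a rough illustration of the k-mer sparse matrix named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the common construction in which each row corresponds to one possible k-mer, each column to one start position in the sequence, and an entry is 1 when that k-mer begins at that position. The function name `kmer_sparse_matrix`, the toy alphabet, and the choice of k are hypothetical; the SVD reduction, Hilbert transformation, and stacking steps are not shown.

```python
from itertools import product


def kmer_sparse_matrix(seq, k, alphabet):
    """Build a 0/1 matrix with one row per possible k-mer and one column
    per start position (assumed construction, for illustration only)."""
    # Enumerate all len(alphabet)**k k-mers in a fixed lexicographic order.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    row_of = {kmer: i for i, kmer in enumerate(kmers)}

    n_cols = len(seq) - k + 1
    if n_cols < 1:
        raise ValueError("sequence is shorter than k")

    matrix = [[0] * n_cols for _ in kmers]
    for j in range(n_cols):
        row = row_of.get(seq[j:j + k])
        # Windows containing symbols outside the alphabet are left as zeros.
        if row is not None:
            matrix[row][j] = 1
    return matrix


# Toy example with a two-letter alphabet so the matrix stays readable.
toy = kmer_sparse_matrix("ABBA", k=2, alphabet="AB")
for kmer, row in zip(["AA", "AB", "BA", "BB"], toy):
    print(kmer, row)
```

Each column holds at most a single 1, which is why the matrix is sparse. A dimensionality reduction such as SVD would then be applied to this matrix to obtain a fixed-length feature vector.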
Pages: 13