Sequence-Based Predicting Bacterial Essential ncRNAs Algorithm by Machine Learning

Authors
Ye, Yuan-Nong [1,2,3]
Liang, Ding-Fa [2]
Labena, Abraham Alemayehu [4]
Zeng, Zhu [2]
Affiliations
[1] Guizhou Med Univ, Sch Big Hlth, Dept Med Informat, Bioinformat & Biomed Big data Min Lab, Guiyang 550025, Peoples R China
[2] Guizhou Med Univ, Cells & Antibody Engn Res Ctr Guizhou Prov, Sch Biol & Engn, Key Lab Biol & Med Engn, Guiyang 550025, Peoples R China
[3] Guizhou Med Univ, Key Lab Environm Pollut Monitoring & Dis Control, Minist Educ, Guiyang 550025, Peoples R China
[4] Dilla Univ, Coll Computat & Nat Sci, Dilla 419, Ethiopia
Source
Funding
National Natural Science Foundation of China
Keywords
Bioinformatics; biological information theory; biomedical informatics; PROTEIN;
DOI
10.32604/iasc.2023.026761
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Essential ncRNAs are ncRNAs that are indispensable for the survival of organisms. Although essential ncRNAs do not encode proteins, they are as important as essential coding genes in biology. They have a wide variety of applications, such as antimicrobial target discovery, minimal genome construction, and evolution analysis. At present, essential ncRNAs have been determined at the whole-genome scale for very few species, because traditional methods are time-consuming, laborious, and costly. In addition, traditional experimental methods are limited by the organisms themselves, as less than 1% of bacteria can be cultured in the laboratory. Therefore, it is important and necessary to develop theories and methods for the recognition of essential non-coding RNA. In this paper, we present a novel method for predicting essential ncRNAs that uses both compositional features and derivative features calculated from ncRNA sequences by information theory. The method was developed with a Support Vector Machine (SVM). Its accuracy was evaluated through cross-species cross-validation and found to be between 0.69 and 0.81. This shows that the selected features perform well for the prediction of essential ncRNAs using an SVM. Thus, the method can be applied to discovering essential ncRNAs in bacteria.
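The abstract does not list the exact features or the validation protocol, so the following is only a minimal sketch of what sequence-derived compositional and information-theoretic features could look like. The choice of mono- and dinucleotide frequencies, Shannon entropy as the derived feature, and a leave-one-species-out split as the reading of "cross-species cross-validation" are all assumptions for illustration, as are the function names. It uses only the Python standard library and stops before the SVM step.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style T is mapped to U below


def kmer_frequencies(seq, k=2):
    """Relative frequency of every k-mer over ACGU, in a fixed order."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers made of A, C, G, U are counted; ambiguous bases are ignored.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def shannon_entropy(freqs):
    """Shannon entropy (in bits) of a frequency distribution."""
    return -sum(p * math.log2(p) for p in freqs if p > 0)


def feature_vector(seq):
    """Hypothetical feature set: 4 mono + 16 di frequencies + 2 entropies."""
    mono = kmer_frequencies(seq, 1)
    di = kmer_frequencies(seq, 2)
    return mono + di + [shannon_entropy(mono), shannon_entropy(di)]


def leave_one_species_out(records):
    """Yield (held_out_species, train, test) splits.

    `records` is a list of (species, sequence, label) tuples. Holding out
    one species per fold is an assumed interpretation of cross-species
    cross-validation, not a protocol confirmed by the abstract.
    """
    for held_out in sorted({r[0] for r in records}):
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        yield held_out, train, test
```

Each training fold's feature vectors and labels would then be passed to an SVM implementation of the reader's choice, and accuracy measured on the held-out species.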
Pages: 2731-2741
Number of pages: 11