Sequence-Based Predicting Bacterial Essential ncRNAs Algorithm by Machine Learning

Cited by: 0
Authors
Ye, Yuan-Nong [1, 2, 3]
Liang, Ding-Fa [2]
Labena, Abraham Alemayehu [4]
Zeng, Zhu [2]
Affiliations
[1] Guizhou Med Univ, Sch Big Hlth, Dept Med Informat, Bioinformat & Biomed Big data Min Lab, Guiyang 550025, Peoples R China
[2] Guizhou Med Univ, Cells & Antibody Engn Res Ctr Guizhou Prov, Sch Biol & Engn, Key Lab Biol & Med Engn, Guiyang 550025, Peoples R China
[3] Guizhou Med Univ, Key Lab Environm Pollut Monitoring & Dis Control, Minist Educ, Guiyang 550025, Peoples R China
[4] Dilla Univ, Coll Computat & Nat Sci, Dilla 419, Ethiopia
Source
Funding
National Natural Science Foundation of China
Keywords
Bioinformatics; biological information theory; biomedical informatics; PROTEIN;
DOI
10.32604/iasc.2023.026761
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Essential ncRNAs are ncRNAs that are indispensable for the survival of organisms. Although essential ncRNAs do not encode proteins, they are as important as essential coding genes in biology. They have a wide variety of applications, such as antimicrobial target discovery, minimal genome construction, and evolution analysis. At present, essential ncRNAs have been determined at the whole-genome scale in very few species, because traditional methods are time-consuming, laborious, and costly. In addition, traditional experimental methods are limited by the organisms themselves, as less than 1% of bacteria can be cultured in the laboratory. Therefore, it is important and necessary to develop theories and methods for recognizing essential non-coding RNAs. In this paper, we present a novel method for predicting essential ncRNAs using both compositional features and derivative features calculated from ncRNA sequences by information theory. The method was developed with a Support Vector Machine (SVM). Its accuracy was evaluated through cross-species cross-validation and found to be between 0.69 and 0.81. This shows that the selected features perform well for predicting essential ncRNAs with an SVM. Thus, the method can be applied to discover essential ncRNAs in bacteria.
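The abstract does not list the exact features, so the following is only an illustrative sketch of what "compositional" and information-theoretic "derivative" features of an ncRNA sequence might look like. The choice of mono- and dinucleotide frequencies, the use of Shannon entropy, and the function names `kmer_frequencies`, `shannon_entropy`, and `feature_vector` are all assumptions made for this example, not the authors' published feature set. The SVM training step is deliberately left out.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style "T" is mapped to "U" below


def kmer_frequencies(seq, k):
    """Relative frequencies of all 4**k k-mers, in a fixed lexicographic order.

    Windows containing characters outside ACGU (e.g. "N") are ignored.
    """
    seq = seq.upper().replace("T", "U")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def shannon_entropy(freqs):
    """Shannon entropy in bits of a frequency distribution."""
    return -sum(p * math.log2(p) for p in freqs if p > 0)


def feature_vector(seq):
    """Hypothetical feature vector: 4 mono- and 16 dinucleotide frequencies
    (compositional), plus the entropy of each distribution (derived)."""
    mono = kmer_frequencies(seq, 1)
    di = kmer_frequencies(seq, 2)
    return mono + di + [shannon_entropy(mono), shannon_entropy(di)]
```

Vectors of this kind could then be passed to any SVM implementation. For a cross-species evaluation like the one the abstract describes, the classifier would be trained on sequences from one species and scored on another.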
Pages: 2731-2741
Number of pages: 11