iN6-methylat (5-step): identifying DNA N6-methyladenine sites in rice genome using continuous bag of nucleobases via Chou’s 5-step rule

Cited by: 0
Authors
Nguyen Quoc Khanh Le
Affiliation
[1] Nanyang Technological University, Medical Humanities Research Cluster, School of Humanities
Source
Molecular Genetics and Genomics, 2019, 294 (05)
Keywords
Skip gram; Continuous bag of words; DNA N6-methyladenine; Support vector machine; FastText; DNA replication
DOI
Not available
Abstract
DNA N6-methyladenine is a non-canonical DNA modification that occurs at low levels in different eukaryotes and has been recognized as having an extremely important function in life. About 0.2% of adenines in the rice genome are marked by DNA N6-methyladenine, a higher proportion than in most other species. Identifying these sites has therefore become a very important area of study, especially in biological research. Although a few computational tools have been developed to address this problem, considerable effort is still required to improve their performance. In this study, we represent DNA sequences as continuous bags of nucleobases, including the sub-word information of their biological words, and feed these features into a support vector machine to identify the sites. Using this hybrid approach, our model identifies DNA N6-methyladenine sites with a jackknife-test sensitivity of 86.48%, specificity of 89.09%, accuracy of 87.78%, and Matthews correlation coefficient (MCC) of 0.756. Compared with the state-of-the-art predictor and the other methods, the proposed model yields superior performance on all of these metrics. This study also provides a basis for further research that can enrich the field of applying natural language processing techniques to biological sequences.
Pages: 1173 - 1182
Number of pages: 9
Related papers
25 in total
  • [1] iN6-methylat (5-step): identifying DNA N6-methyladenine sites in rice genome using continuous bag of nucleobases via Chou's 5-step rule
    Le, Nguyen Quoc Khanh
    MOLECULAR GENETICS AND GENOMICS, 2019, 294 (05) : 1173 - 1182
  • [2] iDNA6mA (5-step rule): Identification of DNA N6-methyladenine sites in the rice genome by intelligent computational model via Chou's 5-step rule
    Tahir, Muhammad
    Tayara, Hilal
    Chong, Kil To
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2019, 189 : 96 - 101
  • [3] iN6-Methyl (5-step): Identifying RNA N6-methyladenosine sites using deep learning mode via Chou's 5-step rules and Chou's general PseKNC
    Nazari, Iman
    Tahir, Muhammad
    Tayara, Hilal
    Chong, Kil To
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2019, 193
  • [4] csDMA: an improved bioinformatics tool for identifying DNA 6 mA modifications via Chou’s 5-step rule
    Ze Liu
    Wei Dong
    Wei Jiang
    Zili He
    Scientific Reports, 9
  • [5] csDMA: an improved bioinformatics tool for identifying DNA 6 mA modifications via Chou's 5-step rule
    Liu, Ze
    Dong, Wei
    Jiang, Wei
    He, Zili
    SCIENTIFIC REPORTS, 2019, 9 (1)
  • [6] Enhancer-5Step: Identifying enhancers using hidden information of DNA sequences via Chou's 5-step rule and word embedding
    Nguyen Quoc Khanh Le
    Yapp, Edward Kien Yee
    Quang-Thai Ho
    Nagasundaram, N.
    Ou, Yu-Yen
    Yeh, Hui-Yuan
    ANALYTICAL BIOCHEMISTRY, 2019, 571 : 53 - 61
  • [7] Use Chou's 5-Step Rule to Classify Protein Modification Sites with Neural Network
    Song, Chuandong
    Yang, Bin
    SCIENTIFIC PROGRAMMING, 2020, 2020 (2020)
  • [8] Prediction of S-Sulfenylation Sites Using Statistical Moments Based Features via CHOU'S 5-Step Rule
    Butt, Ahmad Hassan
    Khan, Yaser Daanial
    INTERNATIONAL JOURNAL OF PEPTIDE RESEARCH AND THERAPEUTICS, 2020, 26 (03) : 1291 - 1301
  • [9] Prediction of S-Sulfenylation Sites Using Statistical Moments Based Features via CHOU’S 5-Step Rule
    Ahmad Hassan Butt
    Yaser Daanial Khan
    International Journal of Peptide Research and Therapeutics, 2020, 26 : 1291 - 1301
  • [10] Use Chou's 5-Step Rule to Predict DNA-Binding Proteins with Evolutionary Information
    Lu, Weizhong
    Song, Zhengwei
    Ding, Yijie
    Wu, Hongjie
    Cao, Yan
    Zhang, Yu
    Li, Haiou
    BIOMED RESEARCH INTERNATIONAL, 2020, 2020