Using alignment-free and pattern mining methods for SARS-CoV-2 genome analysis

Cited by: 0
Authors
M. Saqib Nawaz
Philippe Fournier-Viger
Memoona Aslam
Wenjin Li
Yulin He
Xinzheng Niu
Affiliations
[1] Shenzhen University, College of Computer Science and Software Engineering
[2] Shenzhen University, Institute for Advanced Study
[3] Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), School of Computer Science and Engineering
[4] University of Electronic Science and Technology of China
Source
Applied Intelligence | 2023 / Vol. 53
Keywords
COVID-19; SARS-CoV-2; Genome sequence; Amino acids; Alignment-free; Sequential pattern mining; Mutation;
DOI
Not available
Abstract
Examining the genome sequences of the SARS-CoV-2 virus, which causes the respiratory disease known as coronavirus disease 2019 (COVID-19), plays an important role in understanding this virus, its main characteristics, and its functionalities. This paper investigates the use of alignment-free (AF) sequence analysis and sequential pattern mining (SPM) to analyze SARS-CoV-2 genome sequences and to learn interesting information about them, respectively. AF methods are used to find (dis)similarity among SARS-CoV-2 genome sequences using various distance measures, to compare the performance of these measures, and to construct phylogenetic trees. SPM algorithms are used to discover frequent amino acid patterns and their relationships with each other, and to predict amino acid(s) using various sequence-based prediction models. Finally, an algorithm is proposed to analyze mutations in genome sequences. The algorithm finds the locations of changed amino acid(s) in the genome sequences and computes the mutation rate. The results show that both AF and SPM methods can be used to discover interesting information and patterns in SARS-CoV-2 genome sequences for examining the variations and evolution among strains.
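To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not say which distance measures or which mutation algorithm the paper uses, so the sketch assumes a Euclidean distance over k-mer counts as one possible alignment-free measure, and it assumes the mutation rate is the fraction of positions that differ between two equal-length amino acid sequences. The function names `kmer_counts`, `kmer_distance`, and `mutation_sites` are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a: str, b: str, k: int = 3) -> float:
    """Euclidean distance between the k-mer count vectors of a and b.

    Assumption: this is only one of many possible alignment-free measures.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # A Counter returns 0 for k-mers it has not seen.
    return sqrt(sum((ca[m] - cb[m]) ** 2 for m in ca.keys() | cb.keys()))


def mutation_sites(ref: str, sample: str):
    """Return (sites, rate) for two equal-length sequences.

    sites is a list of (1-based position, reference residue, sample residue).
    Assumption: rate is the fraction of positions that differ.
    """
    if len(ref) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    sites = [(i + 1, r, s)
             for i, (r, s) in enumerate(zip(ref, sample)) if r != s]
    rate = len(sites) / len(ref) if ref else 0.0
    return sites, rate
```

For example, `mutation_sites("MFVFLV", "MFVYLV")` reports a single change from `F` to `Y` at position 4. Sequences of different lengths (insertions or deletions) are outside the scope of this sketch.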
Pages: 21920 - 21943
Number of pages: 23
Related papers
50 in total
  • [31] A framework for Alignment-free methods to perform similarity analysis of biological sequence
    Gupta, Manoj Kumar
    Niyogi, Rajdeep
    Misra, Manoj
    2013 SIXTH INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING (IC3), 2013, : 337 - 342
  • [32] Using alignment-free methods as preprocessing stage to classification whole genomes
    Shanan, Najah Abed Alhadi
    Lafta, Hussein Attya
    Al-Rashid, Sura Z.
    INTERNATIONAL JOURNAL OF NONLINEAR ANALYSIS AND APPLICATIONS, 2021, 12 (02): : 1531 - 1539
  • [33] Engineering Genome-Free Bacterial Cells for Effective SARS-COV-2 Neutralisation
    Yin, Yutong
    Liu, Chang
    Ji, Xianglin
    Wang, Yun
    Mongkolsapaya, Juthathip
    Screaton, Gavin R.
    Cui, Zhanfeng
    Huang, Wei E.
    MICROBIAL BIOTECHNOLOGY, 2025, 18 (03):
  • [34] A Comparison of SARS-CoV-2 Genome Sequencing Methods for Surveillance in Rural New England
    Deharvengt, S.
    Green, D.
    Winnick, K.
    Sathyanarayana, S.
    Shannon, P.
    Kelly, M.
    Sevigny, J.
    Thomas, K.
    Tsongalis, G.
    Whitfield, M.
    Lefferts, J.
    JOURNAL OF MOLECULAR DIAGNOSTICS, 2022, 24 (10): : S133 - S133
  • [35] The controversy of SARS-CoV-2 integration into the human genome
    AL-Eitan, Laith
    Mihyar, Ahmad
    REVIEWS IN MEDICAL VIROLOGY, 2024, 34 (01)
  • [36] Genome composition and genetic characterization of SARS-CoV-2
    Al-Qaaneh, Ayman M.
    Alshammari, Thamer
    Aldahhan, Razan
    Aldossary, Hanan
    Alkhalifah, Zahra Abduljaleel
    Borgio, J. Francis
    SAUDI JOURNAL OF BIOLOGICAL SCIENCES, 2021, 28 (03) : 1978 - 1989
  • [37] The lag in SARS-CoV-2 genome submissions to GISAID
    Kalia, Kishan
    Saberwal, Gayatri
    Sharma, Gaurav
    NATURE BIOTECHNOLOGY, 2021, 39 (09) : 1058 - 1060
  • [38] Genome-wide covariation in SARS-CoV-2
    Cresswell-Clay, Evan
    Periwal, Vipul
    MATHEMATICAL BIOSCIENCES, 2021, 341
  • [40] Data stream dataset of SARS-CoV-2 genome
    Barbosa, Raquel de M.
    Fernandes, Marcelo A. C.
    DATA IN BRIEF, 2020, 31