Comparative analysis of alignment-free genome clustering and whole genome alignment-based phylogenomic relationship of coronaviruses

Cited by: 5
Authors
Kirichenko, Anastasiya D. [1]
Poroshina, Anastasiya A. [2]
Sherbakov, Dmitry Yu. [2, 3, 4]
Sadovsky, Michael G. [5, 6, 7]
Krutovsky, Konstantin V. [1, 8, 9, 10, 11, 12]
Affiliations
[1] Siberian Fed Univ, Inst Fundamental Biol & Biotechnol, Dept Genom & Bioinformat, Krasnoyarsk, Russia
[2] Russian Acad Sci, Limnol Inst, Lab Mol Systemat, Siberian Branch, Irkutsk, Russia
[3] Irkutsk State Univ, Fac Biol & Soil Studies, Irkutsk, Russia
[4] Novosibirsk State Univ, Fac Nat Sci, Novosibirsk, Russia
[5] Russian Acad Sci, Inst Computat Modelling, Siberian Branch, Krasnoyarsk, Russia
[6] VF Voino Yasenetsky Krasnoyarsk State Med Univ, Krasnoyarsk, Russia
[7] Fed Med Biol Agcy, Fed Res & Clin Ctr, Krasnoyarsk, Russia
[8] Georg August Univ Gottingen, Dept Forest Genet & Forest Tree Breeding, Gottingen, Germany
[9] Georg August Univ Gottingen, Ctr Integrated Breeding Res, Gottingen, Germany
[10] Siberian Fed Univ, Inst Fundamental Biol & Biotechnol, Genome Res & Educ Ctr, Lab Forest Genom, Krasnoyarsk, Russia
[11] Russian Acad Sci, NI Vavilov Inst Gen Genet, Lab Populat Genet, Moscow, Russia
[12] GF Morozov Voronezh State Univ Forestry & Technol, Sci & Methodol Ctr, Voronezh, Russia
Source
PLOS ONE | 2022 | Volume 17 | Issue 03
Keywords
RECOMBINATION; EVOLUTION;
DOI
10.1371/journal.pone.0264640
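Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code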
07 ; 0710 ; 09 ;
Abstract
SARS-CoV-2 is the third coronavirus, after SARS-CoV and MERS-CoV, known to cause severe respiratory syndrome in humans. All three are of zoonotic origin and likely crossed the species barrier between animals and humans. The origin and evolution of viruses and their phylogenetic relationships are of great importance for studying their pathogenicity and for developing antiviral drugs and vaccines. The main objective of this study was to compare two methods for identifying relationships between coronavirus genomes: a phylogenetic method based on whole-genome alignment followed by molecular phylogenetic tree inference, and alignment-free clustering of triplet frequencies. Both were applied to 69 coronavirus genomes selected from two public databases. Both approaches produced well-resolved, robust classifications. In general, the clusters identified by the first approach agreed well with the classes identified by the second using K-means and the elastic map method, although not in every case, and the discrepancies remain to be explained. Both approaches also showed significant divergence of the genomes at the taxonomic level, but the groupings corresponded less well to the types of disease the viruses cause, which may be due to individual characteristics of the host. This research showed that alignment-free methods are efficient in combination with alignment-based methods: they have a significant advantage in computational complexity and provide valuable additional information on the relationships between genomes.
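The alignment-free side of the comparison can be illustrated with a minimal sketch: each genome is reduced to a 64-dimensional vector of triplet frequencies, and the vectors are then grouped with K-means. This is not the authors' pipeline. The use of overlapping triplets, the skipping of windows with ambiguous bases, the deterministic farthest-point initialization, the function names, and the toy sequences are all assumptions made for illustration; the elastic map step mentioned in the abstract is not covered.

```python
from itertools import product

# All 64 possible nucleotide triplets, in a fixed order.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_frequencies(seq):
    """Return the 64-dimensional vector of overlapping triplet frequencies.

    Assumption: triplets are counted with a sliding window of step 1, and
    windows containing anything other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(TRIPLETS, 0)
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if triplet in counts:
            counts[triplet] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [counts[t] / total for t in TRIPLETS]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=100):
    """Plain Lloyd's K-means; returns one cluster label per input vector.

    Assumption: centroids are initialized deterministically (first vector,
    then the vector farthest from the centroids chosen so far) so that the
    sketch is reproducible without a random seed.
    """
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy sequences (made up, not real coronavirus genomes).
toy_sequences = [
    "ACGACGACGACGACGACG",
    "ACGACGACGACGACGACT",
    "TTGTTGTTGTTGTTGTTG",
    "TTGTTGTTGTTGTTGTTA",
]
toy_vectors = [triplet_frequencies(s) for s in toy_sequences]
labels = kmeans(toy_vectors, k=2)
print(labels)
```

Because no alignment is computed, the cost grows linearly with genome length for the frequency step, which is the computational advantage the abstract refers to. For real data, a tested library implementation of K-means would be a more sensible choice than this hand-written loop.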
Pages: 26