Augmenting bacterial similarity measures using a graph-based genome representation

Authors
Ramanan, Vivek [1, 2]
Sarkar, Indra Neil [1, 2, 3]
Affiliations
[1] Brown Univ, Ctr Computat Mol Biol, Providence, RI 02912 USA
[2] Brown Univ, Ctr Biomed Informat, Providence, RI 02912 USA
[3] Rhode Isl Qual Inst, Providence, RI 02908 USA
Keywords
synteny; genome analysis; microbiome; 16S RIBOSOMAL-RNA; IDENTIFICATION; PHYLOGENY; CORE;
DOI
10.1128/msystems.00497-24
Chinese Library Classification (CLC) number
Q93 [Microbiology];
Discipline classification code
071005 ; 100705 ;
Abstract
Relationships between bacterial taxa are traditionally defined using 16S rRNA nucleotide similarity or average nucleotide identity. Improvements in sequencing technology provide additional pairwise information on genome sequences, which may offer valuable insight into genomic relationships. Mapping orthologous gene locations between genome pairs, known as synteny, is typically used in the discovery of new species but has not been systematically applied to bacterial genomes. Using a data set of 378 bacterial genomes, we developed and tested a new measure of synteny similarity between a pair of genomes, which was scaled onto 16S rRNA distance using covariance matrices. Depending on the input gene functions used (i.e., core, antibiotic resistance, and virulence), we observed varying topological arrangements of bacterial relationship networks when applying (i) complete-linkage hierarchical clustering and (ii) K-nearest neighbor graph structures to synteny-scaled 16S data. Our metric improved clustering quality relative to state-of-the-art average nucleotide identity metrics while preserving clustering assignments for the highest-similarity relationships. Our findings indicate that syntenic relationships provide more granular and interpretable relationships for within-genera taxa than pairwise similarity measures, particularly in functional contexts.

IMPORTANCE: Given the prevalence and necessity of the 16S rRNA measure in bacterial identification and analysis, this analysis adds a functional and synteny-based layer to the identification of relatives and the clustering of bacterial genomes. It is also of computational interest to model the bacterial genome as a graph structure, which presents new avenues of genomic analysis for bacteria and their closely related strains and species.
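The two network constructions named in the abstract, complete-linkage hierarchical clustering and a K-nearest neighbor graph, can be illustrated on a small pairwise distance matrix. The sketch below is not the authors' implementation: the function names `complete_linkage` and `knn_graph` are hypothetical, the five-genome matrix `DIST` contains invented values, and the covariance-based scaling of synteny onto 16S rRNA distance is not reproduced because the abstract does not specify it.

```python
from itertools import combinations


def complete_linkage(dist, n_clusters):
    """Agglomerative clustering with complete linkage (hypothetical helper).

    At each step, merge the two clusters whose farthest pair of members
    is closest, until only n_clusters clusters remain.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster distance is the maximum member distance.
            d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


def knn_graph(dist, k):
    """Map each node to its k nearest neighbors (hypothetical helper).

    Ties are broken by the lower node index.
    """
    n = len(dist)
    graph = {}
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: (dist[i][j], j))
        graph[i] = others[:k]
    return graph


# Invented symmetric distance matrix for five hypothetical genomes.
# In the paper's setting this would be the synteny-scaled 16S distance.
DIST = [
    [0.00, 0.10, 0.15, 0.80, 0.85],
    [0.10, 0.00, 0.12, 0.75, 0.90],
    [0.15, 0.12, 0.00, 0.70, 0.78],
    [0.80, 0.75, 0.70, 0.00, 0.20],
    [0.85, 0.90, 0.78, 0.20, 0.00],
]

clusters = complete_linkage(DIST, n_clusters=2)
neighbors = knn_graph(DIST, k=2)
print(clusters)   # [[0, 1, 2], [3, 4]]
print(neighbors)
```

With these toy values, genomes 0 to 2 and genomes 3 to 4 form the two clusters, while the K-nearest neighbor graph still links genome 3 to genome 2 across the cluster boundary. This shows how the two structures can expose different relationships in the same distance data.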
Pages: 14