Transformer-based embedding applied to classify bacterial species using sequencing reads

Authors
Gwak, Ho-Jin [1]
Rho, Mina [1]
Affiliations
[1] Hanyang Univ, Dept Comp Sci, Seoul, South Korea
Keywords
embedding; transformer; deep learning; classification; Staphylococcus species; SOFTWARE
DOI
10.1109/BigComp54360.2022.00084
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With the emergence of next-generation sequencing and metagenomic approaches, the need for read-level taxonomic classifiers has increased. Although the 16S rRNA gene sequence has been widely employed as a taxonomic marker, recent studies have revealed that 16S rRNA is not sufficient for species-level assignment. Therefore, an accurate classifier is required to classify whole-genome sequencing reads into species. With the advancement of deep learning methods and natural language processing technologies, several studies have applied these methods to genomic data and achieved state-of-the-art performance. In this study, we applied transformer-based embedding to bacterial genomes to accurately classify species using sequencing reads. As a case study, we classified Staphylococcus species using sequencing reads. Our model achieved ROC-AUC values of over 0.98 and 0.99 for 151 bp and 251 bp paired-end reads, respectively. Compared with the cutting-edge method Kraken2, our model classified significantly more S. aureus reads while maintaining comparable precision.
Pages: 374-377
Number of pages: 4