Transformer-based embedding applied to classify bacterial species using sequencing reads

Cited by: 0
Authors
Gwak, Ho-Jin [1]
Rho, Mina [1]
Affiliations
[1] Hanyang Univ, Dept Comp Sci, Seoul, South Korea
Keywords
embedding; transformer; deep learning; classification; Staphylococcus species; SOFTWARE
DOI
10.1109/BigComp54360.2022.00084
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the emergence of next-generation sequencing and metagenomic approaches, the need for read-level taxonomy classifiers has increased. Although the 16S rRNA gene sequence has been widely employed as a taxonomic marker, recent studies have revealed that 16S rRNA is not sufficient to assign species. Therefore, an accurate classifier is required to classify whole-genome sequencing reads into species. With the advancement of deep learning methods and natural language processing technologies, several studies have applied these methods to genomic data and achieved state-of-the-art performance. In this study, we applied transformer-based embedding to bacterial genomes to accurately classify species using sequencing reads. As a case study, we classified Staphylococcus species using sequencing reads. Our model achieved ROC-AUC values of over 0.98 and 0.99 for 151 bp and 251 bp paired-end reads, respectively. Compared with Kraken2, a cutting-edge method, our model classified significantly more S. aureus reads while maintaining comparable precision.
Pages: 374-377
Number of pages: 4