Transformer-based embedding applied to classify bacterial species using sequencing reads

Authors
Gwak, Ho-Jin [1]
Rho, Mina [1]
Affiliations
[1] Hanyang Univ, Dept Comp Sci, Seoul, South Korea
Keywords
embedding; transformer; deep learning; classification; Staphylococcus species; SOFTWARE;
DOI
10.1109/BigComp54360.2022.00084
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the emergence of next-generation sequencing and metagenomic approaches, the need for read-level taxonomy classifiers has increased. Although the 16S rRNA gene sequence has been widely employed as a taxonomic marker, recent studies have revealed that 16S rRNA is not sufficient to assign species. Therefore, an accurate classifier is required to classify whole-genome sequencing reads into species. With the advancement of deep learning methods and natural language processing technologies, several studies have applied these methods to genomic data and achieved state-of-the-art performance. In this study, we applied transformer-based embedding to bacterial genomes to accurately classify species using sequencing reads. As a case study, we classified Staphylococcus species using sequencing reads. Our model achieved ROC-AUC values of over 0.98 and 0.99 for 151 bp and 251 bp paired-end reads, respectively. Compared with Kraken2, a cutting-edge method, our model classified significantly more S. aureus reads while maintaining comparable precision.
Pages: 374-377
Page count: 4