MetaTransformer: deep metagenomic sequencing read classification using self-attention models

Cited by: 6
Authors
Wichmann, Alexander [1 ]
Buschong, Etienne [1 ]
Mueller, Andre [1 ]
Juenger, Daniel [1 ]
Hildebrandt, Andreas [1 ]
Hankeln, Thomas [2 ]
Schmidt, Bertil [1 ]
Affiliations
[1] Johannes Gutenberg Univ Mainz, Inst Comp Sci, Staudingerweg 9, D-55128 Mainz, Rhineland Palat, Germany
[2] Johannes Gutenberg Univ Mainz, Inst Organ & Mol Evolut iomE, J-J Becher Weg 30A, D-55128 Mainz, Rhineland Palat, Germany
Keywords
MICROBIOME; GENOMES;
DOI
10.1093/nargab/lqad082
Chinese Library Classification (CLC)
Q3 [Genetics];
Subject classification codes
071007 ; 090102 ;
Abstract
Deep learning has emerged as a paradigm that is revolutionizing numerous domains of scientific research. Transformers have been applied to language modeling, outperforming previous approaches. Deep learning is therefore a promising tool for analyzing genomic sequences, yielding convincing results in fields such as motif identification and variant calling. DeepMicrobes, a machine-learning-based classifier, was recently introduced for taxonomic prediction at the species and genus level. However, it relies on complex models based on bidirectional long short-term memory cells, resulting in slow runtimes and excessive memory requirements that hamper its usability. We present MetaTransformer, a self-attention-based deep learning tool for metagenomic analysis. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in species- and genus-level classification. Furthermore, we investigate approaches to reduce memory consumption and boost performance using different embedding schemes. As a result, we achieve a 2x to 5x inference speedup over DeepMicrobes while keeping a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus-level and 16 hours for species-level prediction. Our results demonstrate the performance improvements gained from self-attention models and the impact of embedding schemes when applying deep learning to metagenomic sequencing data.
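To make the approach described in the abstract concrete, the following is a minimal sketch of a transformer-encoder read classifier over k-mer token embeddings, written in PyTorch. Every name and size below (ReadClassifier, the k-mer length K, d_model, the class count) is an illustrative assumption; this is not the authors' MetaTransformer code, and it omits the memory-reducing embedding schemes the paper investigates.

# Minimal sketch (not the authors' code): a transformer-encoder classifier
# over k-mer token embeddings, in the spirit of the abstract above.
# k, model sizes and the class count are illustrative assumptions.
import torch
import torch.nn as nn

K = 6                 # small k so the toy embedding table stays tiny;
VOCAB = 4 ** K        # a realistic k makes the 4^k-row table very large,
                      # which is why memory-reducing embeddings matter

class ReadClassifier(nn.Module):
    def __init__(self, num_classes, max_len=256, d_model=128,
                 nhead=4, num_layers=2):
        super().__init__()
        self.tok = nn.Embedding(VOCAB, d_model)      # k-mer embeddings
        self.pos = nn.Embedding(max_len, d_model)    # learned positions
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, num_classes)  # per-taxon logits

    def forward(self, kmer_ids):                     # (batch, seq_len)
        pos = torch.arange(kmer_ids.size(1), device=kmer_ids.device)
        x = self.tok(kmer_ids) + self.pos(pos)       # (batch, seq, d_model)
        x = self.encoder(x)                          # self-attention layers
        return self.head(x.mean(dim=1))              # mean-pool, classify

# Toy usage: 8 reads, each tokenized into 145 overlapping 6-mers,
# classified into 50 genera (all sizes are made up for illustration).
model = ReadClassifier(num_classes=50)
logits = model(torch.randint(0, VOCAB, (8, 145)))
print(logits.shape)                                  # torch.Size([8, 50])

The toy usage at the end shows the expected tensor shapes: a batch of reads tokenized into overlapping k-mers goes in, and one logit per candidate taxon comes out.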
Pages: 16
Related articles
50 in total
  • [1] Question classification task based on deep learning models with self-attention mechanism
    Mondal, S.
    Barman, M.
    Nag, A.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2025, 84 (10) : 7777 - 7806
  • [2] Deep Learning Approach to Impact Classification in Sensorized Panels Using Self-Attention
    Karmakov, Stefan
    Aliabadi, M. H. Ferri
    SENSORS, 2022, 22 (12)
  • [3] Deformable Self-Attention for Text Classification
    Ma, Qianli
    Yan, Jiangyue
    Lin, Zhenxi
    Yu, Liuhong
    Chen, Zipeng
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 1570 - 1581
  • [4] Applying Self-attention for Stance Classification
    Bugueno, Margarita
    Mendoza, Marcelo
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS (CIARP 2019), 2019, 11896 : 51 - 61
  • [5] Mineral Prospectivity Mapping Using Deep Self-Attention Model
    Yin, Bojun
    Zuo, Renguang
    Sun, Siquan
    NATURAL RESOURCES RESEARCH, 2023, 32 (01) : 37 - 56
  • [6] A hybrid self-attention deep learning framework for multivariate sleep stage classification
    Yuan, Ye
    Jia, Kebin
    Ma, Fenglong
    Xun, Guangxu
    Wang, Yaqing
    Su, Lu
    Zhang, Aidong
    BMC BIOINFORMATICS, 2019, 20 (Suppl 16)
  • [7] Kernel Self-Attention for Weakly-supervised Image Classification using Deep Multiple Instance Learning
    Rymarczyk, Dawid
    Borowa, Adriana
    Tabor, Jacek
    Zielinski, Bartosz
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 1720 - 1729
  • [8] Exploring Self-Attention for Visual Intersection Classification
    Nakata, Haruki
    Tanaka, Kanji
    Takeda, Koji
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2023, 27 (03) : 386 - 393