MetaTransformer: deep metagenomic sequencing read classification using self-attention models

Cited by: 6
Authors
Wichmann, Alexander [1 ]
Buschong, Etienne [1 ]
Mueller, Andre [1 ]
Juenger, Daniel [1 ]
Hildebrandt, Andreas [1 ]
Hankeln, Thomas [2 ]
Schmidt, Bertil [1 ]
Affiliations
[1] Johannes Gutenberg Univ Mainz, Inst Comp Sci, Staudingerweg 9, D-55128 Mainz, Rhineland Palat, Germany
[2] Johannes Gutenberg Univ Mainz, Inst Organ & Mol Evolut iomE, J-J Becher Weg 30A, D-55128 Mainz, Rhineland Palat, Germany
Keywords
MICROBIOME; GENOMES;
DOI
10.1093/nargab/lqad082
Chinese Library Classification (CLC)
Q3 [Genetics];
Subject classification codes
071007 ; 090102 ;
Abstract
Deep learning has emerged as a paradigm that is revolutionizing numerous domains of scientific research. Transformers have been applied to language modeling, outperforming previous approaches. Deep learning is therefore a promising tool for analyzing genomic sequences, yielding convincing results in fields such as motif identification and variant calling. DeepMicrobes, a machine-learning-based classifier, was recently introduced for taxonomic prediction at the species and genus level. However, it relies on complex models based on bidirectional long short-term memory cells, resulting in slow runtimes and excessive memory requirements that hamper its usability. We present MetaTransformer, a self-attention-based deep learning tool for metagenomic analysis. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in species- and genus-level classification. Furthermore, we investigate approaches to reduce memory consumption and boost performance using different embedding schemes. As a result, we achieve a 2x to 5x inference speedup over DeepMicrobes while keeping a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus-level and 16 hours for species-level prediction. Our results demonstrate the performance improvements gained from self-attention models and the impact of embedding schemes when applying deep learning to metagenomic sequencing data.
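To make the approach described in the abstract concrete, the following is a minimal sketch of a transformer-encoder read classifier over k-mer token embeddings, written in PyTorch. Every name and size below (ReadClassifier, the k-mer length K, d_model, the class count) is an illustrative assumption; this is not the authors' MetaTransformer code, and it omits the memory-reducing embedding schemes the paper investigates.

# Minimal sketch (not the authors' code): a transformer-encoder classifier
# over k-mer token embeddings, in the spirit of the abstract above.
# k, model sizes and the class count are illustrative assumptions.
import torch
import torch.nn as nn

K = 6                 # small k so the toy embedding table stays tiny;
VOCAB = 4 ** K        # a realistic k makes the 4^k-row table very large,
                      # which is why memory-reducing embeddings matter

class ReadClassifier(nn.Module):
    def __init__(self, num_classes, max_len=256, d_model=128,
                 nhead=4, num_layers=2):
        super().__init__()
        self.tok = nn.Embedding(VOCAB, d_model)      # k-mer embeddings
        self.pos = nn.Embedding(max_len, d_model)    # learned positions
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, num_classes)  # per-taxon logits

    def forward(self, kmer_ids):                     # (batch, seq_len)
        pos = torch.arange(kmer_ids.size(1), device=kmer_ids.device)
        x = self.tok(kmer_ids) + self.pos(pos)       # (batch, seq, d_model)
        x = self.encoder(x)                          # self-attention layers
        return self.head(x.mean(dim=1))              # mean-pool, classify

# Toy usage: 8 reads, each tokenized into 145 overlapping 6-mers,
# classified into 50 genera (all sizes are made up for illustration).
model = ReadClassifier(num_classes=50)
logits = model(torch.randint(0, VOCAB, (8, 145)))
print(logits.shape)                                  # torch.Size([8, 50])

The toy usage at the end shows the expected tensor shapes: a batch of reads tokenized into overlapping k-mers goes in, and one logit per candidate taxon comes out.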
Pages: 16
Related articles
50 in total
  • [1] Question classification task based on deep learning models with self-attention mechanism
    Mondal, S.
    Barman, M.
    Nag, A.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2025, 84 (10) : 7777 - 7806
  • [2] Deep Learning Approach to Impact Classification in Sensorized Panels Using Self-Attention
    Karmakov, Stefan
    Aliabadi, M. H. Ferri
    SENSORS, 2022, 22 (12)
  • [3] Deformable Self-Attention for Text Classification
    Ma, Qianli
    Yan, Jiangyue
    Lin, Zhenxi
    Yu, Liuhong
    Chen, Zipeng
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 1570 - 1581
  • [4] Applying Self-attention for Stance Classification
    Bugueno, Margarita
    Mendoza, Marcelo
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS (CIARP 2019), 2019, 11896 : 51 - 61
  • [5] Mineral Prospectivity Mapping Using Deep Self-Attention Model
    Yin, Bojun
    Zuo, Renguang
    Sun, Siquan
    NATURAL RESOURCES RESEARCH, 2023, 32 (01) : 37 - 56
  • [6] A hybrid self-attention deep learning framework for multivariate sleep stage classification
    Yuan, Ye
    Jia, Kebin
    Ma, Fenglong
    Xun, Guangxu
    Wang, Yaqing
    Su, Lu
    Zhang, Aidong
    BMC BIOINFORMATICS, 2019, 20 (Suppl 16)
  • [7] Kernel Self-Attention for Weakly-supervised Image Classification using Deep Multiple Instance Learning
    Rymarczyk, Dawid
    Borowa, Adriana
    Tabor, Jacek
    Zielinski, Bartosz
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 1720 - 1729
  • [8] Exploring Self-Attention for Visual Intersection Classification
    Nakata, Haruki
    Tanaka, Kanji
    Takeda, Koji
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2023, 27 (03) : 386 - 393