MetaTransformer: deep metagenomic sequencing read classification using self-attention models

Cited by: 6
Authors
Wichmann, Alexander [1 ]
Buschong, Etienne [1 ]
Mueller, Andre [1 ]
Juenger, Daniel [1 ]
Hildebrandt, Andreas [1 ]
Hankeln, Thomas [2 ]
Schmidt, Bertil [1 ]
Affiliations
[1] Johannes Gutenberg Univ Mainz, Inst Comp Sci, Staudingerweg 9, D-55128 Mainz, Rhineland Palat, Germany
[2] Johannes Gutenberg Univ Mainz, Inst Organ & Mol Evolut iomE, J-J Becher Weg 30A, D-55128 Mainz, Rhineland Palat, Germany
Keywords
MICROBIOME; GENOMES
DOI
10.1093/nargab/lqad082
Chinese Library Classification (CLC)
Q3 [Genetics]
Discipline codes
071007; 090102
Abstract
Deep learning has emerged as a paradigm that is revolutionizing numerous domains of scientific research. In language modeling, transformers have outperformed previous approaches. Deep learning is therefore a promising tool for analyzing genomic sequences, having already yielded convincing results in tasks such as motif identification and variant calling. DeepMicrobes, a machine-learning-based classifier, was recently introduced for taxonomic prediction at the species and genus level. However, it relies on complex models built on bidirectional long short-term memory (LSTM) cells, resulting in slow runtimes and excessive memory requirements that hamper its practical usability. We present MetaTransformer, a self-attention-based deep learning tool for metagenomic analysis. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in species- and genus-level classification. Furthermore, we investigate different embedding schemes to reduce memory consumption and boost performance. As a result, we achieve a 2x to 5x inference speedup over DeepMicrobes with a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus-level and 16 hours for species-level prediction. Our results demonstrate the performance gains enabled by self-attention models and the impact of embedding schemes when applying deep learning to metagenomic sequencing data.
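The abstract describes the architecture only at a high level. As a rough illustration of a self-attention read classifier over k-mer token embeddings, here is a minimal PyTorch sketch; it is not the authors' implementation, and the class name, k-mer size, model dimensions, and taxon count are all illustrative assumptions.

```python
# Minimal sketch (assumptions, not the MetaTransformer code): classify a
# metagenomic read by embedding its overlapping k-mers and running a
# transformer encoder over the resulting token sequence.
import torch
import torch.nn as nn

class ReadClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, d_model=256,
                 nhead=8, num_layers=4, max_len=512):
        super().__init__()
        # One embedding row per k-mer; for k-mers over {A,C,G,T} the table
        # has 4**k rows, so it dominates memory as k grows.
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)  # learned positions
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, num_classes)   # one logit per taxon

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer k-mer ids
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(positions)
        x = self.encoder(x)
        # Mean-pool the token representations, then predict a label per read.
        return self.head(x.mean(dim=1))

# Example: 6-mers give a 4**6 = 4096-entry vocabulary; a 150 bp read
# yields 145 overlapping 6-mer tokens.
model = ReadClassifier(vocab_size=4**6, num_classes=100)
reads = torch.randint(0, 4**6, (8, 145))  # batch of 8 tokenized reads
logits = model(reads)                     # shape (8, 100)
```

The 4**k growth of the k-mer embedding table is why the abstract's point about embedding schemes matters: alternative schemes are one way to shrink this table, and with it the model's memory footprint.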
Pages: 16
Related papers
50 items in total
  • [11] Ordinal Depth Classification Using Region-based Self-attention
    Phan, Minh Hieu
    Phung, Son Lam
    Bouzerdoum, Abdesselam
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 3620 - 3627
  • [12] A novel self-attention deep subspace clustering
    Chen, Zhengfan
    Ding, Shifei
    Hou, Haiwei
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2021, 12 (08): 2377 - 2387
  • [13] A framework for facial expression recognition using deep self-attention network
    Indolia, S.
    Nigam, S.
    Singh, R.
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2023, 14 (07): 9543 - 9562
  • [14] Deep Semantic Role Labeling with Self-Attention
    Tan, Zhixing
    Wang, Mingxuan
    Xie, Jun
    Chen, Yidong
    Shi, Xiaodong
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018: 4929 - 4936
  • [16] Deep CNNs With Self-Attention for Speaker Identification
    Nguyen Nang An
    Nguyen Quang Thanh
    Liu, Yanbing
    IEEE ACCESS, 2019, 7: 85327 - 85337
  • [17] Compressed Self-Attention for Deep Metric Learning
    Chen, Ziye
    Gong, Mingming
    Xu, Yanwu
    Wang, Chaohui
    Zhang, Kun
    Du, Bo
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34: 3561 - 3568
  • [18] Prosodic Structure Prediction using Deep Self-attention Neural Network
    Du, Yao
    Wu, Zhiyong
    Kang, Shiyin
    Su, Dan
    Yu, Dong
    Meng, Helen
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019: 320 - 324
  • [19] Deep Multi-Instance Learning with Induced Self-Attention for Medical Image Classification
    Li, Zhenliang
    Yuan, Liming
    Xu, Haixia
    Cheng, Rui
    Wen, Xianbin
    2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2020: 446 - 450
  • [20] A Study on the Classification of Cancers with Lung Cancer Pathological Images Using Deep Neural Networks and Self-Attention Structures
    Kim, Seung Hyun
    Kang, Ho Chul
    JOURNAL OF POPULATION THERAPEUTICS AND CLINICAL PHARMACOLOGY, 2023, 30 (06): E374 - E383