MetaTransformer: deep metagenomic sequencing read classification using self-attention models

Cited by: 6
Authors
Wichmann, Alexander [1 ]
Buschong, Etienne [1 ]
Mueller, Andre [1 ]
Juenger, Daniel [1 ]
Hildebrandt, Andreas [1 ]
Hankeln, Thomas [2 ]
Schmidt, Bertil [1 ]
Affiliations
[1] Johannes Gutenberg Univ Mainz, Inst Comp Sci, Staudingerweg 9, D-55128 Mainz, Rhineland Palat, Germany
[2] Johannes Gutenberg Univ Mainz, Inst Organ & Mol Evolut iomE, J-J Becher Weg 30A, D-55128 Mainz, Rhineland Palat, Germany
Keywords
MICROBIOME; GENOMES
DOI
10.1093/nargab/lqad082
Chinese Library Classification (CLC)
Q3 [Genetics]
Discipline codes
071007; 090102
Abstract
Deep learning has emerged as a paradigm that is revolutionizing numerous domains of scientific research. In language modeling, transformers have outperformed previous approaches. Deep learning is therefore a promising tool for analyzing genomic sequences, having already yielded convincing results in tasks such as motif identification and variant calling. DeepMicrobes, a machine-learning-based classifier, was recently introduced for taxonomic prediction at the species and genus level. However, it relies on complex models built on bidirectional long short-term memory (LSTM) cells, resulting in slow runtimes and excessive memory requirements that hamper its practical usability. We present MetaTransformer, a self-attention-based deep learning tool for metagenomic analysis. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in species- and genus-level classification. Furthermore, we investigate different embedding schemes to reduce memory consumption and boost performance. As a result, we achieve a 2x to 5x inference speedup over DeepMicrobes with a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus-level and 16 hours for species-level prediction. Our results demonstrate the performance gains enabled by self-attention models and the impact of embedding schemes when applying deep learning to metagenomic sequencing data.
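The abstract describes the architecture only at a high level. As a rough illustration of a self-attention read classifier over k-mer token embeddings, here is a minimal PyTorch sketch; it is not the authors' implementation, and the class name, k-mer size, model dimensions, and taxon count are all illustrative assumptions.

```python
# Minimal sketch (assumptions, not the MetaTransformer code): classify a
# metagenomic read by embedding its overlapping k-mers and running a
# transformer encoder over the resulting token sequence.
import torch
import torch.nn as nn

class ReadClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, d_model=256,
                 nhead=8, num_layers=4, max_len=512):
        super().__init__()
        # One embedding row per k-mer; for k-mers over {A,C,G,T} the table
        # has 4**k rows, so it dominates memory as k grows.
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)  # learned positions
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, num_classes)   # one logit per taxon

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer k-mer ids
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(positions)
        x = self.encoder(x)
        # Mean-pool the token representations, then predict a label per read.
        return self.head(x.mean(dim=1))

# Example: 6-mers give a 4**6 = 4096-entry vocabulary; a 150 bp read
# yields 145 overlapping 6-mer tokens.
model = ReadClassifier(vocab_size=4**6, num_classes=100)
reads = torch.randint(0, 4**6, (8, 145))  # batch of 8 tokenized reads
logits = model(reads)                     # shape (8, 100)
```

The 4**k growth of the k-mer embedding table is why the abstract's point about embedding schemes matters: alternative schemes are one way to shrink this table, and with it the model's memory footprint.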
Pages: 16
Related papers
50 items in total
  • [11] Ordinal Depth Classification Using Region-based Self-attention
    Phan, Minh Hieu
    Phung, Son Lam
    Bouzerdoum, Abdesselam
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 3620 - 3627
  • [12] A novel self-attention deep subspace clustering
    Chen, Zhengfan
    Ding, Shifei
    Hou, Haiwei
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2021, 12 (08): 2377 - 2387
  • [13] A framework for facial expression recognition using deep self-attention network
    Indolia, S.
    Nigam, S.
    Singh, R.
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2023, 14 (07): 9543 - 9562
  • [14] Deep Semantic Role Labeling with Self-Attention
    Tan, Zhixing
    Wang, Mingxuan
    Xie, Jun
    Chen, Yidong
    Shi, Xiaodong
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018: 4929 - 4936
  • [16] Deep CNNs With Self-Attention for Speaker Identification
    Nguyen Nang An
    Nguyen Quang Thanh
    Liu, Yanbing
    IEEE ACCESS, 2019, 7: 85327 - 85337
  • [17] Compressed Self-Attention for Deep Metric Learning
    Chen, Ziye
    Gong, Mingming
    Xu, Yanwu
    Wang, Chaohui
    Zhang, Kun
    Du, Bo
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34: 3561 - 3568
  • [18] Prosodic Structure Prediction using Deep Self-attention Neural Network
    Du, Yao
    Wu, Zhiyong
    Kang, Shiyin
    Su, Dan
    Yu, Dong
    Meng, Helen
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019: 320 - 324
  • [19] Deep Multi-Instance Learning with Induced Self-Attention for Medical Image Classification
    Li, Zhenliang
    Yuan, Liming
    Xu, Haixia
    Cheng, Rui
    Wen, Xianbin
    2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2020: 446 - 450
  • [20] A Study on the Classification of Cancers with Lung Cancer Pathological Images Using Deep Neural Networks and Self-Attention Structures
    Kim, Seung Hyun
    Kang, Ho Chul
    JOURNAL OF POPULATION THERAPEUTICS AND CLINICAL PHARMACOLOGY, 2023, 30 (06): E374 - E383