MetaTransformer: deep metagenomic sequencing read classification using self-attention models

Cited by: 6
Authors
Wichmann, Alexander [1]
Buschong, Etienne [1]
Mueller, Andre [1]
Juenger, Daniel [1]
Hildebrandt, Andreas [1]
Hankeln, Thomas [2]
Schmidt, Bertil [1]
Affiliations
[1] Johannes Gutenberg Univ Mainz, Inst Comp Sci, Staudingerweg 9, D-55128 Mainz, Rhineland Palat, Germany
[2] Johannes Gutenberg Univ Mainz, Inst Organ & Mol Evolut iomE, J-J Becher Weg 30A, D-55128 Mainz, Rhineland Palat, Germany
Keywords
MICROBIOME; GENOMES;
DOI
10.1093/nargab/lqad082
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Deep learning has emerged as a paradigm that revolutionizes numerous domains of scientific research. Transformers have been utilized in language modeling outperforming previous approaches. Therefore, the utilization of deep learning as a tool for analyzing the genomic sequences is promising, yielding convincing results in fields such as motif identification and variant calling. DeepMicrobes, a machine learning-based classifier, has recently been introduced for taxonomic prediction at species and genus level. However, it relies on complex models based on bidirectional long short-term memory cells resulting in slow runtimes and excessive memory requirements, hampering its effective usability. We present MetaTransformer, a self-attention-based deep learning metagenomic analysis tool. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in terms of species and genus classification abilities. Furthermore, we investigate approaches to reduce memory consumption and boost performance using different embedding schemes. As a result, we are able to achieve 2x to 5x speedup for inference compared to DeepMicrobes while keeping a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus and 16 hours for species prediction. Our results demonstrate performance improvements due to self-attention models and the impact of embedding schemes in deep learning on metagenomic sequencing data.
Pages: 16