CAPTURING MULTI-RESOLUTION CONTEXT BY DILATED SELF-ATTENTION

Cited: 6
Authors
Moritz, Niko [1 ]
Hori, Takaaki [1 ]
Le Roux, Jonathan [1 ]
Affiliations
[1] Mitsubishi Elect Res Labs, Cambridge, MA 02139 USA
Keywords
dilated self-attention; transformer; automatic speech recognition; computational complexity;
DOI
10.1109/ICASSP39728.2021.9415001
Chinese Library Classification
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
Self-attention has become an important and widely used neural network component that has helped establish new state-of-the-art results for various applications, such as machine translation and automatic speech recognition (ASR). However, the computational complexity of self-attention grows quadratically with the input sequence length, which is particularly problematic for applications such as ASR, where the input sequence generated from an utterance can be relatively long. In this work, we propose a combination of restricted self-attention and a dilation mechanism, which we refer to as dilated self-attention. The restricted self-attention attends to frames neighboring the query at high resolution, while the dilation mechanism summarizes distant information so that it can be attended to at lower resolution. Different methods for summarizing distant frames are studied, such as subsampling, mean-pooling, and attention-based pooling. ASR results demonstrate substantial improvements over restricted self-attention alone, with accuracy similar to full-sequence self-attention at a fraction of the computational cost.
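To make the mechanism concrete, below is a minimal NumPy sketch of the idea described in the abstract: each query attends at full resolution to a restricted window of neighboring frames, and at lower resolution to mean-pooled summaries of all distant frames. The single-head, unbatched setup and the parameter names (window, pool_size) are illustrative assumptions, not the authors' exact ICASSP 2021 implementation.

import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def dilated_self_attention(x, w_q, w_k, w_v, window=8, pool_size=4):
    """x: (T, d) frame sequence; w_q, w_k, w_v: (d, d) projection matrices."""
    T, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    out = np.empty_like(v)
    for t in range(T):
        lo, hi = max(0, t - window), min(T, t + window + 1)
        # High resolution: keys/values from the restricted local window.
        k_parts, v_parts = [k[lo:hi]], [v[lo:hi]]
        # Low resolution: mean-pool distant frames in non-overlapping blocks
        # of pool_size frames (one of the summarization methods studied).
        for a, b in ((0, lo), (hi, T)):
            for s in range(a, b, pool_size):
                e = min(s + pool_size, b)
                k_parts.append(k[s:e].mean(axis=0, keepdims=True))
                v_parts.append(v[s:e].mean(axis=0, keepdims=True))
        k_all, v_all = np.vstack(k_parts), np.vstack(v_parts)
        # A single softmax over local (full-resolution) and distant
        # (summarized) positions; per-query cost is O(window + T / pool_size).
        out[t] = softmax(q[t] @ k_all.T / np.sqrt(d)) @ v_all
    return out

rng = np.random.default_rng(0)
T, d = 200, 16
x = rng.standard_normal((T, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
print(dilated_self_attention(x, w_q, w_k, w_v).shape)  # prints (200, 16)

With these defaults, each query scores roughly 2*window+1 + T/pool_size positions instead of T, which is where the reported cost reduction relative to full-sequence self-attention comes from; replacing the mean with strided subsampling or a small learned attention pooler would give the other two summarization variants mentioned in the abstract.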
Pages: 5869-5873
Page count: 5
Related Papers
50 items in total
  • [41] High-Resolution Remote Sensing Image Semantic Segmentation via Multiscale Context and Linear Self-Attention
    Yin, Peng
    Zhang, Dongmei
    Han, Wei
    Li, Jiang
    Cheng, Jianmei
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2022, 15: 9174-9185
  • [42] Replacing Averaging with More Powerful Self-Attention Mechanism for Multi-Image Super-Resolution
    Zhao, Dingyi
    Zhao, Jiying
    2023 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CCECE, 2023
  • [43] SSIR: Spatial shuffle multi-head self-attention for Single Image Super-Resolution
    Zhao, Liangliang
    Gao, Junyu
    Deng, Donghu
    Li, Xuelong
    PATTERN RECOGNITION, 2024, 148
  • [44] Capturing Dynamic Interests of Similar Users for POI Recommendation Using Self-Attention Mechanism
    Fan, Xinhua
    Hua, Yixin
    Cao, Yibing
    Zhao, Xinke
    SUSTAINABILITY, 2023, 15 (06)
  • [45] 3DPCTN: Two 3D Local-Object Point-Cloud-Completion Transformer Networks Based on Self-Attention and Multi-Resolution
    Huang, Shuyan
    Yang, Zhijing
    Shi, Yukai
    Tan, Junpeng
    Li, Hao
    Cheng, Yongqiang
    ELECTRONICS, 2022, 11 (09)
  • [46] Gait Recognition Based on Multi-Resolution Regional Shape Context
    Zhai, Yanbo
    Jia, Yulan
    Qi, Chun
    PROCEEDINGS OF THE 2009 CHINESE CONFERENCE ON PATTERN RECOGNITION AND THE FIRST CJK JOINT WORKSHOP ON PATTERN RECOGNITION, VOLS 1 AND 2, 2009: 548-552
  • [47] Attention and self-attention in random forests
    Utkin, Lev V.
    Konstantinov, Andrei V.
    Kirpichenko, Stanislav R.
    PROGRESS IN ARTIFICIAL INTELLIGENCE, 2023, 12(3): 257-273
  • [48] Context-aware Self-Attention Networks for Natural Language Processing
    Yang, Baosong
    Wang, Longyue
    Wong, Derek F.
    Shi, Shuming
    Tu, Zhaopeng
    NEUROCOMPUTING, 2021, 458: 157-169
  • [50] Context-aware positional representation for self-attention networks
    Chen, Kehai
    Wang, Rui
    Utiyama, Masao
    Sumita, Eiichiro
    NEUROCOMPUTING, 2021, 451: 46-56