FLatten Transformer: Vision Transformer using Focused Linear Attention

Cited by: 66
Authors
Han, Dongchen [1 ]
Pan, Xuran [1 ]
Han, Yizeng [1 ]
Song, Shiji [1 ]
Huang, Gao [1 ]
Affiliations
[1] Tsinghua University, Department of Automation, BNRist, Beijing, People's Republic of China
Funding
National Key R&D Program of China; National Natural Science Foundation of China
DOI
10.1109/ICCV51070.2023.00548
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
The quadratic computation complexity of self-attention has been a persistent challenge when applying Transformer models to vision tasks. Linear attention, on the other hand, offers a much more efficient alternative with its linear complexity by approximating the Softmax operation through carefully designed mapping functions. However, current linear attention approaches either suffer from significant performance degradation or introduce additional computation overhead from the mapping functions. In this paper, we propose a novel Focused Linear Attention module to achieve both high efficiency and expressiveness. Specifically, we first analyze the factors contributing to the performance degradation of linear attention from two perspectives: the focus ability and feature diversity. To overcome these limitations, we introduce a simple yet effective mapping function and an efficient rank restoration module to enhance the expressiveness of self-attention while maintaining low computation complexity. Extensive experiments show that our linear attention module is applicable to a variety of advanced vision Transformers, and achieves consistently improved performance on multiple benchmarks. Code is available at https://github.com/LeapLabTHU/FLatten-Transformer.
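The abstract only sketches the mechanism, so the following is a minimal single-head PyTorch illustration of the general idea, not the authors' implementation (see the linked repository for that). The power-based focusing map, the `focusing_factor` parameter, and the depthwise-convolution rank-restoration term below are illustrative assumptions chosen to match the abstract's description of a "mapping function" plus a "rank restoration module".

```python
import torch
import torch.nn as nn


class FocusedLinearAttentionSketch(nn.Module):
    """Single-head sketch of a focused linear attention block.

    Assumptions (for illustration only): the focusing map is
    f_p(x) = ||ReLU(x)|| * ReLU(x)^p / ||ReLU(x)^p||, and rank
    restoration is a depthwise convolution over the value map.
    """

    def __init__(self, dim: int, focusing_factor: int = 3, eps: float = 1e-6):
        super().__init__()
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # Hypothetical rank restoration: depthwise conv over V.
        self.dwc = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.p = focusing_factor
        self.eps = eps

    def focus(self, x: torch.Tensor) -> torch.Tensor:
        # ReLU keeps features non-negative (a valid Softmax surrogate);
        # raising to the power p sharpens each row toward its largest
        # entries, mimicking the "focus" of Softmax attention.
        x = torch.relu(x)
        norm = x.norm(dim=-1, keepdim=True)
        xp = x ** self.p
        return xp / (xp.norm(dim=-1, keepdim=True) + self.eps) * norm

    def forward(self, x: torch.Tensor, H: int, W: int) -> torch.Tensor:
        B, N, C = x.shape                      # N = H * W tokens
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k = self.focus(q), self.focus(k)

        # Associativity trick: (Q K^T) V == Q (K^T V), so computing
        # K^T V first costs O(N*C^2) instead of the O(N^2*C) of
        # Softmax attention -- the source of the linear complexity.
        kv = k.transpose(-2, -1) @ v                           # (B, C, C)
        denom = q @ k.sum(dim=1).unsqueeze(-1) + self.eps      # (B, N, 1)
        out = (q @ kv) / denom                                 # (B, N, C)

        # Rank restoration: add a local depthwise conv of V to recover
        # feature diversity lost by the low-rank attention map.
        v_map = v.transpose(1, 2).reshape(B, C, H, W)
        out = out + self.dwc(v_map).flatten(2).transpose(1, 2)
        return self.proj(out)


# Usage on a 14x14 token grid with 96-dim features:
attn = FocusedLinearAttentionSketch(dim=96)
y = attn(torch.randn(2, 14 * 14, 96), H=14, W=14)  # -> (2, 196, 96)
```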
Pages: 5938-5948 (11 pages)
Related Papers (showing 10 of 50)
  • [1] PARAMETER-EFFICIENT VISION TRANSFORMER WITH LINEAR ATTENTION
    Zhao, Youpeng
    Tang, Huadong
    Jiang, Yingying
    Yong, A.
    Wu, Qiang
    Wang, Jun
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 1275 - 1279
  • [2] Vision Transformer with Deformable Attention
    Xia, Zhuofan
    Pan, Xuran
    Song, Shiji
    Li, Li Erran
    Huang, Gao
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4784 - 4793
  • [3] Vision Transformer With Quadrangle Attention
    Zhang, Qiming
    Zhang, Jing
    Xu, Yufei
    Tao, Dacheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (05) : 3608 - 3624
  • [4] Adder Attention for Vision Transformer
    Shu, Han
    Wang, Jiahao
    Chen, Hanting
    Li, Lin
    Yang, Yujiu
    Wang, Yunhe
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [5] CoAtFormer: Vision Transformer with Composite Attention
    Chang, Zhiyong
    Yin, Mingjun
    Wang, Yan
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 614 - 622
  • [6] CONMW TRANSFORMER: A GENERAL VISION TRANSFORMER BACKBONE WITH MERGED-WINDOW ATTENTION
    Li, Ang
    Jiao, Jichao
    Li, Ning
    Qi, Wangjing
    Xu, Wei
    Pang, Min
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1551 - 1555
  • [7] Slide-Transformer: Hierarchical Vision Transformer with Local Self-Attention
    Pan, Xuran
    Ye, Tianzhu
    Xia, Zhuofan
    Song, Shiji
    Huang, Gao
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2082 - 2091
  • [8] Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention
    Wu, Sitong
    Wu, Tianyi
    Tan, Haoru
    Guo, Guodong
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 2731 - 2739
  • [9] VITALT: a robust and efficient brain tumor detection system using vision transformer with attention and linear transformation
    Poornam, S.
    Angelina, J. Jane Rubel
NEURAL COMPUTING & APPLICATIONS, 2024, 36 (12): 6403 - 6419
  • [10] ATTENTION PROBE: VISION TRANSFORMER DISTILLATION IN THE WILD
    Wang, Jiahao
    Cao, Mingdeng
    Shi, Shuwei
    Wu, Baoyuan
    Yang, Yujiu
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2220 - 2224