FLatten Transformer: Vision Transformer using Focused Linear Attention

Cited by: 66
Authors
Han, Dongchen [1 ]
Pan, Xuran [1 ]
Han, Yizeng [1 ]
Song, Shiji [1 ]
Huang, Gao [1 ]
Affiliations
[1] Tsinghua Univ, Dept Automat, BNRist, Beijing, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
DOI
10.1109/ICCV51070.2023.00548
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The quadratic computation complexity of self-attention has been a persistent challenge when applying Transformer models to vision tasks. Linear attention, on the other hand, offers a much more efficient alternative with its linear complexity by approximating the Softmax operation through carefully designed mapping functions. However, current linear attention approaches either suffer from significant performance degradation or introduce additional computation overhead from the mapping functions. In this paper, we propose a novel Focused Linear Attention module to achieve both high efficiency and expressiveness. Specifically, we first analyze the factors contributing to the performance degradation of linear attention from two perspectives: the focus ability and feature diversity. To overcome these limitations, we introduce a simple yet effective mapping function and an efficient rank restoration module to enhance the expressiveness of self-attention while maintaining low computation complexity. Extensive experiments show that our linear attention module is applicable to a variety of advanced vision Transformers, and achieves consistently improved performances on multiple benchmarks. Code is available at https://github.com/LeapLabTHU/FLatten-Transformer.
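The complexity argument in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the repository linked above for that). The function names, the exponent `p`, and the norm-preserving power map are assumptions chosen for illustration, and the rank restoration module is omitted. The sketch only shows why replacing Softmax with a non-negative mapping function lets the product be reordered from (QK^T)V, which costs O(N^2 d), to Q(K^T V), which costs O(N d^2).

```python
import numpy as np

def focused_map(x, p=3.0, eps=1e-6):
    """Hypothetical focusing map: ReLU, element-wise power p, then rescale
    each row so its L2 norm matches the norm before the power was applied."""
    x = np.maximum(x, 0.0) + eps              # keep features strictly positive
    xp = x ** p                               # sharpen the feature direction
    scale = np.linalg.norm(x, axis=-1, keepdims=True) / np.linalg.norm(xp, axis=-1, keepdims=True)
    return xp * scale

def linear_attention(q, k, v, p=3.0):
    """Linear-complexity attention: phi(Q) @ (phi(K)^T @ V), row-normalized."""
    q, k = focused_map(q, p), focused_map(k, p)
    kv = k.T @ v                              # (d, d_v), independent of N^2
    z = q @ k.sum(axis=0)                     # (N,) normalizer per query
    return (q @ kv) / z[:, None]

def quadratic_reference(q, k, v, p=3.0):
    """Same kernel attention computed the O(N^2) way, for comparison only."""
    q, k = focused_map(q, p), focused_map(k, p)
    w = q @ k.T                               # (N, N) attention weights
    w = w / w.sum(axis=1, keepdims=True)
    return w @ v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((16, 8)) for _ in range(3))
    print(np.allclose(linear_attention(q, k, v), quadratic_reference(q, k, v)))
```

Both functions compute the same result; only the order of the matrix products differs, which is what makes the linear form cheaper when the number of tokens N is much larger than the channel dimension d.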
Pages: 5938-5948
Page count: 11