On the Integration of Self-Attention and Convolution

Cited by: 169
Authors
Pan, Xuran [1 ]
Ge, Chunjiang [1 ]
Lu, Rui [1 ]
Song, Shiji [1 ]
Chen, Guanfu [2 ]
Huang, Zeyi [2 ]
Huang, Gao [1 ,3 ]
Institutions
[1] Tsinghua Univ, Dept Automat, BNRist, Beijing, Peoples R China
[2] Huawei Technol Ltd, Shenzhen, Peoples R China
[3] Beijing Acad Artificial Intelligence, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1109/CVPR52688.2022.00089
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolution and self-attention are two powerful techniques for representation learning, and they are usually considered as two peer approaches that are distinct from each other. In this paper, we show that there exists a strong underlying relation between them, in the sense that the bulk of the computation in these two paradigms is in fact done with the same operation. Specifically, we first show that a traditional convolution with kernel size k x k can be decomposed into k^2 individual 1 x 1 convolutions, followed by shift and summation operations. Then, we interpret the projections of queries, keys, and values in the self-attention module as multiple 1 x 1 convolutions, followed by the computation of attention weights and aggregation of the values. Therefore, the first stage of both modules comprises a similar operation. More importantly, the first stage accounts for the dominant computational complexity (quadratic in the channel size) compared to the second stage. This observation naturally leads to an elegant integration of these two seemingly distinct paradigms, i.e., a mixed model that enjoys the benefits of both self-Attention and Convolution (ACmix), while incurring minimal computational overhead compared to its pure convolution or self-attention counterparts. Extensive experiments show that our model achieves consistently improved results over competitive baselines on image recognition and downstream tasks. Code and pre-trained models will be released at https://github.com/LeapLabTHU/ACmix and https://gitee.com/mindspore/models.
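The decomposition stated in the abstract — a k x k convolution rewritten as k^2 individual 1 x 1 convolutions followed by shift and summation — can be checked numerically. The following is a minimal single-channel sketch (not the authors' implementation; function names and the NumPy setting are illustrative assumptions):

```python
import numpy as np

def conv_direct(x, w):
    """Naive k x k 'same' convolution (cross-correlation) on one channel."""
    k = w.shape[0]
    p = k // 2
    xp = np.pad(x, p)          # zero-pad so output size matches input size
    H, W = x.shape
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = (xp[i:i + k, j:j + k] * w).sum()
    return out

def conv_shift_sum(x, w):
    """Same result via k^2 '1x1 convolutions' followed by shift and summation.

    With a single channel a 1x1 convolution reduces to a scalar multiply;
    each product feature map is then shifted according to its kernel
    position and all k^2 maps are summed.
    """
    k = w.shape[0]
    p = k // 2
    xp = np.pad(x, p)
    H, W = x.shape
    out = np.zeros((H, W))
    for di in range(k):
        for dj in range(k):
            # 1x1 conv with weight w[di, dj], then shift by (di - p, dj - p)
            out += w[di, dj] * xp[di:di + H, dj:dj + W]
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
w = rng.standard_normal((3, 3))
assert np.allclose(conv_direct(x, w), conv_shift_sum(x, w))
```

The assertion holds exactly: both routes visit the same products w[di, dj] * x[i + di - p, j + dj - p], only grouped differently. In the multi-channel case each scalar multiply becomes a genuine 1 x 1 convolution over channels, which is where the quadratic-in-channel-size cost noted in the abstract arises.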
Pages: 805 / 815
Page count: 11
Related papers
50 items in total
  • [41] BLSTM convolution and self-attention network enabled recursive and direct prediction for optical chaos
    Wang, Yangyundou
    Ma, Chen
    Hu, Chuanfei
    Gao, Dawei
    Fan, Yuanlong
    Shao, Xiaopeng
    OPTICS LETTERS, 2024, 49 (12) : 3360 - 3363
  • [42] A deep learning sequence model based on self-attention and convolution for wind power prediction
    Liu, Chien-Liang
    Chang, Tzu-Yu
    Yang, Jie-Si
    Huang, Kai-Bin
    RENEWABLE ENERGY, 2023, 219
  • [43] DCGSA: A global self-attention network with dilated convolution for crowd density map generating
    Zhu, Liping
    Li, Chengyang
    Wang, Bing
    Yuan, Kun
    Yang, Zhongguo
    NEUROCOMPUTING, 2020, 378 : 455 - 466
  • [44] CSatDTA: Prediction of Drug-Target Binding Affinity Using Convolution Model with Self-Attention
    Ghimire, Ashutosh
    Tayara, Hilal
    Xuan, Zhenyu
    Chong, Kil To
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2022, 23 (15)
  • [45] Footprint Pressure Image Retrieval Algorithm Based on Multi-scale Self-attention Convolution
    Zhu M.
    Wang T.
    Wang N.
    Tang J.
    Lu X.
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2020, 33 (12): : 1097 - 1103
  • [46] Cross-domain recommendation of overlapping users based on self-attention graph convolution network
    Xing, Xing
    Wang, Shiqi
    Liu, Jiawen
    Li, Yunxi
    Jia, Zhichun
    2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 4984 - 4988
  • [47] A Serial-Parallel Self-Attention Network Joint With Multi-Scale Dilated Convolution
    Gaihua, Wang
    Tianlun, Zhang
    Yingying, Dai
    Jinheng, Lin
    Lei, Cheng
    IEEE ACCESS, 2021, 9 : 71909 - 71919
  • [48] AcFusion: Infrared and Visible Image Fusion Based on Self-Attention and Convolution With Enhanced Information Extraction
    Zhu, Huayi
    Wu, Heshan
    He, Dongmei
    Lan, Rushi
    Liu, Zhenbing
    Pan, Xipeng
    IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2024, 70 (01) : 4155 - 4167
  • [49] Neural network based on convolution and self-attention fusion mechanism for plant leaves disease recognition
    Zhao, Yun
    Li, Yang
    Wu, Na
    Xu, Xing
    CROP PROTECTION, 2024, 180
  • [50] Self-Attention for Cyberbullying Detection
    Pradhan, Ankit
    Yatam, Venu Madhav
    Bera, Padmalochan
    2020 INTERNATIONAL CONFERENCE ON CYBER SITUATIONAL AWARENESS, DATA ANALYTICS AND ASSESSMENT (CYBER SA 2020), 2020,