A generic shared attention mechanism for various backbone neural networks

Times Cited: 1
Authors
Huang, Zhongzhan [1 ]
Liang, Senwei [2 ]
Liang, Mingfu [3 ]
Affiliations
[1] Sun Yat Sen Univ, Guangzhou 510275, Peoples R China
[2] Purdue Univ, W Lafayette, IN 47906 USA
[3] Northwestern Univ, Evanston, IL 60201 USA
Funding
National Natural Science Foundation of China;
Keywords
Layer-wise shared attention mechanism; Parameter sharing; Dense-and-implicit connection; Stable training;
DOI
10.1016/j.neucom.2024.128697
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
The self-attention mechanism is crucial for enhancing the performance of various backbone neural networks. However, current methods add self-attention modules (SAMs) to each network layer without fully utilizing their potential, resulting in suboptimal performance and higher parameter consumption as network depth increases. In this paper, we reveal an inherent phenomenon: SAMs produce highly correlated attention maps across layers, with an average Pearson correlation coefficient of 0.85. Inspired by this observation, we propose Dense-and-Implicit Attention (DIA), which shares SAMs across layers and uses a long short-term memory module to calibrate and connect these correlated attention maps, improving parameter efficiency. This approach also aligns with the dynamical-systems view of neural networks. Extensive experiments show that DIA consistently enhances backbones such as ResNet, Transformer, and UNet in tasks including image classification, object detection, and image generation with diffusion models. Our analysis indicates that DIA's effectiveness stems from its dense inter-layer information connections, absent in conventional mechanisms, which stabilize training and provide regularization effects. These insights advance our understanding of attention mechanisms, optimizing them and paving the way for future developments across diverse neural networks.
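Based only on the abstract's description, the sketch below illustrates the core idea: a single attention module is shared by all layers of the backbone, and an LSTM cell carries and calibrates the (highly correlated) channel-attention signal from layer to layer. The class name SharedDIAUnit, the squeeze-and-excitation-style gating, and all sizes are illustrative assumptions, not the authors' implementation.

# Minimal sketch of a layer-shared attention unit in the spirit of DIA.
# All names and hyperparameters here are assumptions for illustration.
import torch
import torch.nn as nn


class SharedDIAUnit(nn.Module):
    """One instance of this module is reused by every block of the backbone."""

    def __init__(self, channels: int, hidden: int = None):
        super().__init__()
        hidden = hidden or channels
        # LSTM cell calibrates the squeezed channel descriptor using state
        # accumulated from previous layers (the dense, implicit connection).
        self.cell = nn.LSTMCell(input_size=channels, hidden_size=hidden)
        self.proj = nn.Linear(hidden, channels)
        self.gate = nn.Sigmoid()

    def forward(self, x, state=None):
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))              # squeeze: global average pooling -> (B, C)
        h, cst = self.cell(s, state)        # calibrate with cross-layer memory
        attn = self.gate(self.proj(h))      # channel attention weights in (0, 1)
        out = x * attn.view(b, c, 1, 1)     # re-weight the feature map
        return out, (h, cst)


if __name__ == "__main__":
    dia = SharedDIAUnit(channels=64)        # a single unit shared across layers
    state = None
    feat = torch.randn(2, 64, 32, 32)
    for _ in range(3):                      # the same unit serves several "layers"
        feat, state = dia(feat, state)
    print(feat.shape)                       # torch.Size([2, 64, 32, 32])

In a real backbone, each residual block would apply this shared unit to its output features while the (h, c) state is threaded through the forward pass, which is how the cross-layer connections described in the abstract would arise.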
Pages: 14