A generic shared attention mechanism for various backbone neural networks

Times Cited: 1
Authors
Huang, Zhongzhan [1 ]
Liang, Senwei [2 ]
Liang, Mingfu [3 ]
Affiliations
[1] Sun Yat Sen Univ, Guangzhou 510275, Peoples R China
[2] Purdue Univ, W Lafayette, IN 47906 USA
[3] Northwestern Univ, Evanston, IL 60201 USA
Funding
National Natural Science Foundation of China;
Keywords
Layer-wise shared attention mechanism; Parameter sharing; Dense-and-implicit connection; Stable training;
DOI
10.1016/j.neucom.2024.128697
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
The self-attention mechanism is crucial for enhancing the performance of various backbone neural networks. However, current methods add self-attention modules (SAMs) to each network layer without fully utilizing their potential, resulting in suboptimal performance and higher parameter consumption as network depth increases. In this paper, we reveal an inherent phenomenon: SAMs produce highly correlated attention maps across layers, with an average Pearson correlation coefficient of 0.85. Inspired by this observation, we propose Dense-and-Implicit Attention (DIA), which shares SAMs across layers and uses a long short-term memory module to calibrate and connect these correlated attention maps, improving parameter efficiency. This approach also aligns with the dynamical-systems view of neural networks. Extensive experiments show that DIA consistently enhances backbones such as ResNet, Transformer, and UNet in tasks including image classification, object detection, and image generation with diffusion models. Our analysis indicates that DIA's effectiveness stems from its dense inter-layer information connections, absent in conventional mechanisms, which stabilize training and provide regularization effects. These insights advance our understanding of attention mechanisms, optimizing them and paving the way for future developments across diverse neural networks.
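Based only on the abstract's description, the sketch below illustrates the core idea: a single attention module is shared by all layers of the backbone, and an LSTM cell carries and calibrates the (highly correlated) channel-attention signal from layer to layer. The class name SharedDIAUnit, the squeeze-and-excitation-style gating, and all sizes are illustrative assumptions, not the authors' implementation.

# Minimal sketch of a layer-shared attention unit in the spirit of DIA.
# All names and hyperparameters here are assumptions for illustration.
import torch
import torch.nn as nn


class SharedDIAUnit(nn.Module):
    """One instance of this module is reused by every block of the backbone."""

    def __init__(self, channels: int, hidden: int = None):
        super().__init__()
        hidden = hidden or channels
        # LSTM cell calibrates the squeezed channel descriptor using state
        # accumulated from previous layers (the dense, implicit connection).
        self.cell = nn.LSTMCell(input_size=channels, hidden_size=hidden)
        self.proj = nn.Linear(hidden, channels)
        self.gate = nn.Sigmoid()

    def forward(self, x, state=None):
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))              # squeeze: global average pooling -> (B, C)
        h, cst = self.cell(s, state)        # calibrate with cross-layer memory
        attn = self.gate(self.proj(h))      # channel attention weights in (0, 1)
        out = x * attn.view(b, c, 1, 1)     # re-weight the feature map
        return out, (h, cst)


if __name__ == "__main__":
    dia = SharedDIAUnit(channels=64)        # a single unit shared across layers
    state = None
    feat = torch.randn(2, 64, 32, 32)
    for _ in range(3):                      # the same unit serves several "layers"
        feat, state = dia(feat, state)
    print(feat.shape)                       # torch.Size([2, 64, 32, 32])

In a real backbone, each residual block would apply this shared unit to its output features while the (h, c) state is threaded through the forward pass, which is how the cross-layer connections described in the abstract would arise.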
Pages: 14