Short-term Hebbian learning can implement transformer-like attention

Cited by: 2
Author
Ellwood, Ian T. [1 ]
Affiliation
[1] Cornell University, Department of Neurobiology and Behavior, Ithaca, NY 14850, USA
Keywords
Apical dendrites; Pyramidal cells; Spikes; Model
DOI
10.1371/journal.pcbi.1011843
Chinese Library Classification
Q5 [Biochemistry]
Discipline Code
071010; 081704
Abstract
Transformers have revolutionized machine learning models of language and vision, but their connection with neuroscience remains tenuous. Built from attention layers, they require a mass comparison of queries and keys that is difficult to perform using traditional neural circuits. Here, we show that neurons can implement attention-like computations using short-term, Hebbian synaptic potentiation. We call our mechanism the match-and-control principle: when activity in an axon is synchronous, or matched, with the somatic activity of a neuron that it synapses onto, the synapse can be briefly and strongly potentiated, allowing the axon to take over, or control, the activity of the downstream neuron for a short time. In our scheme, the keys and queries are represented as spike trains, and comparisons between the two are performed in individual spines, allowing for hundreds of key comparisons per query and roughly as many keys and queries as there are neurons in the network.

Many of the most impressive recent advances in machine learning, from generating images from text to human-like chatbots, are based on a neural network architecture known as the transformer. Transformers are built from so-called attention layers, which perform large numbers of comparisons between the vector outputs of the previous layers, allowing information to flow through the network in a more dynamic way than previous designs. This large number of comparisons is computationally expensive and has no known analogue in the brain. Here, we show that a variation on a learning mechanism familiar in neuroscience, Hebbian learning, can implement a transformer-like attention computation if the synaptic weight changes are large and rapidly induced. We call our method the match-and-control principle: when presynaptic and postsynaptic spike trains match up, small groups of synapses can be transiently potentiated, allowing a few presynaptic axons to control the activity of a neuron. To demonstrate the principle, we build a model of a pyramidal neuron and use it to illustrate the power and limitations of the idea.
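The match-and-control idea described in the abstract can be caricatured in a few lines of code. The sketch below is not the paper's biophysical pyramidal-neuron model; it is a toy, rate-style illustration under assumptions introduced here (the function name match_and_control_attention, the gain parameter, and the Bernoulli spike trains are all hypothetical) of how counting coincidences between presynaptic and postsynaptic spike trains, and then transiently boosting the best-matched synapses, yields an attention-like weighting over the axons' signals.

```python
import numpy as np

rng = np.random.default_rng(0)


def spike_train(rate, n_bins, rng):
    """Binary spike train: 1 = a spike in that time bin (Bernoulli per bin)."""
    return (rng.random(n_bins) < rate).astype(float)


def match_and_control_attention(query, keys, values, gain=4.0):
    """Toy attention via transient Hebbian potentiation (hypothetical sketch).

    query  : (T,)   postsynaptic spike train of the downstream neuron
    keys   : (N, T) presynaptic spike trains, one per axon/spine
    values : (N, D) signal each axon would impose if it takes control
    gain   : strength of the coincidence-driven potentiation
    """
    # "Match": Hebbian coincidence count between each axon and the soma.
    match = keys @ query                              # shape (N,)

    # Short-term potentiation: the best-matched synapses are boosted so
    # strongly that they dominate -- a soft winner-take-all (softmax).
    w = np.exp(gain * (match - match.max()))
    w /= w.sum()

    # "Control": the potentiated axons dictate the downstream output.
    return w @ values, w


T, N, D = 200, 8, 5
query = spike_train(0.1, T, rng)                      # the neuron's own activity
keys = np.stack([spike_train(0.1, T, rng) for _ in range(N)])
keys[3] = query                                       # axon 3 fires in sync with the soma
values = rng.normal(size=(N, D))

output, weights = match_and_control_attention(query, keys, values)
print("attention-like weights:", np.round(weights, 3))   # weight on axon 3 dominates
print("output close to values[3]:", np.allclose(output, values[3], atol=1e-3))
```

In this caricature the exponential boost stands in for the brief, strong potentiation: as gain grows, the single best-matched axon takes over the downstream output, which is the "control" half of the principle.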
Pages: 18