Short-term Hebbian learning can implement transformer-like attention

Cited by: 2
Author
Ellwood, Ian T. [1 ]
Affiliation
[1] Cornell Univ, Dept Neurobiol & Behav, Ithaca, NY 14850 USA
Keywords
APICAL DENDRITES; PYRAMIDAL CELLS; SPIKES; MODEL
DOI
10.1371/journal.pcbi.1011843
Chinese Library Classification
Q5 [Biochemistry]
Discipline codes
071010; 081704
Abstract
Transformers have revolutionized machine learning models of language and vision, but their connection with neuroscience remains tenuous. Built from attention layers, they require a mass comparison of queries and keys that is difficult to perform using traditional neural circuits. Here, we show that neurons can implement attention-like computations using short-term, Hebbian synaptic potentiation. We call our mechanism the match-and-control principle: it proposes that when activity in an axon is synchronous, or matched, with the somatic activity of the neuron it synapses onto, the synapse can be briefly and strongly potentiated, allowing the axon to take over, or control, the activity of the downstream neuron for a short time. In our scheme, keys and queries are represented as spike trains, and comparisons between the two are performed in individual spines, allowing for hundreds of key comparisons per query and roughly as many keys and queries as there are neurons in the network.

Many of the most impressive recent advances in machine learning, from generating images from text to human-like chatbots, are based on a neural network architecture known as the transformer. Transformers are built from so-called attention layers, which perform large numbers of comparisons between the vector outputs of the previous layers, allowing information to flow through the network in a more dynamic way than in previous designs. This large number of comparisons is computationally expensive and has no known analogue in the brain. Here, we show that a variation on a learning mechanism familiar in neuroscience, Hebbian learning, can implement a transformer-like attention computation if the synaptic weight changes are large and rapidly induced. We call our method the match-and-control principle; it proposes that when presynaptic and postsynaptic spike trains match up, small groups of synapses can be transiently potentiated, allowing a few presynaptic axons to control the activity of a neuron. To demonstrate the principle, we build a model of a pyramidal neuron and use it to illustrate the power and limitations of the idea.
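The abstract describes the match-and-control principle only qualitatively, so the following sketch may help fix ideas. It is a toy illustration under stated assumptions, not the paper's pyramidal-neuron model: a standard dot-product attention step is placed next to a spike-train version in which a Hebbian coincidence count between presynaptic "key" spike trains and the postsynaptic "query" spike train briefly and strongly potentiates the matching synapse, letting that axon dominate the downstream drive. The spike rates, the potentiation factor, and all variable names are invented for illustration.

```python
# A minimal toy sketch, NOT the paper's model: it contrasts a standard
# dot-product attention step with a spike-train analogue in which a Hebbian
# coincidence count transiently potentiates the best-matching synapse.
# All names, rates, and the potentiation factor are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# --- Standard attention: one query compared against many keys ---
d, n_keys = 16, 8
query = rng.normal(size=d)
keys = rng.normal(size=(n_keys, d))
values = rng.normal(size=(n_keys, d))
scores = keys @ query / np.sqrt(d)             # mass query-key comparison
weights = np.exp(scores) / np.exp(scores).sum()
attention_out = weights @ values               # value mix selected by the query

# --- Spike-train analogue of "match and control" ---
T = 200                                        # time bins
post = rng.random(T) < 0.1                     # postsynaptic "query" spike train
pre = rng.random((n_keys, T)) < 0.1            # presynaptic "key" spike trains
pre[3] = post                                  # one axon carries a matching train

# "Match": Hebbian coincidence count between each axon and the soma
match = (pre & post).sum(axis=1).astype(float)

# "Control": large, rapidly induced short-term potentiation of the
# best-matching synapses (the factor of 10 is an arbitrary illustrative choice)
base_w = np.full(n_keys, 0.1)
w = base_w * (1.0 + 10.0 * match / match.max())

# The potentiated axon now dominates the downstream neuron's drive
drive = w @ pre.astype(float)
print("dominant axon:", int(w.argmax()))       # expected: 3
```

In this toy, replacing the softmax over query-key scores with a transient, winner-dominated weight boost is the analogy the abstract draws between attention and rapidly induced Hebbian potentiation; the real model performs the comparison in individual spines of a simulated pyramidal neuron.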
Pages: 18