Spontaneous Temporal Grouping Neural Network for Long-Term Memory Modeling

Cited by: 2
Authors
Shan, Dongjing [1 ]
Zhang, Xiongwei [1 ]
Zhang, Chao [2 ]
Affiliations
[1] Army Engn Univ, Lab Intelligent Informat Proc, Nanjing 210007, Peoples R China
[2] Peking Univ, Key Lab Machine Percept MOE, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Computer architecture; Logic gates; Microprocessors; Training; Standards; Data models; Task analysis; Long-term memory; recurrent neural network; temporal dependency; temporal grouping; vanishing gradient; RECALL;
DOI
10.1109/TCDS.2021.3050759
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The capacity for long-term memory is an important issue in sequence learning, but it remains challenging because of vanishing gradients and out-of-order dependencies. Inspired by human memory, in which long-term memories are broken into fragments and recalled at appropriate times, we propose in this article a neural network based on spontaneous temporal grouping. In the architecture, a segmented layer performs spontaneous sequence segmentation under the guidance of reset gates that are driven to be sparse during training; a cascading layer collects information from the temporal groups, where a filtered long short-term memory with chrono-initialization is proposed to alleviate the vanishing-gradient phenomenon, and random skip connections are adopted to capture complex dependencies among the groups. Furthermore, the advantage of our neural architecture in long-term memory is demonstrated via a new measurement method. In experiments, we compare performance with multiple models on several algorithmic and classification tasks, using both fixed-length sequences such as the MNISTs and variable-length sequences such as speech utterances. The results under different criteria demonstrate the superiority of the proposed neural network.
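As a concrete illustration of two mechanisms named in the abstract, the sketch below applies standard chrono-initialization (Tallec and Ollivier, 2018) to a stock PyTorch LSTM and adds a simple L1 penalty that drives reset-gate activations toward sparsity. This is a minimal sketch of the generic techniques only, not the paper's filtered LSTM or segmented layer; the helper names chrono_init and reset_gate_sparsity, the t_max value, and the weight lam are illustrative assumptions.

    import torch
    import torch.nn as nn

    def chrono_init(lstm: nn.LSTM, t_max: int) -> None:
        # Chrono-initialization: draw the forget-gate bias from log(U[1, T_max - 1])
        # and set the input-gate bias to its negative, so the characteristic memory
        # time scales at initialization roughly span 1..T_max.
        h = lstm.hidden_size
        with torch.no_grad():
            for name, p in lstm.named_parameters():
                if "bias_hh" in name:
                    p.zero_()  # PyTorch sums bias_ih and bias_hh; keep only one active
                elif "bias_ih" in name:
                    p.zero_()
                    b_f = torch.log(torch.empty(h).uniform_(1.0, float(t_max - 1)))
                    p[h:2 * h] = b_f  # forget gate (PyTorch gate order: i, f, g, o)
                    p[:h] = -b_f      # input gate

    def reset_gate_sparsity(reset_gates: torch.Tensor, lam: float = 1e-3) -> torch.Tensor:
        # L1 penalty to add to the task loss; it drives reset-gate activations
        # toward zero, mirroring the sparse segmentation signal described above.
        return lam * reset_gates.abs().mean()

    # Usage: initialize for sequences up to the length of a flattened MNIST image.
    lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
    chrono_init(lstm, t_max=784)

With this initialization the forget-gate activations start close to one over a range of time scales, which is what lets gradients survive over long spans rather than vanishing after a few steps.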
Pages: 472 - 484
Page count: 13
Related Papers
50 records in total
  • [1] Neural Network Structure for Spatio-Temporal Long-Term Memory
    Nguyen, Vu Anh
    Starzyk, Janusz A.
    Goh, Wooi-Boon
    Jachyra, Daniel
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2012, 23 (06) : 971 - 983
  • [2] Language Modeling through Long-Term Memory Network
    Nugaliyadde, Anupiya
    Wong, Kok Wai
    Sohel, Ferdous
    Xie, Hong
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019
  • [3] Short- and long-term temporal network prediction based on network memory
    Zou, Li
    Ceria, Alberto
    Wang, Huijuan
    APPLIED NETWORK SCIENCE, 2023, 8 (01)
  • [4] Neural correlates of long-term associative memory in human temporal cortex
    Konishi, Seiki
    Yamashita, Ken-ichiro
    Hirose, Satoshi
    Kunimatsu, Akira
    Aoki, Shigeki
    Chikazoe, Junichi
    Jimura, Koji
    Masutani, Yoshitaka
    Abe, Osamu
    Ohtomo, Kuni
    Miyashita, Yasushi
    NEUROSCIENCE RESEARCH, 2009, 65 : S236 - S236
  • [5] How embedded memory in recurrent neural network architectures helps learning long-term temporal dependencies
    Lin, T
    Horne, BG
    Giles, CL
    NEURAL NETWORKS, 1998, 11 (05) : 861 - 868
  • [6] Long-term memory for temporal structure:
    Schulkind, Matthew D.
    MEMORY & COGNITION, 1999, 27 (05) : 896 - 906
  • [7] Long short-term memory recurrent neural network for modeling temporal patterns in long-term power forecasting for solar PV facilities: Case study of South Korea
    Jung, Yoonhwa
    Jung, Jaehoon
    Kim, Byungil
    Han, SangUk
    JOURNAL OF CLEANER PRODUCTION, 2020, 250
  • [8] Incremental learning in dynamic environments using neural network with long-term memory
    Tsumori, K
    Ozawa, S
    PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 2583 - 2588