Lifelong Learning With Cycle Memory Networks

Cited by: 15
Authors
Peng, Jian [1 ,2 ]
Ye, Dingqi [2 ,3 ]
Tang, Bo [4 ]
Lei, Yinjie [5 ]
Liu, Yu [6 ]
Li, Haifeng [2 ,3 ]
Affiliations
[1] Tsinghua Univ, Dept Precis Instrument, Beijing 100084, Peoples R China
[2] Xiangjiang Lab, Changsha 410205, Peoples R China
[3] Cent South Univ, Sch Geosci & Infophys, Changsha 410083, Peoples R China
[4] Mississippi State Univ, Dept Elect & Comp Engn, Starkville, MS 39762 USA
[5] Sichuan Univ, Coll Elect & Informat Engn, Chengdu 610017, Peoples R China
[6] Peking Univ, Sch Earth & Space Sci, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Anterograde forgetting; catastrophic forgetting; complementary learning theory; cycle memory network (CMN); lifelong learning
DOI
10.1109/TNNLS.2023.3294495
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Learning from a sequence of tasks over a lifetime is essential for an agent progressing toward artificial general intelligence. Despite the explosion of research in this field in recent years, most work focuses on the well-known catastrophic forgetting issue. In contrast, this work explores knowledge-transferable lifelong learning without storing historical data or incurring significant additional computational overhead. We demonstrate that existing data-free frameworks, including regularization-based single-network and structure-based multinetwork frameworks, face a fundamental issue of lifelong learning, named anterograde forgetting: preserving and transferring memory may inhibit the learning of new knowledge. We attribute it to two causes: the learning network's capacity decreases as it memorizes historical knowledge, and conceptual confusion arises between irrelevant old knowledge and the current task. Inspired by the complementary learning theory in neuroscience, we endow artificial neural networks with the ability to learn continuously without forgetting while recalling historical knowledge to facilitate the learning of new knowledge. Specifically, this work proposes a general framework named the cycle memory network (CMN). The CMN consists of two individual memory networks that store short- and long-term memories separately, avoiding capacity shrinkage, and a transfer cell between them. The transfer cell enables knowledge transfer from the long-term to the short-term memory network to mitigate conceptual confusion. In addition, a memory consolidation mechanism integrates short-term knowledge into the long-term memory network for knowledge accumulation. We demonstrate that the CMN effectively addresses anterograde forgetting on several task-related, task-conflict, class-incremental, and cross-domain benchmarks, and we provide extensive ablation studies verifying each framework component. The source code is available at: https://github.com/GeoX-Lab/CMN.
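To make the abstract's two-phase design concrete, below is a minimal PyTorch sketch of how a CMN-style loop could be organized: a frozen long-term network recalls old knowledge into a full-capacity short-term network through a transfer cell, and a consolidation phase then distills the short-term network back into long-term memory. The `TransferCell` gating, the `learn_task`/`consolidate` functions, and the distillation losses are hypothetical constructions inferred from the abstract, not the authors' implementation; consult https://github.com/GeoX-Lab/CMN for the actual method.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransferCell(nn.Module):
    """Gated projection of long-term features into the short-term stream
    (hypothetical design; the paper's transfer cell may differ)."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, short_feat, long_feat):
        g = torch.sigmoid(self.gate(torch.cat([short_feat, long_feat], dim=-1)))
        # Recalled long-term memory modulates, rather than replaces, new learning.
        return short_feat + g * self.proj(long_feat)

def learn_task(short_net, long_net, transfer, head, loader, optimizer):
    """Phase 1: learn the new task in the short-term network at full capacity,
    with the frozen long-term network supplying recalled knowledge."""
    long_net.eval()
    for x, y in loader:
        with torch.no_grad():
            long_feat = long_net(x)      # recall: frozen long-term features
        short_feat = short_net(x)        # unconstrained learning of the new task
        logits = head(transfer(short_feat, long_feat))
        loss = F.cross_entropy(logits, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

def consolidate(short_net, long_net, loader, optimizer, alpha=1.0):
    """Phase 2: consolidation -- absorb the short-term network's new knowledge
    into long-term memory while keeping its previous responses stable."""
    old_long = copy.deepcopy(long_net).eval()
    short_net.eval()
    for x, _ in loader:
        with torch.no_grad():
            target_new = short_net(x)    # new knowledge to integrate
            target_old = old_long(x)     # accumulated knowledge to preserve
        feat = long_net(x)
        loss = F.mse_loss(feat, target_new) + alpha * F.mse_loss(feat, target_old)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

Under this reading, separating the two networks keeps the short-term learner's capacity intact (addressing capacity shrinkage), while the gated transfer cell lets it draw only on relevant old knowledge (addressing conceptual confusion).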
Pages: 1-14 (14 pages)