Revisiting Neural Networks for Continual Learning: An Architectural Perspective

Cited: 0
Authors
Lu, Aojun [1 ]
Feng, Tao [3 ]
Yuan, Hangjie [2 ]
Song, Xiaotian [1 ]
Sun, Yanan [1 ]
Affiliations
[1] Sichuan Univ, Chengdu, Sichuan, Peoples R China
[2] Tsinghua Univ, Beijing, Peoples R China
[3] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Efforts to overcome catastrophic forgetting have primarily centered on developing more effective Continual Learning (CL) methods. In contrast, less attention has been devoted to analyzing the role of network architecture design (e.g., network depth, width, and components) in contributing to CL. This paper seeks to bridge the gap between network architecture design and CL, and to present a holistic study of the impact of network architectures on CL. This work considers architecture design at the level of network scaling, i.e., width and depth, and at the level of network components, i.e., skip connections, global pooling layers, and down-sampling. In both cases, we first derive insights by systematically exploring how architectural designs affect CL. Then, grounded in these insights, we craft a specialized search space for CL and further propose a simple yet effective ArchCraft method to steer toward a CL-friendly architecture; namely, this method recrafts AlexNet/ResNet into AlexAC/ResAC. Experimental validation across various CL settings and scenarios demonstrates that the improved architectures are parameter-efficient, achieving state-of-the-art CL performance while being 86%, 61%, and 97% more compact in terms of parameters than the naive CL architecture in Task IL and Class IL. Code is available at https://github.com/byyx666/ArchCraft.
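The following PyTorch sketch is not the authors' ArchCraft implementation (see the linked repository for that). It is a minimal illustration, under assumed names (`ConfigurableNet`, `BasicBlock`, `use_skip`, `use_global_pool`, `downsample`), of how the architectural knobs the abstract enumerates (width, depth, skip connections, global pooling, and down-sampling) can be exposed as explicit hyperparameters of a small residual network, i.e., the kind of design space an architecture search for CL would operate over.

```python
import torch
import torch.nn as nn


class BasicBlock(nn.Module):
    """ResNet-style block whose skip connection can be toggled off."""

    def __init__(self, in_ch, out_ch, stride=1, use_skip=True):
        super().__init__()
        self.use_skip = use_skip
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 projection so the residual matches shape when the skip is kept.
        self.proj = None
        if use_skip and (stride != 1 or in_ch != out_ch):
            self.proj = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        if self.use_skip:
            out = out + (self.proj(x) if self.proj is not None else x)
        return self.relu(out)


class ConfigurableNet(nn.Module):
    """Width, depth, and components exposed as searchable hyperparameters."""

    def __init__(self, num_classes, width=16, depth_per_stage=(2, 2, 2),
                 use_skip=True, use_global_pool=True, downsample=True):
        super().__init__()
        self.stem = nn.Conv2d(3, width, 3, padding=1, bias=False)
        blocks, ch = [], width
        for i, n in enumerate(depth_per_stage):  # depth knob
            out_ch = width * (2 ** i)            # width knob
            for j in range(n):
                stride = 2 if (downsample and i > 0 and j == 0) else 1
                blocks.append(BasicBlock(ch, out_ch, stride, use_skip))
                ch = out_ch
        self.blocks = nn.Sequential(*blocks)
        # Global average pooling vs. flattening the full feature map.
        self.pool = nn.AdaptiveAvgPool2d(1) if use_global_pool else nn.Identity()
        self.head = nn.Sequential(nn.Flatten(), nn.LazyLinear(num_classes))

    def forward(self, x):
        return self.head(self.pool(self.blocks(self.stem(x))))


if __name__ == "__main__":
    # A wider-but-shallower variant: one point in the kind of
    # width/depth/component search space the paper studies.
    net = ConfigurableNet(num_classes=10, width=32, depth_per_stage=(1, 1, 1))
    print(net(torch.randn(2, 3, 32, 32)).shape)  # -> torch.Size([2, 10])
```

`nn.LazyLinear` is used for the classifier head so the same code handles both the global-pooling and full-flatten variants without manually recomputing the flattened feature size.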
Pages: 4651-4659
Page count: 9
Related Papers
50 records in total
  • [1] Continual Learning with Neural Networks: A Review
    Awasthi, Abhijeet
    Sarawagi, Sunita
    PROCEEDINGS OF THE 6TH ACM IKDD CODS AND 24TH COMAD, 2019 : 362 - 365
  • [2] Continual robot learning with constructive neural networks
    Grossmann, A
    Poli, R
    LEARNING ROBOTS, PROCEEDINGS, 1998, 1545 : 95 - 108
  • [3] Continual Learning Using Bayesian Neural Networks
    Li, Honglin
    Barnaghi, Payam
    Enshaeifar, Shirin
    Ganz, Frieder
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (09) : 4243 - 4252
  • [4] Continual Learning with Sparse Progressive Neural Networks
    Ergun, Esra
    Toreyin, Behcet Ugur
    2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020
  • [5] Sparse Progressive Neural Networks for Continual Learning
    Ergun, Esra
    Toreyin, Behcet Ugur
    ADVANCES IN COMPUTATIONAL COLLECTIVE INTELLIGENCE (ICCCI 2021), 2021, 1463 : 715 - 725
  • [6] Employing Convolutional Neural Networks for Continual Learning
    Jasinski, Marcin
    Wozniak, Michal
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2022, PT I, 2023, 13588 : 288 - 297
  • [7] Continual lifelong learning with neural networks: A review
    Parisi, German I.
    Kemker, Ronald
    Part, Jose L.
    Kanan, Christopher
    Wermter, Stefan
    NEURAL NETWORKS, 2019, 113 : 54 - 71
  • [8] A Novel Continual Learning Approach for Competitive Neural Networks
    Palomo, Esteban J.
    Miguel Ortiz-de-Lazcano-Lobato, Juan
    David Fernandez-Rodriguez, Jose
    Lopez-Rubio, Ezequiel
    Maria Maza-Quiroga, Rosa
    BIO-INSPIRED SYSTEMS AND APPLICATIONS: FROM ROBOTICS TO AMBIENT INTELLIGENCE, PT II, 2022, 13259 : 223 - 232
  • [9] Efficient continual learning in neural networks with embedding regularization
    Pomponi, Jary
    Scardapane, Simone
    Lomonaco, Vincenzo
    Uncini, Aurelio
    NEUROCOMPUTING, 2020, 397 : 139 - 148
  • [10] Streaming Graph Neural Networks via Continual Learning
    Wang, Junshan
    Song, Guojie
    Wu, Yi
    Wang, Liang
    CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020 : 1515 - 1524