Revisiting Neural Networks for Continual Learning: An Architectural Perspective

Cited: 0
Authors
Lu, Aojun [1 ]
Feng, Tao [3 ]
Yuan, Hangjie [2 ]
Song, Xiaotian [1 ]
Sun, Yanan [1 ]
Affiliations
[1] Sichuan Univ, Chengdu, Sichuan, Peoples R China
[2] Tsinghua Univ, Beijing, Peoples R China
[3] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Efforts to overcome catastrophic forgetting have primarily centered on developing more effective Continual Learning (CL) methods. In contrast, less attention has been devoted to analyzing the role of network architecture design (e.g., network depth, width, and components) in contributing to CL. This paper seeks to bridge the gap between network architecture design and CL, and to present a holistic study of the impact of network architectures on CL. This work considers architecture design at the level of network scaling, i.e., width and depth, and at the level of network components, i.e., skip connections, global pooling layers, and down-sampling. In both cases, we first derive insights by systematically exploring how architectural designs affect CL. Then, grounded in these insights, we craft a specialized search space for CL and further propose a simple yet effective ArchCraft method to steer toward a CL-friendly architecture; namely, this method recrafts AlexNet/ResNet into AlexAC/ResAC. Experimental validation across various CL settings and scenarios demonstrates that the improved architectures are parameter-efficient, achieving state-of-the-art CL performance while being 86%, 61%, and 97% more compact in terms of parameters than the naive CL architecture in Task IL and Class IL. Code is available at https://github.com/byyx666/ArchCraft.
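The following PyTorch sketch is not the authors' ArchCraft implementation (see the linked repository for that). It is a minimal illustration, under assumed names (`ConfigurableNet`, `BasicBlock`, `use_skip`, `use_global_pool`, `downsample`), of how the architectural knobs the abstract enumerates (width, depth, skip connections, global pooling, and down-sampling) can be exposed as explicit hyperparameters of a small residual network, i.e., the kind of design space an architecture search for CL would operate over.

```python
import torch
import torch.nn as nn


class BasicBlock(nn.Module):
    """ResNet-style block whose skip connection can be toggled off."""

    def __init__(self, in_ch, out_ch, stride=1, use_skip=True):
        super().__init__()
        self.use_skip = use_skip
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 projection so the residual matches shape when the skip is kept.
        self.proj = None
        if use_skip and (stride != 1 or in_ch != out_ch):
            self.proj = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        if self.use_skip:
            out = out + (self.proj(x) if self.proj is not None else x)
        return self.relu(out)


class ConfigurableNet(nn.Module):
    """Width, depth, and components exposed as searchable hyperparameters."""

    def __init__(self, num_classes, width=16, depth_per_stage=(2, 2, 2),
                 use_skip=True, use_global_pool=True, downsample=True):
        super().__init__()
        self.stem = nn.Conv2d(3, width, 3, padding=1, bias=False)
        blocks, ch = [], width
        for i, n in enumerate(depth_per_stage):  # depth knob
            out_ch = width * (2 ** i)            # width knob
            for j in range(n):
                stride = 2 if (downsample and i > 0 and j == 0) else 1
                blocks.append(BasicBlock(ch, out_ch, stride, use_skip))
                ch = out_ch
        self.blocks = nn.Sequential(*blocks)
        # Global average pooling vs. flattening the full feature map.
        self.pool = nn.AdaptiveAvgPool2d(1) if use_global_pool else nn.Identity()
        self.head = nn.Sequential(nn.Flatten(), nn.LazyLinear(num_classes))

    def forward(self, x):
        return self.head(self.pool(self.blocks(self.stem(x))))


if __name__ == "__main__":
    # A wider-but-shallower variant: one point in the kind of
    # width/depth/component search space the paper studies.
    net = ConfigurableNet(num_classes=10, width=32, depth_per_stage=(1, 1, 1))
    print(net(torch.randn(2, 3, 32, 32)).shape)  # -> torch.Size([2, 10])
```

`nn.LazyLinear` is used for the classifier head so the same code handles both the global-pooling and full-flatten variants without manually recomputing the flattened feature size.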
Pages: 4651-4659
Page count: 9
Related Papers
50 records in total
  • [1] Continual Learning with Neural Networks: A Review
    Awasthi, Abhijeet
    Sarawagi, Sunita
    PROCEEDINGS OF THE 6TH ACM IKDD CODS AND 24TH COMAD, 2019 : 362 - 365
  • [2] Continual robot learning with constructive neural networks
    Grossmann, A
    Poli, R
    LEARNING ROBOTS, PROCEEDINGS, 1998, 1545 : 95 - 108
  • [3] Continual Learning Using Bayesian Neural Networks
    Li, Honglin
    Barnaghi, Payam
    Enshaeifar, Shirin
    Ganz, Frieder
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (09) : 4243 - 4252
  • [4] Continual Learning with Sparse Progressive Neural Networks
    Ergun, Esra
    Toreyin, Behcet Ugur
    2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020
  • [5] Sparse Progressive Neural Networks for Continual Learning
    Ergun, Esra
    Toreyin, Behcet Ugur
    ADVANCES IN COMPUTATIONAL COLLECTIVE INTELLIGENCE (ICCCI 2021), 2021, 1463 : 715 - 725
  • [6] Employing Convolutional Neural Networks for Continual Learning
    Jasinski, Marcin
    Wozniak, Michal
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2022, PT I, 2023, 13588 : 288 - 297
  • [7] Continual lifelong learning with neural networks: A review
    Parisi, German I.
    Kemker, Ronald
    Part, Jose L.
    Kanan, Christopher
    Wermter, Stefan
    NEURAL NETWORKS, 2019, 113 : 54 - 71
  • [8] A Novel Continual Learning Approach for Competitive Neural Networks
    Palomo, Esteban J.
    Miguel Ortiz-de-Lazcano-Lobato, Juan
    David Fernandez-Rodriguez, Jose
    Lopez-Rubio, Ezequiel
    Maria Maza-Quiroga, Rosa
    BIO-INSPIRED SYSTEMS AND APPLICATIONS: FROM ROBOTICS TO AMBIENT INTELLIGENCE, PT II, 2022, 13259 : 223 - 232
  • [9] Efficient continual learning in neural networks with embedding regularization
    Pomponi, Jary
    Scardapane, Simone
    Lomonaco, Vincenzo
    Uncini, Aurelio
    NEUROCOMPUTING, 2020, 397 : 139 - 148
  • [10] Streaming Graph Neural Networks via Continual Learning
    Wang, Junshan
    Song, Guojie
    Wu, Yi
    Wang, Liang
    CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020 : 1515 - 1524