Revisiting Neural Networks for Continual Learning: An Architectural Perspective

Times Cited: 0
Authors
Lu, Aojun [1 ]
Feng, Tao [3 ]
Yuan, Hangjie [2 ]
Song, Xiaotian [1 ]
Sun, Yanan [1 ]
Affiliations
[1] Sichuan Univ, Chengdu, Sichuan, Peoples R China
[2] Tsinghua Univ, Beijing, Peoples R China
[3] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Efforts to overcome catastrophic forgetting have primarily centered on developing more effective Continual Learning (CL) methods. In contrast, far less attention has been devoted to analyzing how network architecture design (e.g., network depth, width, and components) contributes to CL. This paper seeks to bridge the gap between network architecture design and CL by presenting a holistic study of the impact of network architectures on CL. We consider architecture design both at the network scaling level, i.e., width and depth, and at the level of network components, i.e., skip connections, global pooling layers, and down-sampling. In both cases, we first derive insights by systematically exploring how architectural designs affect CL. Then, grounded in these insights, we craft a specialized search space for CL and propose a simple yet effective ArchCraft method that steers toward a CL-friendly architecture; concretely, the method recrafts AlexNet/ResNet into AlexAC/ResAC. Experimental validation across various CL settings and scenarios demonstrates that the improved architectures are parameter-efficient, achieving state-of-the-art CL performance while being 86%, 61%, and 97% more compact in terms of parameters than the naive CL architecture in Task IL and Class IL. Code is available at https://github.com/byyx666/ArchCraft.
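The abstract enumerates concrete architectural degrees of freedom: width, depth, skip connections, global pooling, and down-sampling. The PyTorch sketch below is a minimal illustration of how such a search space can be parameterized; it is not the authors' ArchCraft implementation (see the linked repository for that), and every name in it (BasicBlock, build_candidate) is an illustrative assumption.

import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    # A ResNet-style block whose skip connection can be switched off,
    # mirroring the "network components" axis of the search space.
    def __init__(self, channels: int, use_skip: bool = True):
        super().__init__()
        self.use_skip = use_skip
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        out = self.body(x)
        if self.use_skip:
            out = out + x  # residual (skip) connection
        return torch.relu(out)

def build_candidate(width: int, depth: int, use_skip: bool,
                    global_pool: bool, num_classes: int = 10) -> nn.Module:
    # Assemble one candidate from the (width, depth, components) space.
    # Assumes small inputs (e.g., 32x32 CIFAR images) and modest depth,
    # so the repeated 2x down-sampling never shrinks the map below 1x1.
    layers = [
        nn.Conv2d(3, width, 3, padding=1, bias=False),
        nn.BatchNorm2d(width),
        nn.ReLU(inplace=True),
    ]
    for i in range(depth):
        layers.append(BasicBlock(width, use_skip=use_skip))
        if i < depth - 1:
            layers.append(nn.MaxPool2d(2))  # down-sampling stage
    if global_pool:
        layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                   nn.Linear(width, num_classes)]
    else:
        # Without global pooling, the classifier size depends on the
        # input resolution; LazyLinear infers it at the first forward.
        layers += [nn.Flatten(), nn.LazyLinear(num_classes)]
    return nn.Sequential(*layers)

net = build_candidate(width=32, depth=3, use_skip=True, global_pool=True)
logits = net(torch.randn(4, 3, 32, 32))  # -> shape (4, 10)

In an ArchCraft-style search, candidates drawn from such a space would be scored by CL metrics (e.g., average accuracy over sequentially learned tasks) rather than single-task accuracy, which is what makes the selected architectures CL-friendly.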
Pages: 4651-4659
Number of pages: 9
Related Papers
50 records in total
  • [31] Continual Learning in Recurrent Neural Networks for the Internet of Things: A Stochastic Approach
    Filho, Josafat Ribeiro Leal
    Kocian, Alexander
    Frohlich, Antonio Augusto
    Chessa, Stefano
    2024 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS, ISCC 2024, 2024
  • [33] Brain-inspired replay for continual learning with artificial neural networks
    van de Ven, Gido M.
    Siegelmann, Hava T.
    Tolias, Andreas S.
    NATURE COMMUNICATIONS, 2020, 11 (01)
  • [34] Targeted Data Poisoning Attacks Against Continual Learning Neural Networks
    Li, Huayu
    Ditzler, Gregory
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [35] Continual learning with attentive recurrent neural networks for temporal data classification
    Yin, Shao-Yu
    Huang, Yu
    Chang, Tien-Yu
    Chang, Shih-Fang
    Tseng, Vincent S.
    NEURAL NETWORKS, 2023, 158 : 171 - 187
  • [36] Learning from the Past: Continual Meta-Learning with Bayesian Graph Neural Networks
    Luo, Yadan
    Huang, Zi
    Zhang, Zheng
    Wang, Ziwei
    Baktashmotlagh, Mahsa
    Yang, Yang
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 5021 - 5028
  • [37] Learning the Architectural Features That Predict Functional Similarity of Neural Networks
    Haber, Adam
    Schneidman, Elad
    PHYSICAL REVIEW X, 2022, 12 (02)
  • [38] Enhancing Efficient Continual Learning with Dynamic Structure Development of Spiking Neural Networks
    Han, Bing
    Zhao, Feifei
    Zeng, Yi
    Pan, Wenxuan
    Shen, Guobin
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 2993 - 3001
  • [39] Similarity-based context aware continual learning for spiking neural networks
    Han, Bing
    Zhao, Feifei
    Li, Yang
    Kong, Qingqun
    Li, Xianqi
    Zeng, Yi
    NEURAL NETWORKS, 2025, 184
  • [40] Gating Mechanism in Deep Neural Networks for Resource-Efficient Continual Learning
    Jin, Hyundong
    Yun, Kimin
    Kim, Eunwoo
    IEEE ACCESS, 2022, 10 : 18776 - 18786